summaryrefslogtreecommitdiff
path: root/include/linux/sunrpc
AgeCommit message (Collapse)Author
2018-12-19SUNRPC: remove RPCAUTH_AUTH_NO_CRKEY_TIMEOUTNeilBrown
This is no longer used. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19NFS: move credential expiry tracking out of SUNRPC into NFS.NeilBrown
NFS needs to know when a credential is about to expire so that it can modify write-back behaviour to finish the write inside the expiry time. It currently uses functions in SUNRPC code which make use of a fairly complex callback scheme and flags in the generic credientials. As I am working to discard the generic credentials, this has to change. This patch moves the logic into NFS, in part by finding and caching the low-level credential in the open_context. We then make direct cred-api calls on that. This makes the code much simpler and removes a dependency on generic rpc credentials. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: add side channel to use non-generic cred for rpc call.NeilBrown
The credential passed in rpc_message.rpc_cred is always a generic credential except in one instance. When gss_destroying_context() calls rpc_call_null(), it passes a specific credential that it needs to destroy. In this case the RPC acts *on* the credential rather than being authorized by it. This special case deserves explicit support and providing that will mean that rpc_message.rpc_cred is *always* generic, allowing some optimizations. So add "tk_op_cred" to rpc_task and "rpc_op_cred" to the setup data. Use this to pass the cred down from rpc_call_null(), and have rpcauth_bindcred() notice it and bind it in place. Credit to kernel test robot <fengguang.wu@intel.com> for finding a bug in earlier version of this patch. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: introduce RPC_TASK_NULLCREDS to request auth_noneNeilBrown
In almost all cases the credential stored in rpc_message.rpc_cred is a "generic" credential. One of the two expections is when an AUTH_NULL credential is used such as for RPC ping requests. To improve consistency, don't pass an explicit credential in these cases, but instead pass NULL and set a task flag, similar to RPC_TASK_ROOTCREDS, which requests that NULL credentials be used by default. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19NFS/SUNRPC: don't lookup machine credential until rpcauth_bindcred().NeilBrown
When NFS creates a machine credential, it is a "generic" credential, not tied to any auth protocol, and is really just a container for the princpal name. This doesn't get linked to a genuine credential until rpcauth_bindcred() is called. The lookup always succeeds, so various places that test if the machine credential is NULL, are pointless. As a step towards getting rid of generic credentials, this patch gets rid of generic machine credentials. The nfs_client and rpc_client just hold a pointer to a constant principal name. When a machine credential is wanted, a special static 'struct rpc_cred' pointer is used. rpcauth_bindcred() recognizes this, finds the principal from the client, and binds the correct credential. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: discard RPC_DO_ROOTOVERRIDE()NeilBrown
it is never used. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: remove machine_cred field from struct auth_credNeilBrown
The cred is a machine_cred iff ->principal is set, so there is no need for the extra flag. There is one case which deserves some explanation. nfs4_root_machine_cred() calls rpc_lookup_machine_cred() with a NULL principal name which results in not getting a machine credential, but getting a root credential instead. This appears to be what is expected of the caller, and is clearly the result provided by both auth_unix and auth_gss which already ignore the flag. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: remove uid and gid from struct auth_credNeilBrown
Use cred->fsuid and cred->fsgid instead. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: remove groupinfo from struct auth_cred.NeilBrown
We can use cred->groupinfo (from the 'struct cred') instead. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19SUNRPC: add 'struct cred *' to auth_cred and rpc_credNeilBrown
The SUNRPC credential framework was put together before Linux has 'struct cred'. Now that we have it, it makes sense to use it. This first step just includes a suitable 'struct cred *' pointer in every 'struct auth_cred' and almost every 'struct rpc_cred'. The rpc_cred used for auth_null has a NULL 'struct cred *' as nothing else really makes sense. For rpc_cred, the pointer is reference counted. For auth_cred it isn't. struct auth_cred are either allocated on the stack, in which case the thread owns a reference to the auth, or are part of 'struct generic_cred' in which case gc_base owns the reference, and "acred" shares it. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-02SUNRPC: Fix a memory leak in call_encode()Trond Myklebust
If we retransmit an RPC request, we currently end up clobbering the value of req->rq_rcv_buf.bvec that was allocated by the initial call to xprt_request_prepare(req). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-28svcrdma: Optimize the logic that selects the R_key to invalidateChuck Lever
o Select the R_key to invalidate while the CPU cache still contains the received RPC Call transport header, rather than waiting until we're about to send the RPC Reply. o Choose Send With Invalidate if there is exactly one distinct R_key in the received transport header. If there's more than one, the client will have to perform local invalidation after it has already waited for remote invalidation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-11-04Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
2018-11-01SUNRPC: Use atomic(64)_t for seq_send(64)Paul Burton
The seq_send & seq_send64 fields in struct krb5_ctx are used as atomically incrementing counters. This is implemented using cmpxchg() & cmpxchg64() to implement what amount to custom versions of atomic_fetch_inc() & atomic64_fetch_inc(). Besides the duplication, using cmpxchg64() has another major drawback in that some 32 bit architectures don't provide it. As such commit 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") resulted in build failures for some architectures. Change seq_send to be an atomic_t and seq_send64 to be an atomic64_t, then use atomic(64)_* functions to manipulate the values. The atomic64_t type & associated functions are provided even on architectures which lack real 64 bit atomic memory access via CONFIG_GENERIC_ATOMIC64 which uses spinlocks to serialize access. This fixes the build failures for architectures lacking cmpxchg64(). A potential alternative that was raised would be to provide cmpxchg64() on the 32 bit architectures that currently lack it, using spinlocks. However this would provide a version of cmpxchg64() with semantics a little different to the implementations on architectures with real 64 bit atomics - the spinlock-based implementation would only work if all access to the memory used with cmpxchg64() is *always* performed using cmpxchg64(). That is not currently a requirement for users of cmpxchg64(), and making it one seems questionable. As such avoiding cmpxchg64() outside of architecture-specific code seems best, particularly in cases where atomic64_t seems like a better fit anyway. The CONFIG_GENERIC_ATOMIC64 implementation of atomic64_* functions will use spinlocks & so faces the same issue, but with the key difference that the memory backing an atomic64_t ought to always be accessed via the atomic64_* functions anyway making the issue moot. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-30Merge tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Olga added support for the NFSv4.2 asynchronous copy protocol. We already supported COPY, by copying a limited amount of data and then returning a short result, letting the client resend. The asynchronous protocol should offer better performance at the expense of some complexity. The other highlight is Trond's work to convert the duplicate reply cache to a red-black tree, and to move it and some other server caches to RCU. (Previously these have meant taking global spinlocks on every RPC) Otherwise, some RDMA work and miscellaneous bugfixes" * tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linux: (30 commits) lockd: fix access beyond unterminated strings in prints nfsd: Fix an Oops in free_session() nfsd: correctly decrement odstate refcount in error path svcrdma: Increase the default connection credit limit svcrdma: Remove try_module_get from backchannel svcrdma: Remove ->release_rqst call in bc reply handler svcrdma: Reduce max_send_sges nfsd: fix fall-through annotations knfsd: Improve lookup performance in the duplicate reply cache using an rbtree knfsd: Further simplify the cache lookup knfsd: Simplify NFS duplicate replay cache knfsd: Remove dead code from nfsd_cache_lookup SUNRPC: Simplify TCP receive code SUNRPC: Replace the cache_detail->hash_lock with a regular spinlock SUNRPC: Remove non-RCU protected lookup NFS: Fix up a typo in nfs_dns_ent_put NFS: Lockless DNS lookups knfsd: Lockless lookup of NFSv4 identities. SUNRPC: Lockless server RPCSEC_GSS context lookup knfsd: Allow lockless lookups of the exports ...
2018-10-29svcrdma: Increase the default connection credit limitChuck Lever
Reduce queuing on clients by allowing more credits by default. 64 is the default NFSv4.1 slot table size on Linux clients. This size prevents the credit limit from putting RPC requests to sleep again after they have already slept waiting for a session slot. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Replace the cache_detail->hash_lock with a regular spinlockTrond Myklebust
Now that the reader functions are all RCU protected, use a regular spinlock rather than a reader/writer lock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Remove non-RCU protected lookupTrond Myklebust
Clean up the cache code by removing the non-RCU protected lookup. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Allow cache lookups to use RCU protection rather than the r/w spinlockTrond Myklebust
Instead of the reader/writer spinlock, allow cache lookups to use RCU for looking up entries. This is more efficient since modifications can occur while other entries are being looked up. Note that for now, we keep the reader/writer spinlock until all users have been converted to use RCU-safe freeing of their cache entries. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-26Merge tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix the NFSv4.1 r/wsize sanity checking - Reset the RPC/RDMA credit grant properly after a disconnect - Fix a missed page unlock after pg_doio() Features and optimisations: - Overhaul of the RPC client socket code to eliminate a locking bottleneck and reduce the latency when transmitting lots of requests in parallel. - Allow parallelisation of the RPCSEC_GSS encoding of an RPC request. - Convert the RPC client socket receive code to use iovec_iter() for improved efficiency. - Convert several NFS and RPC lookup operations to use RCU instead of taking global locks. - Avoid the need for BH-safe locks in the RPC/RDMA back channel. Bugfixes and cleanups: - Fix lock recovery during NFSv4 delegation recalls - Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case. - Fixes for the RPC connection metrics - Various RPC client layer cleanups to consolidate stream based sockets - RPC/RDMA connection cleanups - Simplify the RPC/RDMA cleanup after memory operation failures - Clean ups for NFS v4.2 copy completion and NFSv4 open state reclaim" * tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (97 commits) SUNRPC: Convert the auth cred cache to use refcount_t SUNRPC: Convert auth creds to use refcount_t SUNRPC: Simplify lookup code SUNRPC: Clean up the AUTH cache code NFS: change sign of nfs_fh length sunrpc: safely reallow resvport min/max inversion nfs: remove redundant call to nfs_context_set_write_error() nfs: Fix a missed page unlock after pg_doio() SUNRPC: Fix a compile warning for cmpxchg64() NFSv4.x: fix lock recovery during delegation recall SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc() xprtrdma: Squelch a sparse warning xprtrdma: Clean up xprt_rdma_disconnect_inject xprtrdma: Add documenting comments xprtrdma: Report when there were zero posted Receives xprtrdma: Move rb_flags initialization xprtrdma: Don't disable BH's in backchannel server xprtrdma: Remove memory address of "ep" from an error message xprtrdma: Rename rpcrdma_qp_async_error_upcall xprtrdma: Simplify RPC wake-ups on connect ...
2018-10-23SUNRPC: Convert the auth cred cache to use refcount_tTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-23SUNRPC: Convert auth creds to use refcount_tTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-23SUNRPC: Simplify lookup codeTrond Myklebust
We no longer need to worry about whether or not the entry is hashed in order to figure out if the contents are valid. We only care whether or not the refcount is non-zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-03SUNRPC: Add lockless lookup of the server's auth domainTrond Myklebust
Avoid taking the global auth_domain_lock in most lookups of the auth domain by adding an RCU protected lookup. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-09-30SUNRPC: Replace krb5_seq_lock with a lockless schemeTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Unexport xdr_partial_copy_from_skb()Trond Myklebust
It is no longer used outside of net/sunrpc/socklib.c Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Allow AF_LOCAL sockets to use the generic stream receiveTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Simplify TCP receive code by switching to using iteratorsTrond Myklebust
Most of this code should also be reusable with other socket types. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter()Trond Myklebust
Add a bvec array to struct xdr_buf, and have the client allocate it when we need to receive data into pages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add a label for RPC calls that require allocation on receiveTrond Myklebust
If the RPC call relies on the receive call allocating pages as buffers, then let's label it so that we a) Don't leak memory by allocating pages for requests that do not expect this behaviour b) Can optimise for the common case where calls do not require allocation. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Fix priority queue fairnessTrond Myklebust
Fix up the priority queue to not batch by owner, but by queue, so that we allow '1 << priority' elements to be dequeued before switching to the next priority queue. The owner field is still used to wake up requests in round robin order by owner to avoid single processes hogging the RPC layer by loading the queues. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Convert xprt receive queue to use an rbtreeTrond Myklebust
If the server is slow, we can find ourselves with quite a lot of entries on the receive queue. Converting the search from an O(n) to O(log(n)) can make a significant difference, particularly since we have to hold a number of locks while searching. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Cleanup: remove the unused 'task' argument from the request_send()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Clean up transport write space handlingTrond Myklebust
Treat socket write space handling in the same way we now treat transport congestion: by denying the XPRT_LOCK until the transport signals that it has free buffer space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Turn off throttling of RPC slots for TCP socketsTrond Myklebust
The theory was that we would need to grab the socket lock anyway, so we might as well use it to gate the allocation of RPC slots for a TCP socket. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Support for congestion control when queuing is enabledTrond Myklebust
Both RDMA and UDP transports require the request to get a "congestion control" credit before they can be transmitted. Right now, this is done when the request locks the socket. We'd like it to happen when a request attempts to be transmitted for the first time. In order to support retransmission of requests that already hold such credits, we also want to ensure that they get queued first, so that we don't deadlock with requests that have yet to obtain a credit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Improve latency for interactive tasksTrond Myklebust
One of the intentions with the priority queues was to ensure that no single process can hog the transport. The field task->tk_owner therefore identifies the RPC call's origin, and is intended to allow the RPC layer to organise queues for fairness. This commit therefore modifies the transmit queue to group requests by task->tk_owner, and ensures that we round robin among those groups. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Treat the task and request as separate in the xprt_ops->send_request()Trond Myklebust
When we shift to using the transmit queue, then the task that holds the write lock will not necessarily be the same as the one being transmitted. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Fix up the back channel transmitTrond Myklebust
Fix up the back channel code to recognise that it has already been transmitted, so does not need to be called again. Also ensure that we set req->rq_task. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Refactor RPC call encodingTrond Myklebust
Move the call encoding so that it occurs before the transport connection etc. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add a transmission queue for RPC requestsTrond Myklebust
Add the queue that will enforce the ordering of RPC task transmission. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Distinguish between the slot allocation list and receive queueTrond Myklebust
When storing a struct rpc_rqst on the slot allocation list, we currently use the same field 'rq_list' as we use to store the request on the receive queue. Since the structure is never on both lists at the same time, this is OK. However, for clarity, let's make that a union with different names for the different lists so that we can more easily distinguish between the two states. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Refactor xprt_transmit() to remove wait for reply codeTrond Myklebust
Allow the caller in clnt.c to call into the code to wait for a reply after calling xprt_transmit(). Again, the reason is that the backchannel code does not need this functionality. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Refactor xprt_transmit() to remove the reply queue codeTrond Myklebust
Separate out the action of adding a request to the reply queue so that the backchannel code can simply skip calling it altogether. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Rename xprt->recv_lock to xprt->queue_lockTrond Myklebust
We will use the same lock to protect both the transmit and receive queues. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add a helper to wake up a sleeping rpc_task and set its statusTrond Myklebust
Add a helper that will wake up a task that is sleeping on a specific queue, and will set the value of task->tk_status. This is mainly intended for use by the transport layer to notify the task of an error condition. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Refactor the transport request pinningTrond Myklebust
We are going to need to pin for both send and receive. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Add socket transmit queue offset trackingTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Rename TCP receive-specific state variablesTrond Myklebust
Since we will want to introduce similar TCP state variables for the transmission of requests, let's rename the existing ones to label that they are for the receive side. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Simplify identification of when the message send/receive is completeTrond Myklebust
Add states to indicate that the message send and receive are not yet complete. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>