summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-02-20Merge tag 'fs.idmapped.v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs idmapping updates from Christian Brauner: - Last cycle we introduced the dedicated struct mnt_idmap type for mount idmapping and the required infrastucture in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). As promised in last cycle's pull request message this converts everything to rely on struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this was a potential source for bugs. This finishes the conversion. Instead of passing the plain namespace around this updates all places that currently take a pointer to a mnt_userns with a pointer to struct mnt_idmap. Now that the conversion is done all helpers down to the really low-level helpers only accept a struct mnt_idmap argument instead of two namespace arguments. Conflating mount and other idmappings will now cause the compiler to complain loudly thus eliminating the possibility of any bugs. This makes it impossible for filesystem developers to mix up mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. Everything associated with struct mnt_idmap is moved into a single separate file. With that change no code can poke around in struct mnt_idmap. It can only be interacted with through dedicated helpers. That means all filesystems are and all of the vfs is completely oblivious to the actual implementation of idmappings. We are now also able to extend struct mnt_idmap as we see fit. For example, we can decouple it completely from namespaces for users that don't require or don't want to use them at all. We can also extend the concept of idmappings so we can cover filesystem specific requirements. In combination with the vfs{g,u}id_t work we finished in v6.2 this makes this feature substantially more robust and thus difficult to implement wrong by a given filesystem and also protects the vfs. - Enable idmapped mounts for tmpfs and fulfill a longstanding request. A long-standing request from users had been to make it possible to create idmapped mounts for tmpfs. For example, to share the host's tmpfs mount between multiple sandboxes. This is a prerequisite for some advanced Kubernetes cases. Systemd also has a range of use-cases to increase service isolation. And there are more users of this. However, with all of the other work going on this was way down on the priority list but luckily someone other than ourselves picked this up. As usual the patch is tiny as all the infrastructure work had been done multiple kernel releases ago. In addition to all the tests that we already have I requested that Rodrigo add a dedicated tmpfs testsuite for idmapped mounts to xfstests. It is to be included into xfstests during the v6.3 development cycle. This should add a slew of additional tests. * tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits) shmem: support idmapped mounts for tmpfs fs: move mnt_idmap fs: port vfs{g,u}id helpers to mnt_idmap fs: port fs{g,u}id helpers to mnt_idmap fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap fs: port i_{g,u}id_{needs_}update() to mnt_idmap quota: port to mnt_idmap fs: port privilege checking helpers to mnt_idmap fs: port inode_owner_or_capable() to mnt_idmap fs: port inode_init_owner() to mnt_idmap fs: port acl to mnt_idmap fs: port xattr to mnt_idmap fs: port ->permission() to pass mnt_idmap fs: port ->fileattr_set() to pass mnt_idmap fs: port ->set_acl() to pass mnt_idmap fs: port ->get_acl() to pass mnt_idmap fs: port ->tmpfile() to pass mnt_idmap fs: port ->rename() to pass mnt_idmap fs: port ->mknod() to pass mnt_idmap fs: port ->mkdir() to pass mnt_idmap ...
2023-02-20Merge tag 'iversion-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull i_version updates from Jeff Layton: "This overhauls how we handle i_version queries from nfsd. Instead of having special routines and grabbing the i_version field directly out of the inode in some cases, we've moved most of the handling into the various filesystems' getattr operations. As a bonus, this makes ceph's change attribute usable by knfsd as well. This should pave the way for future work to make this value queryable by userland, and to make it more resilient against rolling back on a crash" * tag 'iversion-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: nfsd: remove fetch_iversion export operation nfsd: use the getattr operation to fetch i_version nfsd: move nfsd4_change_attribute to nfsfh.c ceph: report the inode version in getattr if requested nfs: report the inode version in getattr if requested vfs: plumb i_version handling into struct kstat fs: clarify when the i_version counter must be updated fs: uninline inode_query_iversion
2023-02-20Merge tag 'locks-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "The main change here is that I've broken out most of the file locking definitions into a new header file. I also went ahead and completed the removal of locks_inode function" * tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fs: remove locks_inode filelock: move file locking definitions to separate header file
2023-02-20Merge tag 'tpm-v6.3-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "In additon to bug fixes, these are noteworthy changes: - In TPM I2C drivers, migrate from probe() to probe_new() (a new driver model in I2C). - TPM CRB: Pluton support - Add duplicate hash detection to the blacklist keyring in order to give more meaningful klog output than e.g. [1]" Link: https://askubuntu.com/questions/1436856/ubuntu-22-10-blacklist-problem-blacklisting-hash-13-message-on-boot [1] * tag 'tpm-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: add vendor flag to command code validation tpm: Add reserved memory event log tpm: Use managed allocation for bios event log tpm: tis_i2c: Convert to i2c's .probe_new() tpm: tpm_i2c_nuvoton: Convert to i2c's .probe_new() tpm: tpm_i2c_infineon: Convert to i2c's .probe_new() tpm: tpm_i2c_atmel: Convert to i2c's .probe_new() tpm: st33zp24: Convert to i2c's .probe_new() KEYS: asymmetric: Fix ECDSA use via keyctl uapi certs: don't try to update blacklist keys KEYS: Add new function key_create() certs: make blacklisted hash available in klog tpm_crb: Add support for CRB devices based on Pluton crypto: certs: fix FIPS selftest dependency
2023-02-20filemap: Remove lock_page_killable()Matthew Wilcox (Oracle)
There are no more callers; remove this function before any more appear. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-02-20Merge tag 'remove-get_kernel_pages-for-6.3' of ↵Linus Torvalds
https://git.linaro.org/people/jens.wiklander/linux-tee Pull TEE update from Jens Wiklander: "Remove get_kernel_pages() Vmalloc page support is removed from shm_get_kernel_pages() and the get_kernel_pages() call is replaced by calls to get_page(). With no remaining callers of get_kernel_pages() the function is removed" [ This looks like it's just some random 'tee' cleanup, but the bigger picture impetus for this is really to to to remove historical confusion with mixed use of kernel virtual addresses and 'struct page' pointers. Kernel virtual pointers in the vmalloc space is then particularly confusing - both for looking up a page pointer (when trying to then unify a "virtual address or page" interface) and _particularly_ when mixed with HIGHMEM support and the kmap*() family of remapping. This is particularly true with HIGHMEM getting much less test coverage with 32-bit architectures being increasingly legacy targets. So we actively wanted to remove get_kernel_pages() to make sure nobody else used it too, and thus the 'tee' part is "finally remove last user". See also commit 6647e76ab623 ("v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails") for a totally different version of a conceptually similar "let's stop this confusion of different ways of referring to memory". - Linus ] * tag 'remove-get_kernel_pages-for-6.3' of https://git.linaro.org/people/jens.wiklander/linux-tee: mm: Remove get_kernel_pages() tee: Remove call to get_kernel_pages() tee: Remove vmalloc page support highmem: Enhance is_kmap_addr() to check kmap_local_page() mappings
2023-02-20SUNRPC: Remove ->xpo_secure_port()Chuck Lever
There's no need for the cost of this extra virtual function call during every RPC transaction: the RQ_SECURE bit can be set properly in ->xpo_recvfrom() instead. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Clean up the svc_xprt_flags() macroChuck Lever
Make this macro more conventional: - Use BIT() instead of open-coding " 1UL << " - Don't display the "XPT_" in every flag name Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Move remaining internal definitions to gss_krb5_internal.hChuck Lever
The goal is to leave only protocol-defined items in gss_krb5.h so that it can be easily replaced by a generic header. Implementation specific items are moved to the new internal header. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Support the Camellia enctypesChuck Lever
RFC 6803 defines two encryption types that use Camellia ciphers (RFC 3713) and CMAC digests. Implement support for those in SunRPC's GSS Kerberos 5 mechanism. There has not been an explicit request to support these enctypes. However, this new set of enctypes provides a good alternative to the AES-SHA1 enctypes that are to be deprecated at some point. As this implementation is still a "beta", the default is to not build it automatically. Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add gk5e definitions for RFC 8009 encryption typesChuck Lever
Fill in entries in the supported_gss_krb5_enctypes array for the encryption types defined in RFC 8009. These new enctypes use the SHA-256 and SHA-384 message digest algorithms (as defined in FIPS-180) instead of the deprecated SHA-1 algorithm, and are thus more secure. Note that NIST has scheduled SHA-1 for deprecation: https://www.nist.gov/news-events/news/2022/12/nist-retires-sha-1-cryptographic-algorithm Thus these new encryption types are placed under a separate CONFIG option to enable distributors to separately introduce support for the AES-SHA2 enctypes and deprecate support for the current set of AES-SHA1 encryption types as their user space allows. As this implementation is still a "beta", the default is to not build it automatically. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add new subkey length fieldsChuck Lever
The aes256-cts-hmac-sha384-192 enctype specifies the length of its checksum and integrity subkeys as 192 bits, but the length of its encryption subkey (Ke) as 256 bits. Add new fields to struct gss_krb5_enctype that specify the key lengths individually, and where needed, use the correct new field instead of ->keylength. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Hoist KDF into struct gss_krb5_enctypeChuck Lever
Each Kerberos enctype can have a different KDF. Refactor the key derivation path to support different KDFs for the enctypes introduced in subsequent patches. In particular, expose the key derivation function in struct gss_krb5_enctype instead of the enctype's preferred random-to-key function. The latter is usually the identity function and is only ever called during key derivation, so have each KDF call it directly. A couple of extra clean-ups: - Deduplicate the set_cdata() helper - Have ->derive_key return negative errnos, in accordance with usual kernel coding conventions This patch is a little bigger than I'd like, but these are all mechanical changes and they are all to the same areas of code. No behavior change is intended. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Rename .encrypt_v2 and .decrypt_v2 methodsChuck Lever
Clean up: there is now only one encrypt and only one decrypt method, thus there is no longer a need for the v2-suffixed method names. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove ->encrypt and ->decrypt methods from struct gss_krb5_enctypeChuck Lever
Clean up: ->encrypt is set to only one value. Replace the two remaining call sites with direct calls to krb5_encrypt(). There have never been any call sites for the ->decrypt() method. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Replace KRB5_SUPPORTED_ENCTYPES macroChuck Lever
Now that all consumers of the KRB5_SUPPORTED_ENCTYPES macro are within the SunRPC layer, the macro can be replaced with something private and more flexible. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove another switch on ctx->enctypeChuck Lever
Replace another switch on encryption type so that it does not have to be modified when adding or removing support for an enctype. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Refactor the GSS-API Per Message calls in the Kerberos mechanismChuck Lever
Replace a number of switches on encryption type so that all of them don't have to be modified when adding or removing support for an enctype. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Obscure Kerberos integrity keysChuck Lever
There's no need to keep the integrity keys around if we instead allocate and key a pair of ahashes and keep those. This not only enables the subkeys to be destroyed immediately after deriving them, but it makes the Kerberos integrity code path more efficient. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Obscure Kerberos signing keysChuck Lever
There's no need to keep the signing keys around if we instead allocate and key an ahash and keep that. This not only enables the subkeys to be destroyed immediately after deriving them, but it makes the Kerberos signing code path more efficient. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Obscure Kerberos encryption keysChuck Lever
The encryption subkeys are not used after the cipher transforms have been allocated and keyed. There is no need to retain them in struct krb5_ctx. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Refactor set-up for aux_cipherChuck Lever
Hoist the name of the aux_cipher into struct gss_krb5_enctype to prepare for obscuring the encryption keys just after they are derived. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Improve Kerberos confounder generationChuck Lever
Other common Kerberos implementations use a fully random confounder for encryption. The reason for this is explained in the new comment added by this patch. The current get_random_bytes() implementation does not exhaust system entropy. Since confounder generation is part of Kerberos itself rather than the GSS-API Kerberos mechanism, the function is renamed and moved. Note that light top-down analysis shows that the SHA-1 transform is by far the most CPU-intensive part of encryption. Thus we do not expect this change to result in a significant performance impact. However, eventually it might be necessary to generate an independent stream of confounders for each Kerberos context to help improve I/O parallelism. Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove .conflen field from struct gss_krb5_enctypeChuck Lever
Now that arcfour-hmac is gone, the confounder length is again the same as the cipher blocksize for every implemented enctype. The gss_krb5_enctype::conflen field is no longer necessary. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove .blocksize field from struct gss_krb5_enctypeChuck Lever
It is not clear from documenting comments, specifications, or code usage what value the gss_krb5_enctype.blocksize field is supposed to store. The "encryption blocksize" depends only on the cipher being used, so that value can be derived where it's needed instead of stored as a constant. RFC 3961 Section 5.2 says: > cipher block size, c > This is the block size of the block cipher underlying the > encryption and decryption functions indicated above, used for key > derivation and for the size of the message confounder and initial > vector. (If a block cipher is not in use, some comparable > parameter should be determined.) It must be at least 5 octets. > > This is not actually an independent parameter; rather, it is a > property of the functions E and D. It is listed here to clarify > the distinction between it and the message block size, m. In the Linux kernel's implemenation of the SunRPC RPCSEC GSS Kerberos 5 mechanism, the cipher block size, which is dependent on the encryption and decryption transforms, is used only in krb5_derive_key(), so it is straightforward to replace it. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add header ifdefs to linux/sunrpc/gss_krb5.hChuck Lever
Standard convention: Ensure the contents of the header are included only once per source file. Tested-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Replace pool stats with per-CPU variablesChuck Lever
Eliminate the use of bus-locked operations in svc_xprt_enqueue(), which is a hot path. Replace them with per-cpu variables to reduce cross-CPU memory bus traffic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Use per-CPU counters to tally server RPC countsChuck Lever
- Improves counting accuracy - Reduces cross-CPU memory traffic Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Set rq_accept_statp inside ->accept methodsChuck Lever
To navigate around the space that svcauth_gss_accept() reserves for the RPC payload body length and sequence number fields, svcauth_gss_release() does a little dance with the reply's accept_stat, moving the accept_stat value in the response buffer down by two words. Instead, let's have the ->accept() methods each set the proper final location of the accept_stat to avoid having to move things. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Refactor RPC server dispatch methodChuck Lever
Currently, svcauth_gss_accept() pre-reserves response buffer space for the RPC payload length and GSS sequence number before returning to the dispatcher, which then adds the header's accept_stat field. The problem is the accept_stat field is supposed to go before the length and seq_num fields. So svcauth_gss_release() has to relocate the accept_stat value (see svcauth_gss_prepare_to_wrap()). To enable these fields to be added to the response buffer in the correct (final) order, the pointer to the accept_stat has to be made available to svcauth_gss_accept() so that it can set it before reserving space for the length and seq_num fields. As a first step, move the pointer to the location of the accept_stat field into struct svc_rqst. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Remove no-longer-used helper functionsChuck Lever
The svc_get/put helpers are no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Convert unwrap data paths to use xdr_stream for repliesChuck Lever
We're now moving svcxdr_init_encode() to /before/ the flavor's ->accept method has set rq_auth_slack. Add a helper that can set rq_auth_slack /after/ svcxdr_init_encode() has been called. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Push svcxdr_init_encode() into svc_process_common()Chuck Lever
Now that all vs_dispatch functions invoke svcxdr_init_encode(), it is common code and can be pushed down into the generic RPC server. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add XDR encoding helper for opaque_authChuck Lever
RFC 5531 defines an MSG_ACCEPTED Reply message like this: struct accepted_reply { opaque_auth verf; union switch (accept_stat stat) { case SUCCESS: ... In the current server code, struct opaque_auth encoding is open- coded. Introduce a helper that encodes an opaque_auth data item within the context of a xdr_stream. Done as part of hardening the server-side RPC header decoding and encoding paths. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Record gss_wrap() errors in svcauth_gss_wrap_priv()Chuck Lever
Match the error reporting in the other unwrap and wrap functions. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Record gss_get_mic() errors in svcauth_gss_wrap_integ()Chuck Lever
An error computing the checksum here is an exceptional event. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20NFSD: enhance inter-server copy cleanupDai Ngo
Currently nfsd4_setup_inter_ssc returns the vfsmount of the source server's export when the mount completes. After the copy is done nfsd4_cleanup_inter_ssc is called with the vfsmount of the source server and it searches nfsd_ssc_mount_list for a matching entry to do the clean up. The problems with this approach are (1) the need to search the nfsd_ssc_mount_list and (2) the code has to handle the case where the matching entry is not found which looks ugly. The enhancement is instead of nfsd4_setup_inter_ssc returning the vfsmount, it returns the nfsd4_ssc_umount_item which has the vfsmount embedded in it. When nfsd4_cleanup_inter_ssc is called it's passed with the nfsd4_ssc_umount_item directly to do the clean up so no searching is needed and there is no need to handle the 'not found' case. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [ cel: adjusted whitespace and variable/function names ] Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
2023-02-20SUNRPC: Convert unwrap_priv_data() to use xdr_streamChuck Lever
Done as part of hardening the server-side RPC header decoding path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Convert unwrap_integ_data() to use xdr_streamChuck Lever
Done as part of hardening the server-side RPC header decoding path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Convert svcauth_unix_accept() to use xdr_streamChuck Lever
Done as part of hardening the server-side RPC header decoding path. Since the server-side of the Linux kernel SunRPC implementation ignores the contents of the Call's machinename field, there's no need for its RPC_AUTH_UNIX authenticator to reject names that are larger than UNX_MAXNODENAME. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20SUNRPC: Add an XDR decoding helper for struct opaque_authChuck Lever
RFC 5531 defines the body of an RPC Call message like this: struct call_body { unsigned int rpcvers; unsigned int prog; unsigned int vers; unsigned int proc; opaque_auth cred; opaque_auth verf; /* procedure-specific parameters start here */ }; In the current server code, decoding a struct opaque_auth type is open-coded in several places, and is thus difficult to harden everywhere. Introduce a helper for decoding an opaque_auth within the context of a xdr_stream. This helper can be shared with all authentication flavor implemenations, even on the client-side. Done as part of hardening the server-side RPC header decoding paths. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20fs: namei: Allow follow_down() to uncover auto mountsRichard Weinberger
This function is only used by NFSD to cross mount points. If a mount point is of type auto mount, follow_down() will not uncover it. Add LOOKUP_AUTOMOUNT to the lookup flags to have ->d_automount() called when NFSD walks down the mount tree. Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Ian Kent <raven@themaw.net> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20net: make default_rps_mask a per netns attributePaolo Abeni
That really was meant to be a per netns attribute from the beginning. The idea is that once proper isolation is in place in the main namespace, additional demux in the child namespaces will be redundant. Let's make child netns default rps mask empty by default. To avoid bloating the netns with a possibly large cpumask, allocate it on-demand during the first write operation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20Merge tag 'kvmarm-6.3' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.3 - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place. - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company). - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests. - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM. - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested. - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems. - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor. - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host. - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer This also drags in arm64's 'for-next/sme2' branch, because both it and the PSCI relay changes touch the EL2 initialization code.
2023-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for net-next: 1) Add safeguard to check for NULL tupe in objects updates via NFT_MSG_NEWOBJ, this should not ever happen. From Alok Tiwari. 2) Incorrect pointer check in the new destroy rule command, from Yang Yingliang. 3) Incorrect status bitcheck in nf_conntrack_udp_packet(), from Florian Westphal. 4) Simplify seq_print_acct(), from Ilia Gavrilov. 5) Use 2-arg optimal variant of kfree_rcu() in IPVS, from Julian Anastasov. 6) TCP connection enters CLOSE state in conntrack for locally originated TCP reset packet from the reject target, from Florian Westphal. The fixes #2 and #3 in this series address issues from the previous pull nf-next request in this net-next cycle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_NS_OTHERHOSTEric Dumazet
Hosts can often receive neighbour discovery messages that are not for them. Use a dedicated drop reason to make clear the packet is dropped for this normal case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_BAD_OPTIONSEric Dumazet
This is a generic drop reason for any error detected in ndisc_parse_options(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20net: add location to trace_consume_skb()Eric Dumazet
kfree_skb() includes the location, it makes sense to add it to consume_skb() as well. After patch: taskd_EventMana 8602 [004] 420.406239: skb:consume_skb: skbaddr=0xffff893a4a6d0500 location=unix_stream_read_generic swapper 0 [011] 422.732607: skb:consume_skb: skbaddr=0xffff89597f68cee0 location=mlx4_en_free_tx_desc discipline 9141 [043] 423.065653: skb:consume_skb: skbaddr=0xffff893a487e9c00 location=skb_consume_udp swapper 0 [010] 423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv borglet 8672 [014] 425.628256: skb:consume_skb: skbaddr=0xffff8949c42e9400 location=netlink_dump swapper 0 [028] 426.263317: skb:consume_skb: skbaddr=0xffff893b1589dce0 location=net_rx_action wget 14339 [009] 426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20rxrpc: Fix overproduction of wakeups to recvmsg()David Howells
Fix three cases of overproduction of wakeups: (1) rxrpc_input_split_jumbo() conditionally notifies the app that there's data for recvmsg() to collect if it queues some data - and then its only caller, rxrpc_input_data(), goes and wakes up recvmsg() anyway. Fix the rxrpc_input_data() to only do the wakeup in failure cases. (2) If a DATA packet is received for a call by the I/O thread whilst recvmsg() is busy draining the call's rx queue in the app thread, the call will left on the recvmsg() queue for recvmsg() to pick up, even though there isn't any data on it. This can cause an unexpected recvmsg() with a 0 return and no MSG_EOR set after the reply has been posted to a service call. Fix this by discarding pending calls from the recvmsg() queue that don't need servicing yet. (3) Not-yet-completed calls get requeued after having data read from them, even if they have no data to read. Fix this by only requeuing them if they have data waiting on them; if they don't, the I/O thread will requeue them when data arrives or they fail. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/3386149.1676497685@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-19IB/mlx5: Extend debug control for CC parametersEdward Srouji
This patch adds rtt_resp_dscp to the current debug controllability of congestion control (CC) parameters. rtt_resp_dscp can be read or written through debugfs. If set, its value overwrites the DSCP of the generated RTT response. Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/1dcc3440ee53c688f19f579a051ded81a2aaa70a.1676538714.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>