summaryrefslogtreecommitdiff
path: root/net/sunrpc
AgeCommit message (Collapse)Author
2022-12-13Merge tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust "Bugfixes: - Fix NULL pointer dereference in the mount parser - Fix memory stomp in decode_attr_security_label - Fix credential leak in _nfs4_discover_trunking() - Fix buffer leak in rpcrdma_req_create() - Fix leaked socket in rpc_sockname() - Fix deadlock between nfs4_open_recover_helper() and delegreturn - Fix an Oops in nfs_d_automount() - Fix potential race in nfs_call_unlink() - Multiple fixes for the open context mode - NFSv4.2 READ_PLUS fixes - Fix a regression in which small rsize/wsize values are being forbidden - Fail client initialisation if the NFSv4.x state manager thread can't run - Avoid spurious warning of lost lock that is being unlocked. - Ensure the initialisation of struct nfs4_label Features and cleanups: - Trigger the "ls -l" readdir heuristic sooner - Clear the file access cache upon login to ensure supplementary group info is in sync between the client and server - pnfs: Fix up the logging of layout stateids - NFSv4.2: Change the default KConfig value for READ_PLUS - Use sysfs_emit() instead of scnprintf() where appropriate" * tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) NFSv4.2: Change the default KConfig value for READ_PLUS NFSv4.x: Fail client initialisation if state manager thread can't run fs: nfs: sysfs: use sysfs_emit() to instead of scnprintf() NFS: use sysfs_emit() to instead of scnprintf() NFS: Allow very small rsize & wsize again NFSv4.2: Fix up READ_PLUS alignment NFSv4.2: Set the correct size scratch buffer for decoding READ_PLUS SUNRPC: Fix missing release socket in rpc_sockname() xprtrdma: Fix regbuf data not freed in rpcrdma_req_create() NFS: avoid spurious warning of lost lock that is being unlocked. nfs: fix possible null-ptr-deref when parsing param NFSv4: check FMODE_EXEC from open context mode in nfs4_opendata_access() NFS: make sure open context mode have FMODE_EXEC when file open for exec NFS4.x/pnfs: Fix up logging of layout stateids NFS: Fix a race in nfs_call_unlink() NFS: Fix an Oops in nfs_d_automount() NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturn NFSv4: Fix a credential leak in _nfs4_discover_trunking() NFS: Trigger the "ls -l" readdir heuristic sooner NFSv4.2: Fix initialisation of struct nfs4_label ...
2022-12-12Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "This release introduces support for the CB_RECALL_ANY operation. NFSD can send this operation to request that clients return any delegations they choose. The server uses this operation to handle low memory scenarios or indicate to a client when that client has reached the maximum number of delegations the server supports. The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst support for sparse files in local filesystems and the VFS is improved. Two major data structure fixes appear in this release: - The nfs4_file hash table is replaced with a resizable hash table to reduce the latency of NFSv4 OPEN operations. - Reference counting in the NFSD filecache has been hardened against races. In furtherance of removing support for NFSv2 in a subsequent kernel release, a new Kconfig option enables server-side support for NFSv2 to be left out of a kernel build. MAINTAINERS has been updated to indicate that changes to fs/exportfs should go through the NFSD tree" * tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits) NFSD: Avoid clashing function prototypes SUNRPC: Fix crasher in unwrap_integ_data() SUNRPC: Make the svc_authenticate tracepoint conditional NFSD: Use only RQ_DROPME to signal the need to drop a reply SUNRPC: Clean up xdr_write_pages() SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails NFSD: add CB_RECALL_ANY tracepoints NFSD: add delegation reaper to react to low memory condition NFSD: add support for sending CB_RECALL_ANY NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker trace: Relocate event helper files NFSD: pass range end to vfs_fsync_range() instead of count lockd: fix file selection in nlmsvc_cancel_blocked lockd: ensure we use the correct file descriptor when unlocking lockd: set missing fl_flags field when retrieving args NFSD: Use struct_size() helper in alloc_session() nfsd: return error if nfs4_setacl fails lockd: set other missing fields when unlocking files NFSD: Add an nfsd_file_fsync tracepoint sunrpc: svc: Remove an unused static function svc_ungetu32() ...
2022-12-12Merge tag 'pull-iov_iter' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: "iov_iter work; most of that is about getting rid of direction misannotations and (hopefully) preventing more of the same for the future" * tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: use less confusing names for iov_iter direction initializers iov_iter: saner checks for attempt to copy to/from iterator [xen] fix "direction" argument of iov_iter_kvec() [vhost] fix 'direction' argument of iov_iter_{init,bvec}() [target] fix iov_iter_bvec() "direction" argument [s390] memcpy_real(): WRITE is "data source", not destination... [s390] zcore: WRITE is "data source", not destination... [infiniband] READ is "data destination", not source... [fsi] WRITE is "data source", not destination... [s390] copy_oldmem_kernel() - WRITE is "data source", not destination csum_and_copy_to_iter(): handle ITER_DISCARD get rid of unlikely() on page_copy_sane() calls
2022-12-12Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
2022-12-12Merge tag 'timers-core-2022-12-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for timers, timekeeping and drivers: Core: - The timer_shutdown[_sync]() infrastructure: Tearing down timers can be tedious when there are circular dependencies to other things which need to be torn down. A prime example is timer and workqueue where the timer schedules work and the work arms the timer. What needs to prevented is that pending work which is drained via destroy_workqueue() does not rearm the previously shutdown timer. Nothing in that shutdown sequence relies on the timer being functional. The conclusion was that the semantics of timer_shutdown_sync() should be: - timer is not enqueued - timer callback is not running - timer cannot be rearmed Preventing the rearming of shutdown timers is done by discarding rearm attempts silently. A warning for the case that a rearm attempt of a shutdown timer is detected would not be really helpful because it's entirely unclear how it should be acted upon. The only way to address such a case is to add 'if (in_shutdown)' conditionals all over the place. This is error prone and in most cases of teardown not required all. - The real fix for the bluetooth HCI teardown based on timer_shutdown_sync(). A larger scale conversion to timer_shutdown_sync() is work in progress. - Consolidation of VDSO time namespace helper functions - Small fixes for timer and timerqueue Drivers: - Prevent integer overflow on the XGene-1 TVAL register which causes an never ending interrupt storm. - The usual set of new device tree bindings - Small fixes and improvements all over the place" * tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) dt-bindings: timer: renesas,cmt: Add r8a779g0 CMT support dt-bindings: timer: renesas,tmu: Add r8a779g0 support clocksource/drivers/arm_arch_timer: Use kstrtobool() instead of strtobool() clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock() clocksource/drivers/timer-ti-dm: Clear settings on probe and free clocksource/drivers/timer-ti-dm: Make timer_get_irq static clocksource/drivers/timer-ti-dm: Fix warning for omap_timer_match clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use dt-bindings: timer: nuvoton,npcm7xx-timer: Allow specifying all clocks dt-bindings: timer: rockchip: Add rockchip,rk3128-timer clockevents: Repair kernel-doc for clockevent_delta2ns() clocksource/drivers/ingenic-ost: Define pm functions properly in platform_driver struct clocksource/drivers/sh_cmt: Access registers according to spec vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy Bluetooth: hci_qca: Fix the teardown problem for real timers: Update the documentation to reflect on the new timer_shutdown() API timers: Provide timer_shutdown[_sync]() timers: Add shutdown mechanism to the internal functions timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode ...
2022-12-10SUNRPC: Fix crasher in unwrap_integ_data()Chuck Lever
If a zero length is passed to kmalloc() it returns 0x10, which is not a valid address. gss_verify_mic() subsequently crashes when it attempts to dereference that pointer. Instead of allocating this memory on every call based on an untrusted size value, use a piece of dynamically-allocated scratch memory that is always available. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-12-10SUNRPC: Make the svc_authenticate tracepoint conditionalChuck Lever
Clean up: Simplify the tracepoint's only call site. Also, I noticed that when svc_authenticate() returns SVC_COMPLETE, it leaves rq_auth_stat set to an error value. That doesn't need to be recorded in the trace log. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-12-10SUNRPC: Clean up xdr_write_pages()Chuck Lever
Make it more evident how xdr_write_pages() updates the tail buffer by using the convention of naming the iov pointer variable "tail". I spent more than a couple of hours chasing through code to understand this, so someone is likely to find this useful later. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-12-10SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() failsChuck Lever
Fixes: 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS authentication.") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-12-06SUNRPC: Fix missing release socket in rpc_sockname()Wang ShaoBo
socket dynamically created is not released when getting an unintended address family type in rpc_sockname(), direct to out_release for calling sock_release(). Fixes: 2e738fdce22f ("SUNRPC: Add API to acquire source address") Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-12-06xprtrdma: Fix regbuf data not freed in rpcrdma_req_create()Zhang Xiaoxu
If rdma receive buffer allocate failed, should call rpcrdma_regbuf_free() to free the send buffer, otherwise, the buffer data will be leaked. Fixes: bb93a1ae2bf4 ("xprtrdma: Allocate req's regbufs at xprt create time") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-11-28SUNRPC: Remove unused svc_rqst::rq_lock fieldChuck Lever
Clean up after commit 22700f3c6df5 ("SUNRPC: Improve ordering of transport processing"). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-25use less confusing names for iov_iter direction initializersAl Viro
READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-11-24timers: Get rid of del_singleshot_timer_sync()Thomas Gleixner
del_singleshot_timer_sync() used to be an optimization for deleting timers which are not rearmed from the timer callback function. This optimization turned out to be broken and got mapped to del_timer_sync() about 17 years ago. Get rid of the undocumented indirection and use del_timer_sync() directly. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Link: https://lore.kernel.org/r/20221123201624.706987932@linutronix.de
2022-11-18treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld
This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-27SUNRPC: Fix crasher in gss_unwrap_resp_integ()Chuck Lever
If a zero length is passed to kmalloc() it returns 0x10, which is not a valid address. gss_unwrap_resp_integ() subsequently crashes when it attempts to dereference that pointer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27SUNRPC: Fix null-ptr-deref when xps sysfs alloc failedZhang Xiaoxu
There is a null-ptr-deref when xps sysfs alloc failed: BUG: KASAN: null-ptr-deref in sysfs_do_create_link_sd+0x40/0xd0 Read of size 8 at addr 0000000000000030 by task gssproxy/457 CPU: 5 PID: 457 Comm: gssproxy Not tainted 6.0.0-09040-g02357b27ee03 #9 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 kasan_report+0xa3/0x120 sysfs_do_create_link_sd+0x40/0xd0 rpc_sysfs_client_setup+0x161/0x1b0 rpc_new_client+0x3fc/0x6e0 rpc_create_xprt+0x71/0x220 rpc_create+0x1d4/0x350 gssp_rpc_create+0xc3/0x160 set_gssp_clnt+0xbc/0x140 write_gssp+0x116/0x1a0 proc_reg_write+0xd6/0x130 vfs_write+0x177/0x690 ksys_write+0xb9/0x150 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 When the xprt_switch sysfs alloc failed, should not add xprt and switch sysfs to it, otherwise, maybe null-ptr-deref; also initialize the 'xps_sysfs' to NULL to avoid oops when destroy it. Fixes: 2a338a543163 ("sunrpc: add a symlink from rpc-client directory to the xprt_switch") Fixes: d408ebe04ac5 ("sunrpc: add add sysfs directory per xprt under each xprt_switch") Fixes: baea99445dd4 ("sunrpc: add xprt_switch direcotry to sunrpc's sysfs") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-16Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
2022-10-13Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "New Features: - Add NFSv4.2 xattr tracepoints - Replace xprtiod WQ in rpcrdma - Flexfiles cancels I/O on layout recall or revoke Bugfixes and Cleanups: - Directly use ida_alloc() / ida_free() - Don't open-code max_t() - Prefer using strscpy over strlcpy - Remove unused forward declarations - Always return layout states on flexfiles layout return - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error - Allow more xprtrdma memory allocations to fail without triggering a reclaim - Various other xprtrdma clean ups - Fix rpc_killall_tasks() races" * tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits) NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked SUNRPC: Add API to force the client to disconnect SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls SUNRPC: Fix races with rpc_killall_tasks() xprtrdma: Fix uninitialized variable xprtrdma: Prevent memory allocations from driving a reclaim xprtrdma: Memory allocation should be allowed to fail during connect xprtrdma: MR-related memory allocation should be allowed to fail xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc() xprtrdma: Clean up synopsis of rpcrdma_req_create() svcrdma: Clean up RPCRDMA_DEF_GFP SUNRPC: Replace the use of the xprtiod WQ in rpcrdma NFSv4.2: Add a tracepoint for listxattr NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2 NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration NFSv4: remove nfs4_renewd_prepare_shutdown() declaration fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment NFSv4/pNFS: Always return layout stats on layout return for flexfiles ...
2022-10-11treewide: use get_random_u32() when possibleJason A. Donenfeld
The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Acked-by: Helge Deller <deller@gmx.de> # for parisc Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld
Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-10Merge tag 'sched-core-2022-10-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Debuggability: - Change most occurances of BUG_ON() to WARN_ON_ONCE() - Reorganize & fix TASK_ state comparisons, turn it into a bitmap - Update/fix misc scheduler debugging facilities Load-balancing & regular scheduling: - Improve the behavior of the scheduler in presence of lot of SCHED_IDLE tasks - in particular they should not impact other scheduling classes. - Optimize task load tracking, cleanups & fixes - Clean up & simplify misc load-balancing code Freezer: - Rewrite the core freezer to behave better wrt thawing and be simpler in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting all the fallout. Deadline scheduler: - Fix the DL capacity-aware code - Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period() - Relax/optimize locking in task_non_contending() Cleanups: - Factor out the update_current_exec_runtime() helper - Various cleanups, simplifications" * tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) sched: Fix more TASK_state comparisons sched: Fix TASK_state comparisons sched/fair: Move call to list_last_entry() in detach_tasks sched/fair: Cleanup loop_max and loop_break sched/fair: Make sure to try to detach at least one movable task sched: Show PF_flag holes freezer,sched: Rewrite core freezer logic sched: Widen TAKS_state literals sched/wait: Add wait_event_state() sched/completion: Add wait_for_completion_state() sched: Add TASK_ANY for wait_task_inactive() sched: Change wait_task_inactive()s match_state freezer,umh: Clean up freezer/initrd interaction freezer: Have {,un}lock_system_sleep() save/restore flags sched: Rename task_running() to task_on_cpu() sched/fair: Cleanup for SIS_PROP sched/fair: Default to false in test_idle_cores() sched/fair: Remove useless check in select_idle_core() sched/fair: Avoid double search on same cpu sched/fair: Remove redundant check in select_idle_smt() ...
2022-10-06SUNRPC: Add API to force the client to disconnectTrond Myklebust
Allow the caller to force a disconnection of the RPC client so that we can clear any pending requests that are buffered in the socket. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-06SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC callsTrond Myklebust
Add the helper rpc_cancel_tasks(), which uses a caller-defined selection function to define a set of in-flight RPC calls to cancel. This is mainly intended for pNFS drivers which are subject to a layout recall, and which may therefore want to cancel all pending I/O using that layout in order to redrive it after the layout recall has been satisfied. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-06SUNRPC: Fix races with rpc_killall_tasks()Trond Myklebust
Ensure that we immediately call rpc_exit_task() after waking up, and that the tk_rpc_status cannot get clobbered by some other function. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: Fix uninitialized variableChuck Lever
net/sunrpc/xprtrdma/frwr_ops.c:151:32: warning: variable 'rc' is uninitialized when used here [-Wuninitialized] trace_xprtrdma_frwr_alloc(mr, rc); ^~ net/sunrpc/xprtrdma/frwr_ops.c:127:8: note: initialize the variable 'rc' to silence this warning int rc; ^ = 0 1 warning generated. The tracepoint is intended to record the error returned from ib_alloc_mr(). In the current code there is no other purpose for @rc, so simply replace it. Reported-by: kernel test robot <lkp@intel.com> Fixes: d8cf39a280c3b0 ('xprtrdma: MR-related memory allocation should be allowed to fail') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: Prevent memory allocations from driving a reclaimChuck Lever
Many memory allocations that xprtrdma does can fail safely. Let's use this fact to avoid some potential deadlocks: Replace GFP_KERNEL with GFP flags that do not try hard to acquire memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: Memory allocation should be allowed to fail during connectChuck Lever
An attempt to establish a connection can always fail and then be retried. GFP_KERNEL allocation is not necessary here. Like MR allocation, establishing a connection is always done in a worker thread. The new GFP flags align with the flags that would be returned by rpc_task_gfp_mask() in this case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: MR-related memory allocation should be allowed to failChuck Lever
xprtrdma always drives a retry of MR allocation if it should fail. It should be safe to not use GFP_KERNEL for this purpose rather than sleeping in the memory allocator. In theory, if these weaker allocations are attempted first, memory exhaustion is likely to cause xprtrdma to fail fast and not then invoke the RDMA core APIs, which still might use GFP_KERNEL. Also note that rpc_task_gfp_mask() always sets __GFP_NORETRY and __GFP_NOWARN when an RPC-related allocation is being done in a worker thread. MR allocation is already always done in worker threads. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()Chuck Lever
Currently all rpcrdma_regbuf_alloc() call sites pass the same value as their third argument. That argument can therefore be eliminated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05xprtrdma: Clean up synopsis of rpcrdma_req_create()Chuck Lever
Commit 1769e6a816df ("xprtrdma: Clean up rpcrdma_create_req()") added rpcrdma_req_create() with a GFP flags argument in case a caller might want to avoid waiting for memory. There has never been a caller that does not pass GFP_KERNEL as the third argument. That argument can therefore be eliminated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05svcrdma: Clean up RPCRDMA_DEF_GFPChuck Lever
xprt_rdma_bc_allocate() is now the only user of RPCRDMA_DEF_GFP. Replace that macro with the raw flags. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05SUNRPC: Replace the use of the xprtiod WQ in rpcrdmaChuck Lever
While setting up a new lab, I accidentally misconfigured the Ethernet port for a system that tried an NFS mount using RoCE. This made the NFS server unreachable. The following WARNING popped on the NFS client while waiting for the mount attempt to time out: kernel: workqueue: WQ_MEM_RECLAIM xprtiod:xprt_rdma_connect_worker [rpcrdma] is flushing !WQ_MEM_RECLAI> kernel: WARNING: CPU: 0 PID: 100 at kernel/workqueue.c:2628 check_flush_dependency+0xbf/0xca kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs 8021q garp stp mrp llc rfkill rpcrdma> kernel: CPU: 0 PID: 100 Comm: kworker/u8:8 Not tainted 6.0.0-rc1-00002-g6229f8c054e5 #13 kernel: Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0b 06/12/2017 kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma] kernel: RIP: 0010:check_flush_dependency+0xbf/0xca kernel: Code: 75 2a 48 8b 55 18 48 8d 8b b0 00 00 00 4d 89 e0 48 81 c6 b0 00 00 00 48 c7 c7 65 33 2e be> kernel: RSP: 0018:ffffb562806cfcf8 EFLAGS: 00010092 kernel: RAX: 0000000000000082 RBX: ffff97894f8c3c00 RCX: 0000000000000027 kernel: RDX: 0000000000000002 RSI: ffffffffbe3447d1 RDI: 00000000ffffffff kernel: RBP: ffff978941315840 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 00000000000008b0 R11: 0000000000000001 R12: ffffffffc0ce3731 kernel: R13: ffff978950c00500 R14: ffff97894341f0c0 R15: ffff978951112eb0 kernel: FS: 0000000000000000(0000) GS:ffff97987fc00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007f807535eae8 CR3: 000000010b8e4002 CR4: 00000000003706f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: <TASK> kernel: __flush_work.isra.0+0xaf/0x188 kernel: ? _raw_spin_lock_irqsave+0x2c/0x37 kernel: ? lock_timer_base+0x38/0x5f kernel: __cancel_work_timer+0xea/0x13d kernel: ? preempt_latency_start+0x2b/0x46 kernel: rdma_addr_cancel+0x70/0x81 [ib_core] kernel: _destroy_id+0x1a/0x246 [rdma_cm] kernel: rpcrdma_xprt_connect+0x115/0x5ae [rpcrdma] kernel: ? _raw_spin_unlock+0x14/0x29 kernel: ? raw_spin_rq_unlock_irq+0x5/0x10 kernel: ? finish_task_switch.isra.0+0x171/0x249 kernel: xprt_rdma_connect_worker+0x3b/0xc7 [rpcrdma] kernel: process_one_work+0x1d8/0x2d4 kernel: worker_thread+0x18b/0x24f kernel: ? rescuer_thread+0x280/0x280 kernel: kthread+0xf4/0xfc kernel: ? kthread_complete_and_exit+0x1b/0x1b kernel: ret_from_fork+0x22/0x30 kernel: </TASK> SUNRPC's xprtiod workqueue is WQ_MEM_RECLAIM, so any workqueue that one of its work items tries to cancel has to be WQ_MEM_RECLAIM to prevent a priority inversion. The internal workqueues in the RDMA/core are currently non-MEM_RECLAIM. Jason Gunthorpe says this about the current state of RDMA/core: > If you attempt to do a reconnection/etc from within a RECLAIM > context it will deadlock on one of the many allocations that are > made to support opening the connection. > > The general idea of reclaim is that the entire task context > working under the reclaim is marked with an override of the gfp > flags to make all allocations under that call chain reclaim safe. > > But rdmacm does allocations outside this, eg in the WQs processing > the CM packets. So this doesn't work and we will deadlock. > > Fixing it is a big deal and needs more than poking WQ_MEM_RECLAIM > here and there. So we will change the ULP in this case to avoid the use of WQ_MEM_RECLAIM where possible. Deadlocks that were possible before are not fixed, but at least we no longer have a false sense of confidence that the stack won't allocate memory during memory reclaim. Suggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03SUNRPC: move from strlcpy with unused retval to strscpyWolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03SUNRPC: use max_t() to simplify open codeZiyang Xuan
Use max_t() to simplify open code which uses "if...else" to get maximum of two values. Generated by coccinelle script: scripts/coccinelle/misc/minmax.cocci Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-03SUNRPC: Directly use ida_alloc()/free()Bo Liu
Use ida_alloc()/ida_free() instead of ida_simple_get()/ida_simple_remove(). The latter is deprecated and more verbose. Signed-off-by: Bo Liu <liubo03@inspur.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-09-26SUNRPC: Fix typo in xdr_buf_subsegment's kdoc commentChuck Lever
Fix a typo. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26NFSD: Refactor common code out of dirlist helpersChuck Lever
The dust has settled a bit and it's become obvious what code is totally common between nfsd_init_dirlist_pages() and nfsd3_init_dirlist_pages(). Move that common code to SUNRPC. The new helper brackets the existing xdr_init_decode_pages() API. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26SUNRPC: Clarify comment that documents svc_max_payload()Chuck Lever
Note the function returns a per-transport value, not a per-request value (eg, one that is related to the size of the available send or receive buffer space). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26SUNRPC: Parametrize how much of argsize should be zeroedChuck Lever
Currently, SUNRPC clears the whole of .pc_argsize before processing each incoming RPC transaction. Add an extra parameter to struct svc_procedure to enable upper layers to reduce the amount of each operation's argument structure that is zeroed by SUNRPC. The size of struct nfsd4_compoundargs, in particular, is a lot to clear on each incoming RPC Call. A subsequent patch will cut this down to something closer to what NFSv2 and NFSv3 uses. This patch should cause no behavior changes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26SUNRPC: Optimize svc_process()Chuck Lever
Move exception handling code out of the hot path, and avoid the need for a bswap of a non-constant. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-12Merge tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - Fix SUNRPC call completion races with call_decode() that trigger a WARN_ON() - NFSv4.0 cannot support open-by-filehandle and NFS re-export - Revert "SUNRPC: Remove unreachable error condition" to allow handling of error conditions - Update suid/sgid mode bits after ALLOCATE and DEALLOCATE * tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: Remove unreachable error condition" NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 SUNRPC: Fix call completion races with call_decode()
2022-09-08Revert "SUNRPC: Remove unreachable error condition"Dan Aloni
This reverts commit efe57fd58e1cb77f9186152ee12a8aa4ae3348e0. The assumption that it is impossible to return an ERR pointer from rpc_run_task() no longer holds due to commit 25cf32ad5dba ("SUNRPC: Handle allocation failure in rpc_new_task()"). Fixes: 25cf32ad5dba ('SUNRPC: Handle allocation failure in rpc_new_task()') Fixes: efe57fd58e1c ('SUNRPC: Remove unreachable error condition') Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-09-07freezer,sched: Rewrite core freezer logicPeter Zijlstra
Rewrite the core freezer to behave better wrt thawing and be simpler in general. By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is ensured frozen tasks stay frozen until thawed and don't randomly wake up early, as is currently possible. As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up two PF_flags (yay!). Specifically; the current scheme works a little like: freezer_do_not_count(); schedule(); freezer_count(); And either the task is blocked, or it lands in try_to_freezer() through freezer_count(). Now, when it is blocked, the freezer considers it frozen and continues. However, on thawing, once pm_freezing is cleared, freezer_count() stops working, and any random/spurious wakeup will let a task run before its time. That is, thawing tries to thaw things in explicit order; kernel threads and workqueues before doing bringing SMP back before userspace etc.. However due to the above mentioned races it is entirely possible for userspace tasks to thaw (by accident) before SMP is back. This can be a fatal problem in asymmetric ISA architectures (eg ARMv9) where the userspace task requires a special CPU to run. As said; replace this with a special task state TASK_FROZEN and add the following state transitions: TASK_FREEZABLE -> TASK_FROZEN __TASK_STOPPED -> TASK_FROZEN __TASK_TRACED -> TASK_FROZEN The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL (IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state is already required to deal with spurious wakeups and the freezer causes one such when thawing the task (since the original state is lost). The special __TASK_{STOPPED,TRACED} states *can* be restored since their canonical state is in ->jobctl. With this, frozen tasks need an explicit TASK_FROZEN wakeup and are free of undue (early / spurious) wakeups. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
2022-09-01SUNRPC: Fix call completion races with call_decode()Trond Myklebust
We need to make sure that the req->rq_private_buf is completely up to date before we make req->rq_reply_bytes_recvd visible to the call_decode() routine in order to avoid triggering the WARN_ON(). Reported-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 72691a269f0b ("SUNRPC: Don't reuse bvec on retransmission of the request") Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-08-22Merge tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client fixes from Trond Myklebust: "Stable fixes: - NFS: Fix another fsync() issue after a server reboot Bugfixes: - NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENT - NFS: Fix missing unlock in nfs_unlink() - Add sanity checking of the file type used by __nfs42_ssc_open - Fix a case where we're failing to set task->tk_rpc_status Cleanups: - Remove the NFS_CONTEXT_RESEND_WRITES flag that got obsoleted by the fsync() fix" * tag 'nfs-for-5.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: RPC level errors should set task->tk_rpc_status NFSv4.2 fix problems with __nfs42_ssc_open NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENT NFS: Cleanup to remove unused flag NFS_CONTEXT_RESEND_WRITES NFS: Remove a bogus flag setting in pnfs_write_done_resend_to_mds NFS: Fix another fsync() issue after a server reboot NFS: Fix missing unlock in nfs_unlink()
2022-08-19SUNRPC: RPC level errors should set task->tk_rpc_statusTrond Myklebust
Fix up a case in call_encode() where we're failing to set task->tk_rpc_status when an RPC level error occurred. Fixes: 9c5948c24869 ("SUNRPC: task should be exit if encode return EKEYEXPIRED more times") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-08-12net/sunrpc: fix potential memory leaks in rpc_sysfs_xprt_state_change()Xin Xiong
The issue happens on some error handling paths. When the function fails to grab the object `xprt`, it simply returns 0, forgetting to decrease the reference count of another object `xps`, which is increased by rpc_sysfs_xprt_kobj_get_xprt_switch(), causing refcount leaks. Also, the function forgets to check whether `xps` is valid before using it, which may result in NULL-dereferencing issues. Fix it by adding proper error handling code when either `xprt` or `xps` is NULL. Fixes: 5b7eb78486cd ("SUNRPC: take a xprt offline using sysfs") Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-10Merge tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - pNFS/flexfiles: Fix infinite looping when the RDMA connection errors out Bugfixes: - NFS: fix port value parsing - SUNRPC: Reinitialise the backchannel request buffers before reuse - SUNRPC: fix expiry of auth creds - NFSv4: Fix races in the legacy idmapper upcall - NFS: O_DIRECT fixes from Jeff Layton - NFSv4.1: Fix OP_SEQUENCE error handling - SUNRPC: Fix an RPC/RDMA performance regression - NFS: Fix case insensitive renames - NFSv4/pnfs: Fix a use-after-free bug in open - NFSv4.1: RECLAIM_COMPLETE must handle EACCES Features: - NFSv4.1: session trunking enhancements - NFSv4.2: READ_PLUS performance optimisations - NFS: relax the rules for rsize/wsize mount options - NFS: don't unhash dentry during unlink/rename - SUNRPC: Fail faster on bad verifier - NFS/SUNRPC: Various tracing improvements" * tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits) NFS: Improve readpage/writepage tracing NFS: Improve O_DIRECT tracing NFS: Improve write error tracing NFS: don't unhash dentry during unlink/rename NFSv4/pnfs: Fix a use-after-free bug in open NFS: nfs_async_write_reschedule_io must not recurse into the writeback code SUNRPC: Don't reuse bvec on retransmission of the request SUNRPC: Reinitialise the backchannel request buffers before reuse NFSv4.1: RECLAIM_COMPLETE must handle EACCES NFSv4.1 probe offline transports for trunking on session creation SUNRPC create a function that probes only offline transports SUNRPC export xprt_iter_rewind function SUNRPC restructure rpc_clnt_setup_test_and_add_xprt NFSv4.1 remove xprt from xprt_switch if session trunking test fails SUNRPC create an rpc function that allows xprt removal from rpc_clnt SUNRPC enable back offline transports in trunking discovery SUNRPC create an iterator to list only OFFLINE xprts NFSv4.1 offline trunkable transports on DESTROY_SESSION SUNRPC add function to offline remove trunkable transports SUNRPC expose functions for offline remote xprt functionality ...
2022-08-09Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "Work on 'courteous server', which was introduced in 5.19, continues apace. This release introduces a more flexible limit on the number of NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in courtesy state long after the lease expiration timeout. The client limit is adjusted based on the physical memory size of the server. The NFSD filecache is a cache of files held open by NFSv4 clients or recently touched by NFSv2 or NFSv3 clients. This cache had some significant scalability constraints that have been relieved in this release. Thanks to all who contributed to this work. A data corruption bug found during the most recent NFS bake-a-thon that involves NFSv3 and NFSv4 clients writing the same file has been addressed in this release. This release includes several improvements in CPU scalability for NFSv4 operations. In addition, Neil Brown provided patches that simplify locking during file lookup, creation, rename, and removal that enables subsequent work on making these operations more scalable. We expect to see that work materialize in the next release. There are also numerous single-patch fixes, clean-ups, and the usual improvements in observability" * tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits) lockd: detect and reject lock arguments that overflow NFSD: discard fh_locked flag and fh_lock/fh_unlock NFSD: use (un)lock_inode instead of fh_(un)lock for file operations NFSD: use explicit lock/unlock for directory ops NFSD: reduce locking in nfsd_lookup() NFSD: only call fh_unlock() once in nfsd_link() NFSD: always drop directory lock in nfsd_unlink() NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. NFSD: add posix ACLs to struct nfsd_attrs NFSD: add security label to struct nfsd_attrs NFSD: set attributes when creating symlinks NFSD: introduce struct nfsd_attrs NFSD: verify the opened dentry after setting a delegation NFSD: drop fh argument from alloc_init_deleg NFSD: Move copy offload callback arguments into a separate structure NFSD: Add nfsd4_send_cb_offload() NFSD: Remove kmalloc from nfsd4_do_async_copy() NFSD: Refactor nfsd4_do_copy() NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2) NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2) ...