linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-11-18	selftests: net: add more info to error in bpf_offload	Jakub Kicinski
	bpf_offload caught a spurious warning in TC recently, but the error message did not provide enough information to know what the problem is: FAIL: Found 'netdevsim' in command output, leaky extack? Add the extack to the output: FAIL: Unexpected command output, leaky extack? ('netdevsim', 'Warning: Filter with specified priority/protocol not found.') Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	MAINTAINERS: exclude can core, drivers and DT bindings from netdev ML	Jakub Kicinski
	CAN networking and drivers are maintained by Marc, Oliver and Vincent. Marc sends us already pull requests with reviewed and validated code. Exclude the CAN patch postings from the netdev@ mailing list to lower the patch volume there. Link: https://lore.kernel.org/20241113193709.395c18b0@kernel.org Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://patch.msgid.link/20241115195609.981049-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	net/smc: Run patches also by RDMA ML	Gerd Bayer
	Commits for the SMC protocol usually get carried through the netdev mailing list. Some portions use InfiniBand verbs that are discussed on the RDMA mailing list. So run patches by that list too to increase the likelihood that all interested parties can see them. Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	Merge branch 'mptcp-pm-lockless-list-traversal-and-cleanup'	Jakub Kicinski
	Matthieu Baerts says: ==================== mptcp: pm: lockless list traversal and cleanup Here are two patches improving the MPTCP in-kernel path-manager. - Patch 1: the get and dump endpoints operations are iterating over the endpoints list in a lockless way. - Patch 2: reduce the code duplication to lookup an endpoint. ==================== Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-0-f4a1bcb4ca2c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	mptcp: pm: avoid code duplication to lookup endp	Geliang Tang
	The helper __lookup_addr() can be used in mptcp_pm_nl_get_local_id() and mptcp_pm_nl_is_backup() to simplify the code, and avoid code duplication. Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-2-f4a1bcb4ca2c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	mptcp: pm: lockless list traversal to dump endp	Matthieu Baerts (NGI0)
	To return an endpoint to the userspace via Netlink, and to dump all of them, the endpoint list was iterated while holding the pernet->lock, but only to read the content of the list. In these cases, the spin locks can be replaced by RCU read ones, and use the _rcu variants to iterate over the entries list in a lockless way. Note that the __lookup_addr_by_id() helper has been modified to use the _rcu variants of list_for_each_entry(), but with an extra conditions, so it can be called either while the RCU read lock is held, or when the associated pernet->lock is held. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-1-f4a1bcb4ca2c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	stmmac: dwmac-intel-plat: remove redundant dwmac->data check in probe	Vitalii Mordan
	The driver’s compatibility with devices is confirmed earlier in platform_match(). Since reaching probe means the device is valid, the extra check can be removed to simplify the code. Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	net: txgbe: fix null pointer to pcs	Jiawen Wu
	For 1000BASE-X or SGMII interface mode, the PCS also need to be selected. Only return null pointer when there is a copper NIC with external PHY. Fixes: 02b2a6f91b90 ("net: txgbe: support copper NIC with external PHY") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20241115073508.1130046-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	net: txgbe: remove GPIO interrupt controller	Jiawen Wu
	Since the GPIO interrupt controller is always not working properly, we need to constantly add workaround to cope with hardware deficiencies. So just remove GPIO interrupt controller, and let the SFP driver poll the GPIO status. Fixes: b4a2496c17ed ("net: txgbe: fix GPIO interrupt blocking") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20241115071527.1129458-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	Merge branch 'eth-fbnic-cleanup-and-add-a-few-stats'	Jakub Kicinski
	Jakub Kicinski says: ==================== eth: fbnic: cleanup and add a few stats Cleanup trival problems with fbnic and add the PCIe and RPC (Rx parser) stats. All stats are read under rtnl_lock for now, so the code is pretty trivial. We'll need to add more locking when we start gathering drops used by .ndo_get_stats64. ==================== Link: https://patch.msgid.link/20241115015344.757567-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: add RPC hardware statistics	Sanman Pradhan
	Report Rx parser statistics via ethtool -S. The parser stats are 32b, so we need to add refresh to the service task to make sure we don't miss overflows. Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241115015344.757567-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: add PCIe hardware statistics	Sanman Pradhan
	Add PCIe hardware statistics support to the fbnic driver. These stats provide insight into PCIe transaction performance and error conditions. Which includes, read/write and completion TLP counts and DWORD counts and debug counters for tag, completion credit and NP credit exhaustion The stats are exposed via debugfs and can be used to monitor PCIe performance and debug PCIe issues. Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241115015344.757567-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: add basic debugfs structure	Jakub Kicinski
	Add the usual debugfs structure: fbnic/ $pci-id/ device-fileA device-fileB This patch only adds the directories, subsequent changes will add files. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20241115015344.757567-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: add missing header guards	Jakub Kicinski
	While adding the SPDX headers I noticed we're also missing a header guard. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20241115015344.757567-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: add missing SPDX headers	Jakub Kicinski
	Paolo noticed that we are missing SPDX headers, add them. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20241115015344.757567-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	eth: fbnic: don't disable the PCI device twice	Jakub Kicinski
	We use pcim_enable_device(), there is no need to call pci_disable_device(). Fixes: 546dd90be979 ("eth: fbnic: Add scaffolding for Meta's NIC driver") Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241115014809.754860-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	selftests: net: netlink-dumps: validation checks	Jakub Kicinski
	The sanity checks are going to get silently cast to unsigned and always pass. Cast the sizeof to signed size. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241115003248.733862-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	net/neighbor: clear error in case strict check is not set	Jakub Kicinski
	Commit 51183d233b5a ("net/neighbor: Update neigh_dump_info for strict data checking") added strict checking. The err variable is not cleared, so if we find no table to dump we will return the validation error even if user did not want strict checking. I think the only way to hit this is to send an buggy request, and ask for a table which doesn't exist, so there's no point treating this as a real fix. I only noticed it because a syzbot repro depended on it to trigger another bug. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241115003221.733593-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	rocker: fix link status detection in rocker_carrier_init()	Dmitry Antipov
	Since '1 << rocker_port->pport' may be undefined for port >= 32, cast the left operand to 'unsigned long long' like it's done in 'rocker_port_set_enable()' above. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20241114151946.519047-1-dmantipov@yandex.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	net: wwan: t7xx: Change PM_AUTOSUSPEND_MS to 5000	Jack Wu
	Because optimizing the power consumption of t7XX, change auto suspend time to 5000. The Tests uses a script to loop through the power_state of t7XX. (for example: /sys/bus/pci/devices/0000\:72\:00.0/power_state) * If Auto suspend is 20 seconds, test script show power_state have 0~5% of the time was in D3 state when host don't have data packet transmission. * Changed auto suspend time to 5 seconds, test script show power_state have 50%~80% of the time was in D3 state when host don't have data packet transmission. We tested Fibocom FM350 and our products using the t7xx and they all benefited from this. Signed-off-by: Jack Wu <wojackbb@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Link: https://patch.msgid.link/20241114102002.481081-1-wojackbb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	tools: ynl-gen: allow uapi headers in sub-dirs	Jakub Kicinski
	Binder places its headers under include/uapi/linux/android/ Make sure replace / with _ in the uAPI header guard, the c_upper() is more strict and only converts - to _. This is likely a good constraint to have, to enforce sane naming in enums etc. But paths may include /. Signed-off-by: Li Li <dualli@google.com> Link: https://patch.msgid.link/20241113193239.2113577-2-dualli@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18	Merge tag 'for-linus-6.13-rc1-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - a series for booting as a PVH guest, doing some cleanups after the previous work to make PVH boot code position independent - a fix of the xenbus driver avoiding a leak in an error case * tag 'for-linus-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Fix the issue of resource not being properly released in xenbus_dev_probe() x86/pvh: Avoid absolute symbol references in .head.text x86/xen: Avoid relocatable quantities in Xen ELF notes x86/pvh: Omit needless clearing of phys_base x86/pvh: Use correct size value in GDT descriptor x86/pvh: Call C code via the kernel virtual mapping
2024-11-18	Merge tag 'arm64-upstream' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for running Linux in a protected VM under the Arm Confidential Compute Architecture (CCA) - Guarded Control Stack user-space support. Current patches follow the x86 ABI of implicitly creating a shadow stack on clone(). Subsequent patches (already on the list) will add support for clone3() allowing finer-grained control of the shadow stack size and placement from libc - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are getting close with the upcoming dpISA support) - Other arch features: - In-kernel use of the memcpy instructions, FEAT_MOPS (previously only exposed to user; uaccess support not merged yet) - MTE: hugetlbfs support and the corresponding kselftests - Optimise CRC32 using the PMULL instructions - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG - Optimise the kernel TLB flushing to use the range operations - POE/pkey (permission overlays): further cleanups after bringing the signal handler in line with the x86 behaviour for 6.12 - arm64 perf updates: - Support for the NXP i.MX91 PMU in the existing IMX driver - Support for Ampere SoCs in the Designware PCIe PMU driver - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC - Support for Samsung's 'Mongoose' CPU PMU - Support for PMUv3.9 finer-grained userspace counter access control - Switch back to platform_driver::remove() now that it returns 'void' - Add some missing events for the CXL PMU driver - Miscellaneous arm64 fixes/cleanups: - Page table accessors cleanup: type updates, drop unused macros, reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity check addresses before runtime P4D/PUD folding - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the FEAT_ECV for the generic timers) allowing Linux to boot with firmware deployments that don't set SCTLR_EL3.ECVEn - ACPI/arm64: tighten the check for the array of platform timer structures and adjust the error handling procedure in gtdt_parse_timer_block() - Optimise the cache flush for the uprobes xol slot (skip if no change) and other uprobes/kprobes cleanups - Fix the context switching of tpidrro_el0 when kpti is enabled - Dynamic shadow call stack fixes - Sysreg updates - Various arm64 kselftest improvements * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits) arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled kselftest/arm64: Try harder to generate different keys during PAC tests kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() arm64/ptrace: Clarify documentation of VL configuration via ptrace kselftest/arm64: Corrupt P0 in the irritator when testing SSVE acpi/arm64: remove unnecessary cast arm64/mm: Change protval as 'pteval_t' in map_range() kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c kselftest/arm64: Add FPMR coverage to fp-ptrace kselftest/arm64: Expand the set of ZA writes fp-ptrace does kselftets/arm64: Use flag bits for features in fp-ptrace assembler code kselftest/arm64: Enable build of PAC tests with LLVM=1 kselftest/arm64: Check that SVCR is 0 in signal handlers selftests/mm: Fix unused function warning for aarch64_write_signal_pkey() kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests kselftest/arm64: Fix build with stricter assemblers arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux() arm64/scs: Deal with 64-bit relative offsets in FDE frames ...
2024-11-18	Merge tag 'm68k-for-v6.13-tag1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Revive SCSI and early console support on MVME147 - Fix early kernel parameters using static keys - Prevent and improve handling of kernel configurations that lack specific platform, CPU, or MMU support, to avoid build failures - Miscellaneous fixes and improvements - Defconfig updates * tag 'm68k-for-v6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: defconfig: Update defconfigs for v6.12-rc1 m68k: mvme147: Reinstate early console m68k: Make sure NR_IRQS is never zero m68k: Select M68020 as fallback for classic m68k: Move Sun 3 into a top-level platform option m68k: kernel: Use str_read_write() helper function m68k: Initialize jump labels early during setup_arch() m68k: mvme147: Fix SCSI controller IRQ numbers m68k: mvme147: Make mvme147_sched_init() __init
2024-11-18	Merge tag 'mips_6.13' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: "Just cleanups and fixes" * tag 'mips_6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: dts: realtek: Add I2C controllers mips: dts: realtek: Add syscon-reboot node MIPS: loongson3_defconfig: Enable blk_dev_nvme by default MIPS: loongson3_defconfig: Update configs dependencies MAINTAINERS: Remove linux-mips.org references MAINTAINERS: Retire Ralf Baechle TC: Fix the wrong format specifier MIPS: kernel: proc: Use str_yes_no() helper function MIPS: mobileye: eyeq6h-epm6: Use eyeq6h in the board device tree mips: bmips: bcm6358/6368: define required brcm,bmips-cbr-reg MIPS: Allow using more than 32-bit addresses for reset vectors when possible mips: asm: fix warning when disabling MIPS_FP_SUPPORT mips: sgi-ip22: Replace "s[n]?printf" with sysfs_emit in sysfs callbacks
2024-11-18	Merge tag 's390-6.13-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add firmware sysfs interface which allows user space to retrieve the dump area size of the machine - Add 'measurement_chars_full' CHPID sysfs attribute to make the complete associated Channel-Measurements Characteristics Block available - Add virtio-mem support - Move gmap aka KVM page fault handling from the main fault handler to KVM code. This is the first step to make s390 KVM page fault handling similar to other architectures. With this first step the main fault handler does not have any special handling anymore, and therefore convert it to support LOCK_MM_AND_FIND_VMA - With gcc 14 s390 support for flag output operand support for inline assemblies was added. This allows for several optimizations: - Provide a cmpxchg inline assembly which makes use of this, and provide all variants of arch_try_cmpxchg() so that the compiler can generate slightly better code - Convert a few cmpxchg() loops to try_cmpxchg() loops - Similar to x86 add a CC_OUT() helper macro (and other macros), and convert all inline assemblies to make use of them, so that depending on compiler version better code can be generated - List installed host-key hashes in sysfs if the machine supports the Query Ultravisor Keys UVC - Add 'Retrieve Secret' ioctl which allows user space in protected execution guests to retrieve previously stored secrets from the Ultravisor - Add pkey-uv module which supports the conversion of Ultravisor retrievable secrets to protected keys - Extend the existing paes cipher to exploit the full AES-XTS hardware acceleration introduced with message-security assist extension 10 - Convert hopefully all sysfs show functions to use sysfs_emit() so that the constant flow of such patches stop - For PCI devices make use of the newly added Topology ID attribute to enable whole card multi-function support despite the change to PCHID per port. Additionally improve the overall robustness and usability of the multifunction support - Various other small improvements, fixes, and cleanups * tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (133 commits) s390/cio/ioasm: Convert to use flag output macros s390/cio/qdio: Convert to use flag output macros s390/sclp: Convert to use flag output macros s390/dasd: Convert to use flag output macros s390/boot/physmem: Convert to use flag output macros s390/pci: Convert to use flag output macros s390/kvm: Convert to use flag output macros s390/extmem: Convert to use flag output macros s390/string: Convert to use flag output macros s390/diag: Convert to use flag output macros s390/irq: Convert to use flag output macros s390/smp: Convert to use flag output macros s390/uv: Convert to use flag output macros s390/pai: Convert to use flag output macros s390/mm: Convert to use flag output macros s390/cpu_mf: Convert to use flag output macros s390/cpcmd: Convert to use flag output macros s390/topology: Convert to use flag output macros s390/time: Convert to use flag output macros s390/pageattr: Convert to use flag output macros ...
2024-11-18	Merge tag 'lsm-pr-20241112' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: "Thirteen patches, all focused on moving away from the current 'secid' LSM identifier to a richer 'lsm_prop' structure. This move will help reduce the translation that is necessary in many LSMs, offering better performance, and make it easier to support different LSMs in the future" * tag 'lsm-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: remove lsm_prop scaffolding netlabel,smack: use lsm_prop for audit data audit: change context data from secid to lsm_prop lsm: create new security_cred_getlsmprop LSM hook audit: use an lsm_prop in audit_names lsm: use lsm_prop in security_inode_getsecid lsm: use lsm_prop in security_current_getsecid audit: update shutdown LSM data lsm: use lsm_prop in security_ipc_getsecid audit: maintain an lsm_prop in audit_context lsm: add lsmprop_to_secctx hook lsm: use lsm_prop in security_audit_rule_match lsm: add the lsm_prop data structure
2024-11-18	Merge tag 'selinux-pr-20241112' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add support for netlink xperms Some time ago we added the concept of "xperms" to the SELinux policy so that we could write policy for individual ioctls, this builds upon this by using extending xperms to netlink so that we can write SELinux policy for individual netlnk message types and not rely on the fairly coarse read/write mapping tables we currently have. There are limitations involving generic netlink due to the multiplexing that is done, but it's no worse that what we currently have. As usual, more information can be found in the commit message. - Deprecate /sys/fs/selinux/user We removed the only known userspace use of this back in 2020 and now that several years have elapsed we're starting down the path of deprecating it in the kernel. - Cleanup the build under scripts/selinux A couple of patches to move the genheaders tool under security/selinux and correct our usage of kernel headers in the tools located under scripts/selinux. While these changes originated out of an effort to build Linux on different systems, they are arguably the right thing to do regardless. - Minor code cleanups and style fixes Not much to say here, two minor cleanup patches that came out of the netlink xperms work * tag 'selinux-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: Deprecate /sys/fs/selinux/user selinux: apply clang format to security/selinux/nlmsgtab.c selinux: streamline selinux_nlmsg_lookup() selinux: Add netlink xperm support selinux: move genheaders to security/selinux/ selinux: do not include <linux/*.h> headers from host programs
2024-11-18	Merge tag 'audit-pr-20241112' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "The audit patches are minimal this time around with one patch to correct some kdoc function parameters and one to leverage the `str_yes_no()` function; nothing very exciting" * tag 'audit-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: Use str_yes_no() helper function audit: Reorganize kerneldoc parameter names
2024-11-18	nfsd: allow for up to 32 callback session slots	Jeff Layton
	nfsd currently only uses a single slot in the callback channel, which is proving to be a bottleneck in some cases. Widen the callback channel to a max of 32 slots (subject to the client's target_maxreqs value). Change the cb_holds_slot boolean to an integer that tracks the current slot number (with -1 meaning "unassigned"). Move the callback slot tracking info into the session. Add a new u32 that acts as a bitmap to track which slots are in use, and a u32 to track the latest callback target_slotid that the client reports. To protect the new fields, add a new per-session spinlock (the se_lock). Fix nfsd41_cb_get_slot to always search for the lowest slotid (using ffs()). Finally, convert the session->se_cb_seq_nr field into an array of ints and add the necessary handling to ensure that the seqids get reset when the slot table grows after shrinking. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfs_common: must not hold RCU while calling nfsd_file_put_local	Mike Snitzer
	Move holding the RCU from nfs_to_nfsd_file_put_local to nfs_to_nfsd_net_put. It is the call to nfs_to->nfsd_serv_put that requires the RCU anyway (the puts for nfsd_file and netns were combined to avoid an extra indirect reference but that micro-optimization isn't possible now). This fixes xfstests generic/013 and it triggering: "Voluntary context switch within RCU read-side critical section!" [ 143.545738] Call Trace: [ 143.546206] <TASK> [ 143.546625] ? show_regs+0x6d/0x80 [ 143.547267] ? __warn+0x91/0x140 [ 143.547951] ? rcu_note_context_switch+0x496/0x5d0 [ 143.548856] ? report_bug+0x193/0x1a0 [ 143.549557] ? handle_bug+0x63/0xa0 [ 143.550214] ? exc_invalid_op+0x1d/0x80 [ 143.550938] ? asm_exc_invalid_op+0x1f/0x30 [ 143.551736] ? rcu_note_context_switch+0x496/0x5d0 [ 143.552634] ? wakeup_preempt+0x62/0x70 [ 143.553358] __schedule+0xaa/0x1380 [ 143.554025] ? _raw_spin_unlock_irqrestore+0x12/0x40 [ 143.554958] ? try_to_wake_up+0x1fe/0x6b0 [ 143.555715] ? wake_up_process+0x19/0x20 [ 143.556452] schedule+0x2e/0x120 [ 143.557066] schedule_preempt_disabled+0x19/0x30 [ 143.557933] rwsem_down_read_slowpath+0x24d/0x4a0 [ 143.558818] ? xfs_efi_item_format+0x50/0xc0 [xfs] [ 143.559894] down_read+0x4e/0xb0 [ 143.560519] xlog_cil_commit+0x1b2/0xbc0 [xfs] [ 143.561460] ? _raw_spin_unlock+0x12/0x30 [ 143.562212] ? xfs_inode_item_precommit+0xc7/0x220 [xfs] [ 143.563309] ? xfs_trans_run_precommits+0x69/0xd0 [xfs] [ 143.564394] __xfs_trans_commit+0xb5/0x330 [xfs] [ 143.565367] xfs_trans_roll+0x48/0xc0 [xfs] [ 143.566262] xfs_defer_trans_roll+0x57/0x100 [xfs] [ 143.567278] xfs_defer_finish_noroll+0x27a/0x490 [xfs] [ 143.568342] xfs_defer_finish+0x1a/0x80 [xfs] [ 143.569267] xfs_bunmapi_range+0x4d/0xb0 [xfs] [ 143.570208] xfs_itruncate_extents_flags+0x13d/0x230 [xfs] [ 143.571353] xfs_free_eofblocks+0x12e/0x190 [xfs] [ 143.572359] xfs_file_release+0x12d/0x140 [xfs] [ 143.573324] __fput+0xe8/0x2d0 [ 143.573922] __fput_sync+0x1d/0x30 [ 143.574574] nfsd_filp_close+0x33/0x60 [nfsd] [ 143.575430] nfsd_file_free+0x96/0x150 [nfsd] [ 143.576274] nfsd_file_put+0xf7/0x1a0 [nfsd] [ 143.577104] nfsd_file_put_local+0x18/0x30 [nfsd] [ 143.578070] nfs_close_local_fh+0x101/0x110 [nfs_localio] [ 143.579079] __put_nfs_open_context+0xc9/0x180 [nfs] [ 143.580031] nfs_file_clear_open_context+0x4a/0x60 [nfs] [ 143.581038] nfs_file_release+0x3e/0x60 [nfs] [ 143.581879] __fput+0xe8/0x2d0 [ 143.582464] __fput_sync+0x1d/0x30 [ 143.583108] __x64_sys_close+0x41/0x80 [ 143.583823] x64_sys_call+0x189a/0x20d0 [ 143.584552] do_syscall_64+0x64/0x170 [ 143.585240] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 143.586185] RIP: 0033:0x7f3c5153efd7 Fixes: 65f2a5c36635 ("nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put()") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: get rid of include ../internal.h	Al Viro
	added back in 2015 for the sake of vfs_clone_file_range(), which is in linux/fs.h these days Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur	Yang Erkun
	The action force umount(umount -f) will attempt to kill all rpc_task even umount operation may ultimately fail if some files remain open. Consequently, if an action attempts to open a file, it can potentially send two rpc_task to nfs server. NFS CLIENT thread1 thread2 open("file") ... nfs4_do_open _nfs4_do_open _nfs4_open_and_get_state _nfs4_proc_open nfs4_run_open_task /* rpc_task1 / rpc_run_task rpc_wait_for_completion_task umount -f nfs_umount_begin rpc_killall_tasks rpc_signal_task rpc_task1 been wakeup and return -512 _nfs4_do_open // while loop ... nfs4_run_open_task / rpc_task2 */ rpc_run_task rpc_wait_for_completion_task While processing an open request, nfsd will first attempt to find or allocate an nfs4_openowner. If it finds an nfs4_openowner that is not marked as NFS4_OO_CONFIRMED, this nfs4_openowner will released. Since two rpc_task can attempt to open the same file simultaneously from the client to server, and because two instances of nfsd can run concurrently, this situation can lead to lots of memory leak. Additionally, when we echo 0 to /proc/fs/nfsd/threads, warning will be triggered. NFS SERVER nfsd1 nfsd2 echo 0 > /proc/fs/nfsd/threads nfsd4_open nfsd4_process_open1 find_or_alloc_open_stateowner // alloc oo1, stateid1 nfsd4_open nfsd4_process_open1 find_or_alloc_open_stateowner // find oo1, without NFS4_OO_CONFIRMED release_openowner unhash_openowner_locked list_del_init(&oo->oo_perclient) // cannot find this oo // from client, LEAK!!! alloc_stateowner // alloc oo2 nfsd4_process_open2 init_open_stateid // associate oo1 // with stateid1, stateid1 LEAK!!! nfs4_get_vfs_file // alloc nfsd_file1 and nfsd_file_mark1 // all LEAK!!! nfsd4_process_open2 ... write_threads ... nfsd_destroy_serv nfsd_shutdown_net nfs4_state_shutdown_net nfs4_state_destroy_net destroy_client __destroy_client // won't find oo1!!! nfsd_shutdown_generic nfsd_file_cache_shutdown kmem_cache_destroy for nfsd_file_slab and nfsd_file_mark_slab // bark since nfsd_file1 // and nfsd_file_mark1 // still alive ======================================================================= BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on __kmem_cache_shutdown() ----------------------------------------------------------------------- Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28 flags=0x17ffffc0000240(workingset\|head\|node=0\|zone=2\|lastcpupid=0x1fffff) CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x53/0x70 slab_err+0xb0/0xf0 __kmem_cache_shutdown+0x15c/0x310 kmem_cache_destroy+0x66/0x160 nfsd_file_cache_shutdown+0xac/0x210 [nfsd] nfsd_destroy_serv+0x251/0x2a0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1ae/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Disabling lock debugging due to kernel taint Object 0xff11000110e2ac38 @offset=3128 Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3 pid=800 nfsd_file_do_acquire+0x20f/0xa30 [nfsd] nfsd_file_acquire_opened+0x5f/0x90 [nfsd] nfs4_get_vfs_file+0x4c9/0x570 [nfsd] nfsd4_process_open2+0x713/0x1070 [nfsd] nfsd4_open+0x74b/0x8b0 [nfsd] nfsd4_proc_compound+0x70b/0xc20 [nfsd] nfsd_dispatch+0x1b4/0x3a0 [nfsd] svc_process_common+0x5b8/0xc50 [sunrpc] svc_process+0x2ab/0x3b0 [sunrpc] svc_handle_xprt+0x681/0xa20 [sunrpc] nfsd+0x183/0x220 [nfsd] kthread+0x199/0x1e0 ret_from_fork+0x31/0x60 ret_from_fork_asm+0x1a/0x30 Add nfs4_openowner_unhashed to help found unhashed nfs4_openowner, and break nfsd4_open process to fix this problem. Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Add nfsd4_copy time-to-live	Chuck Lever
	Keep async copy state alive for a few lease cycles after the copy completes so that OFFLOAD_STATUS returns something meaningful. This means that NFSD's client shutdown processing needs to purge any of this state that happens to be waiting to die. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Add a laundromat reaper for async copy state	Chuck Lever
	RFC 7862 Section 4.8 states: > A copy offload stateid will be valid until either (A) the client > or server restarts or (B) the client returns the resource by > issuing an OFFLOAD_CANCEL operation or the client replies to a > CB_OFFLOAD operation. Instead of releasing async copy state when the CB_OFFLOAD callback completes, now let it live until the next laundromat run after the callback completes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Block DESTROY_CLIENTID only when there are ongoing async COPY operations	Chuck Lever
	Currently __destroy_client() consults the nfs4_client's async_copies list to determine whether there are ongoing async COPY operations. However, NFSD now keeps copy state in that list even when the async copy has completed, to enable OFFLOAD_STATUS to find the COPY results for a while after the COPY has completed. DESTROY_CLIENTID should not be blocked if the client's async_copies list contains state for only completed copy operations. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD	Chuck Lever
	RFC 7862 permits callback services to respond to CB_OFFLOAD with NFS4ERR_DELAY. Currently NFSD drops the CB_OFFLOAD in that case. To improve the reliability of COPY offload, NFSD should rather send another CB_OFFLOAD completion notification. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Free async copy information in nfsd4_cb_offload_release()	Chuck Lever
	RFC 7862 Section 4.8 states: > A copy offload stateid will be valid until either (A) the client > or server restarts or (B) the client returns the resource by > issuing an OFFLOAD_CANCEL operation or the client replies to a > CB_OFFLOAD operation. Currently, NFSD purges the metadata for an async COPY operation as soon as the CB_OFFLOAD callback has been sent. It does not wait even for the client's CB_OFFLOAD response, as the paragraph above suggests that it should. This makes the OFFLOAD_STATUS operation ineffective during the window between the completion of an asynchronous COPY and the server's receipt of the corresponding CB_OFFLOAD response. This is important if, for example, the client responds with NFS4ERR_DELAY, or the transport is lost before the server receives the response. A client might use OFFLOAD_STATUS to query the server about the still pending asynchronous COPY, but NFSD will respond to OFFLOAD_STATUS as if it had never heard of the presented copy stateid. This patch starts to address this issue by extending the lifetime of struct nfsd4_copy at least until the server has seen the client's CB_OFFLOAD response, or the CB_OFFLOAD has timed out. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Fix nfsd4_shutdown_copy()	Chuck Lever
	nfsd4_shutdown_copy() is just this: while ((copy = nfsd4_get_copy(clp)) != NULL) nfsd4_stop_copy(copy); nfsd4_get_copy() bumps @copy's reference count, preventing nfsd4_stop_copy() from releasing @copy. A while loop like this usually works by removing the first element of the list, but neither nfsd4_get_copy() nor nfsd4_stop_copy() alters the async_copies list. Best I can tell, then, is that nfsd4_shutdown_copy() continues to loop until other threads manage to remove all the items from this list. The spinning loop blocks shutdown until these items are gone. Possibly the reason we haven't seen this issue in the field is because client_has_state() prevents __destroy_client() from calling nfsd4_shutdown_copy() if there are any items on this list. In a subsequent patch I plan to remove that restriction. Fixes: e0639dc5805a ("NFSD introduce async copy feature") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	NFSD: Add a tracepoint to record canceled async COPY operations	Chuck Lever
	Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: make nfsd4_session->se_flags a bool	Jeff Layton
	While this holds the flags from the CREATE_SESSION request, nothing ever consults them. The only flag used is NFS4_SESSION_DEAD. Make it a simple bool instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: remove nfsd4_session->se_bchannel	Jeff Layton
	This field is written and is never consulted again. Remove it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: make use of warning provided by refcount_t	NeilBrown
	refcount_t, by design, checks for unwanted situations and provides warnings. It is rarely useful to have explicit warnings with refcount usage. In this case we have an explicit warning if a refcount_t reaches zero when decremented. Simply using refcount_dec() will provide a similar warning and also mark the refcount_t as saturated to avoid any possible use-after-free. This patch drops the warning and uses refcount_dec() instead of refcount_dec_and_test(). Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: Don't fail OP_SETCLIENTID when there are too many clients.	NeilBrown
	Failing OP_SETCLIENTID or OP_EXCHANGE_ID should only happen if there is memory allocation failure. Putting a hard limit on the number of clients is not really helpful as it will either happen too early and prevent clients that the server can easily handle, or too late and allow clients when the server is swamped. The calculated limit is still useful for expiring courtesy clients where there are "too many" clients, but it shouldn't prevent the creation of active clients. Testing of lots of clients against small-mem servers reports repeated NFS4ERR_DELAY responses which doesn't seem helpful. There may have been reports of similar problems in production use. Also remove an outdated comment - we do use a slab cache. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	svcrdma: fix miss destroy percpu_counter in svc_rdma_proc_init()	Ye Bin
	There's issue as follows: RPC: Registered rdma transport module. RPC: Registered rdma backchannel transport module. RPC: Unregistered rdma transport module. RPC: Unregistered rdma backchannel transport module. BUG: unable to handle page fault for address: fffffbfff80c609a PGD 123fee067 P4D 123fee067 PUD 123fea067 PMD 10c624067 PTE 0 Oops: Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI RIP: 0010:percpu_counter_destroy_many+0xf7/0x2a0 Call Trace: <TASK> __die+0x1f/0x70 page_fault_oops+0x2cd/0x860 spurious_kernel_fault+0x36/0x450 do_kern_addr_fault+0xca/0x100 exc_page_fault+0x128/0x150 asm_exc_page_fault+0x26/0x30 percpu_counter_destroy_many+0xf7/0x2a0 mmdrop+0x209/0x350 finish_task_switch.isra.0+0x481/0x840 schedule_tail+0xe/0xd0 ret_from_fork+0x23/0x80 ret_from_fork_asm+0x1a/0x30 </TASK> If register_sysctl() return NULL, then svc_rdma_proc_cleanup() will not destroy the percpu counters which init in svc_rdma_proc_init(). If CONFIG_HOTPLUG_CPU is enabled, residual nodes may be in the 'percpu_counters' list. The above issue may occur once the module is removed. If the CONFIG_HOTPLUG_CPU configuration is not enabled, memory leakage occurs. To solve above issue just destroy all percpu counters when register_sysctl() return NULL. Fixes: 1e7e55731628 ("svcrdma: Restore read and write stats") Fixes: 22df5a22462e ("svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter") Fixes: df971cd853c0 ("svcrdma: Convert rdma_stat_recv to a per-CPU counter") Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	xdrgen: Remove program_stat_to_errno() call sites	Chuck Lever
	Refactor: Translating an on-the-wire value to a local host errno is architecturally a job for the proc function, not the XDR decoder. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	xdrgen: Update the files included in client-side source code	Chuck Lever
	In particular, client-side source code needs the definition of "struct rpc_procinfo" and does not want header files that pull in "struct svc_rqst". Otherwise, the source does not compile. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	xdrgen: Remove check for "nfs_ok" in C templates	Chuck Lever
	Obviously, "nfs_ok" is defined only for NFS protocols. Other XDR protocols won't know "nfs_ok" from Adam. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	xdrgen: Remove tracepoint call site	Chuck Lever
	This tracepoint was a "note to self" and is not operational. It is added only to client-side code, which so far we haven't needed. It will cause immediate breakage once we start generating client code, though, so remove it now. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-18	nfsd: release svc_expkey/svc_export with rcu_work	Yang Erkun
	The last reference for `cache_head` can be reduced to zero in `c_show` and `e_show`(using `rcu_read_lock` and `rcu_read_unlock`). Consequently, `svc_export_put` and `expkey_put` will be invoked, leading to two issues: 1. The `svc_export_put` will directly free ex_uuid. However, `e_show`/`c_show` will access `ex_uuid` after `cache_put`, which can trigger a use-after-free issue, shown below. ================================================================== BUG: KASAN: slab-use-after-free in svc_export_show+0x362/0x430 [nfsd] Read of size 1 at addr ff11000010fdc120 by task cat/870 CPU: 1 UID: 0 PID: 870 Comm: cat Not tainted 6.12.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x53/0x70 print_address_description.constprop.0+0x2c/0x3a0 print_report+0xb9/0x280 kasan_report+0xae/0xe0 svc_export_show+0x362/0x430 [nfsd] c_show+0x161/0x390 [sunrpc] seq_read_iter+0x589/0x770 seq_read+0x1e5/0x270 proc_reg_read+0xe1/0x140 vfs_read+0x125/0x530 ksys_read+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Allocated by task 830: kasan_save_stack+0x20/0x40 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __kmalloc_node_track_caller_noprof+0x1bc/0x400 kmemdup_noprof+0x22/0x50 svc_export_parse+0x8a9/0xb80 [nfsd] cache_do_downcall+0x71/0xa0 [sunrpc] cache_write_procfs+0x8e/0xd0 [sunrpc] proc_reg_write+0xe1/0x140 vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 868: kasan_save_stack+0x20/0x40 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x37/0x50 kfree+0xf3/0x3e0 svc_export_put+0x87/0xb0 [nfsd] cache_purge+0x17f/0x1f0 [sunrpc] nfsd_destroy_serv+0x226/0x2d0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e 2. We cannot sleep while using `rcu_read_lock`/`rcu_read_unlock`. However, `svc_export_put`/`expkey_put` will call path_put, which subsequently triggers a sleeping operation due to the following `dput`. ============================= WARNING: suspicious RCU usage 5.10.0-dirty #141 Not tainted ----------------------------- ... Call Trace: dump_stack+0x9a/0xd0 ___might_sleep+0x231/0x240 dput+0x39/0x600 path_put+0x1b/0x30 svc_export_put+0x17/0x80 e_show+0x1c9/0x200 seq_read_iter+0x63f/0x7c0 seq_read+0x226/0x2d0 vfs_read+0x113/0x2c0 ksys_read+0xc9/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x67/0xd1 Fix these issues by using `rcu_work` to help release `svc_expkey`/`svc_export`. This approach allows for an asynchronous context to invoke `path_put` and also facilitates the freeing of `uuid/exp/key` after an RCU grace period. Fixes: 9ceddd9da134 ("knfsd: Allow lockless lookups of the exports") Signed-off-by: Yang Erkun <yangerkun@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>