summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
15 hoursMerge tag 'sysctl-6.17-rc1' of ↵HEADmasterLinus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move sysctls out of the kern_table array This is the final move of ctl_tables into their respective subsystems. Only 5 (out of the original 50) will remain in kernel/sysctl.c file; these handle either sysctl or common arch variables. By decentralizing sysctl registrations, subsystem maintainers regain control over their sysctl interfaces, improving maintainability and reducing the likelihood of merge conflicts. - docs: Remove false positives from check-sysctl-docs Stopped falsely identifying sysctls as undocumented or unimplemented in the check-sysctl-docs script. This script can now be used to automatically identify if documentation is missing. * tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (23 commits) docs: Downgrade arm64 & riscv from titles to comment docs: Replace spaces with tabs in check-sysctl-docs docs: Remove colon from ctltable title in vm.rst docs: Add awk section for ucount sysctl entries docs: Use skiplist when checking sysctl admin-guide docs: nixify check-sysctl-docs sysctl: rename kern_table -> sysctl_subsys_table kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.c uevent: mv uevent_helper into kobject_uevent.c sysctl: Removed unused variable sysctl: Nixify sysctl.sh sysctl: Remove superfluous includes from kernel/sysctl.c sysctl: Remove (very) old file changelog sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.c sysctl: move cad_pid into kernel/pid.c sysctl: Move tainted ctl_table into kernel/panic.c Input: sysrq: mv sysrq into drivers/tty/sysrq.c fork: mv threads-max into kernel/fork.c parisc/power: Move soft-power into power.c mm: move randomize_va_space into memory.c ...
17 hoursMerge tag 's390-6.17-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Standardize on the __ASSEMBLER__ macro that is provided by GCC and Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__ in both uapi and non-uapi headers - Explicitly include <linux/export.h> in architecture and driver files which contain an EXPORT_SYMBOL() and remove the include from the files which do not contain the EXPORT_SYMBOL() - Use the full title of "z/Architecture Principles of Operation" manual and the name of a section where facility bits are listed - Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid unnecessary slowing down of the build and confusing external kABI tools that process symtypes data - Print additional unrecoverable machine check information to make the root cause analysis easier - Move cmpxchg_user_key() handling to uaccess library code, since the generated code is large anyway and there is no benefit if it is inlined - Fix a problem when cmpxchg_user_key() is executing a code with a non-default key: if a system is IPL-ed with "LOAD NORMAL", and the previous system used storage keys where the fetch-protection bit was set for some pages, and the cmpxchg_user_key() is located within such page, a protection exception happens - Either the external call or emergency signal order is used to send an IPI to a remote CPU. Use the external order only, since it is at least as good and sometimes even better, than the emergency signal - In case of an early crash the early program check handler prints more or less random value of the last breaking event address, since it is not initialized properly. Copy the last breaking event address from the lowcore to pt_regs to address this - During STP synchronization check udelay() can not be used, since the first CPU modifies tod_clock_base and get_tod_clock_monotonic() might return a non-monotonic time. Instead, busy-loop on other CPUs, while the the first CPU actually handles the synchronization operation - When debugging the early kernel boot using QEMU with the -S flag and GDB attached, skip the decompressor and start directly in kernel - Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture Principles of Operation" manual - Remove the in-kernel time steering support in favour of the new s390 PTP driver, which allows the kernel clock steered more precisely - Remove a possible false-positive warning in pte_free_defer(), which could be triggered in a valid case KVM guest process is initializing * tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits) s390/mm: Remove possible false-positive warning in pte_free_defer() s390/stp: Default to enabled s390/stp: Remove leap second support s390/time: Remove in-kernel time steering s390/sclp: Use monotonic clock in sclp_sync_wait() s390/smp: Use monotonic clock in smp_emergency_stop() s390/time: Use monotonic clock in get_cycles() s390/pai_crypto: Rename PAI Crypto event 4210 scripts/gdb/symbols: make lx-symbols skip the s390 decompressor s390/boot: Introduce jump_to_kernel() function s390/stp: Remove udelay from stp_sync_clock() s390/early: Copy last breaking event address to pt_regs s390/smp: Remove conditional emergency signal order code usage s390/uaccess: Merge cmpxchg_user_key() inline assemblies s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions s390/uaccess: Initialize code pages executed with non-default access key s390/skey: Provide infrastructure for executing with non-default access key s390/uaccess: Make cmpxchg_user_key() library code s390/page: Add memory clobber to page_set_storage_key() s390/page: Cleanup page_set_storage_key() inline assemblies ...
19 hoursMerge tag 'sched-core-2025-07-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler changes: - Better tracking of maximum lag of tasks in presence of different slices duration, for better handling of lag in the fair scheduler (Vincent Guittot) - Clean up and standardize #if/#else/#endif markers throughout the entire scheduler code base (Ingo Molnar) - Make SMP unconditional: build the SMP scheduler's data structures and logic on UP kernel too, even though they are not used, to simplify the scheduler and remove around 200 #ifdef/[#else]/#endif blocks from the scheduler (Ingo Molnar) - Reorganize cgroup bandwidth control interface handling for better interfacing with sched_ext (Tejun Heo) Balancing: - Bump sd->max_newidle_lb_cost when newidle balance fails (Chris Mason) - Remove sched_domain_topology_level::flags to simplify the code (Prateek Nayak) - Simplify and clean up build_sched_topology() (Li Chen) - Optimize build_sched_topology() on large machines (Li Chen) Real-time scheduling: - Add initial version of proxy execution: a mechanism for mutex-owning tasks to inherit the scheduling context of higher priority waiters. Currently limited to a single runqueue and conditional on CONFIG_EXPERT, and other limitations (John Stultz, Peter Zijlstra, Valentin Schneider) - Deadline scheduler (Juri Lelli): - Fix dl_servers initialization order (Juri Lelli) - Fix DL scheduler's root domain reinitialization logic (Juri Lelli) - Fix accounting bugs after global limits change (Juri Lelli) - Fix scalability regression by implementing less agressive dl_server handling (Peter Zijlstra) PSI: - Improve scalability by optimizing psi_group_change() cpu_clock() usage (Peter Zijlstra) Rust changes: - Make Task, CondVar and PollCondVar methods inline to avoid unnecessary function calls (Kunwu Chan, Panagiotis Foliadis) - Add might_sleep() support for Rust code: Rust's "#[track_caller]" mechanism is used so that Rust's might_sleep() doesn't need to be defined as a macro (Fujita Tomonori) - Introduce file_from_location() (Boqun Feng) Debugging & instrumentation: - Make clangd usable with scheduler source code files again (Peter Zijlstra) - tools: Add root_domains_dump.py which dumps root domains info (Juri Lelli) - tools: Add dl_bw_dump.py for printing bandwidth accounting info (Juri Lelli) Misc cleanups & fixes: - Remove play_idle() (Feng Lee) - Fix check_preemption_disabled() (Sebastian Andrzej Siewior) - Do not call __put_task_struct() on RT if pi_blocked_on is set (Luis Claudio R. Goncalves) - Correct the comment in place_entity() (wang wei)" * tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits) sched/idle: Remove play_idle() sched: Do not call __put_task_struct() on rt if pi_blocked_on is set sched: Start blocked_on chain processing in find_proxy_task() sched: Fix proxy/current (push,pull)ability sched: Add an initial sketch of the find_proxy_task() function sched: Fix runtime accounting w/ split exec & sched contexts sched: Move update_curr_task logic into update_curr_se locking/mutex: Add p->blocked_on wrappers for correctness checks locking/mutex: Rework task_struct::blocked_on sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable sched/topology: Remove sched_domain_topology_level::flags x86/smpboot: avoid SMT domain attach/destroy if SMT is not enabled x86/smpboot: moves x86_topology to static initialize and truncate x86/smpboot: remove redundant CONFIG_SCHED_SMT smpboot: introduce SDTL_INIT() helper to tidy sched topology setup tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info tools/sched: Add root_domains_dump.py which dumps root domains info sched/deadline: Fix accounting after global limits change sched/deadline: Reset extra_bw to max_bw when clearing root domains sched/deadline: Initialize dl_servers after SMP ...
21 hoursMerge tag 'ratelimit.2025.07.23a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull ratelimit test updates from Paul McKenney: "Add functional and stress tests: - Add trivial kunit test for ratelimit - Make the ratelimit test more reliable (Petr Mladek) - Add stress test for ratelimit" * tag 'ratelimit.2025.07.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: lib: Add stress test for ratelimit lib: Make the ratelimit test more reliable lib: Add trivial kunit test for ratelimit
23 hoursMerge tag 'timers-ptp-2025-07-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timekeeping and VDSO updates from Thomas Gleixner: - Introduce support for auxiliary timekeepers PTP clocks can be disconnected from the universal CLOCK_TAI reality for various reasons including regularatory requirements for functional safety redundancy. The kernel so far only supports a single notion of time, which means that all clocks are correlated in frequency and only differ by offset to each other. Access to non-correlated PTP clocks has been available so far only through the file descriptor based "POSIX clock IDs", which are subject to locking and have to go all the way out to the hardware. The access is not only horribly slow, as it has to go all the way out to the NIC/PTP hardware, but that also prevents the kernel to read the time of such clocks e.g. from the network stack, where it is required for TSN networking both on the transmit and receive side unless the hardware provides offloading. The auxiliary clocks provide a mechanism to support arbitrary clocks which are not correlated to the system clock. This is not restricted to the PTP use case on purpose as there is no kernel side association of these clocks to a particular PTP device because that's a pure user space configuration decision. Having them independent allows to utilize them for other purposes and also enables them to be tested without hardware dependencies. To avoid pointless overhead these clocks have to be enabled individualy via a new sysfs interface to reduce the overhead to a single compare in the hotpath if they are enabled at the Kconfig level at all. These clocks utilize the existing timekeeping/NTP infrastructures, which has been made possible over the recent releases by incrementaly converting these infrastructures over from a single static instance to a multi-instance pointer based implementation without any performance regression reported. The auxiliary clocks provide the same "emulation" of a "correct" clock as the existing CLOCK_* variants do with an independent instance of data and provide the same steering mechanism through the existing sys_clock_adjtime() interface, which has been confirmed to work by the chronyd(8) maintainer. That allows to provide lockless kernel internal and VDSO support so that applications and kernel internal functionalities can access these clocks without restrictions and at the same performance as the existing system clocks. - Avoid double notifications in the adjtimex() syscall. Not a big issue, but a trivial to avoid latency source. * tag 'timers-ptp-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) vdso/gettimeofday: Add support for auxiliary clocks vdso/vsyscall: Update auxiliary clock data in the datapage vdso: Introduce aux_clock_resolution_ns() vdso/gettimeofday: Introduce vdso_get_timestamp() vdso/gettimeofday: Introduce vdso_set_timespec() vdso/gettimeofday: Introduce vdso_clockid_valid() vdso/gettimeofday: Return bool from clock_gettime() helpers vdso/gettimeofday: Return bool from clock_getres() helpers vdso/helpers: Add helpers for seqlocks of single vdso_clock vdso/vsyscall: Split up __arch_update_vsyscall() into __arch_update_vdso_clock() vdso/vsyscall: Introduce a helper to fill clock configurations timekeeping: Remove the temporary CLOCK_AUX workaround timekeeping: Provide ktime_get_clock_ts64() timekeeping: Provide interface to control auxiliary clocks timekeeping: Provide update for auxiliary timekeepers timekeeping: Provide adjtimex() for auxiliary clocks timekeeping: Prepare do_adtimex() for auxiliary clocks timekeeping: Make do_adjtimex() reusable timekeeping: Add auxiliary clock support to __timekeeping_inject_offset() timekeeping: Make timekeeping_inject_offset() reusable ...
24 hoursMerge tag 'linux_kselftest-kunit-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: "Correct MODULE_IMPORT_NS() syntax documentation, make kunit_test timeout configurable via a module parameter and a Kconfig option, fix longest symbol length test, add a test for static stub, and adjust kunit_test timeout based on test_{suite,case} speed" * tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: fix longest symbol length test kunit: Make default kunit_test timeout configurable via both a module parameter and a Kconfig option kunit: Adjust kunit_test timeout based on test_{suite,case} speed kunit: Add test for static stub Documentation: kunit: Correct MODULE_IMPORT_NS() syntax
27 hoursMerge tag 'char-misc-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc / IIO / other driver updates from Greg KH: "Here is the big set of char/misc/iio and other smaller driver subsystems for 6.17-rc1. It's a big set this time around, with the huge majority being in the iio subsystem with new drivers and dts files being added there. Highlights include: - IIO driver updates, additions, and changes making more code const and cleaning up some init logic - bus_type constant conversion changes - misc device test functions added - rust miscdevice minor fixup - unused function removals for some drivers - mei driver updates - mhi driver updates - interconnect driver updates - Android binder updates and test infrastructure added - small cdx driver updates - small comedi fixes - small nvmem driver updates - small pps driver updates - some acrn virt driver fixes for printk messages - other small driver updates All of these have been in linux-next with no reported issues" * tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits) binder: Use seq_buf in binder_alloc kunit tests binder: Add copyright notice to new kunit files misc: ti_fpc202: Switch to of_fwnode_handle() bus: moxtet: Use dev_fwnode() pc104: move PC104 option to drivers/Kconfig drivers: virt: acrn: Don't use %pK through printk comedi: fix race between polling and detaching interconnect: qcom: Add Milos interconnect provider driver dt-bindings: interconnect: document the RPMh Network-On-Chip Interconnect in Qualcomm Milos SoC mei: more prints with client prefix mei: bus: use cldev in prints bus: mhi: host: pci_generic: Add Telit FN990B40 modem support bus: mhi: host: Detect events pointing to unexpected TREs bus: mhi: host: pci_generic: Add Foxconn T99W696 modem bus: mhi: host: Use str_true_false() helper bus: mhi: host: pci_generic: Add support for EM929x and set MRU to 32768 for better performance. bus: mhi: host: Fix endianness of BHI vector table bus: mhi: host: pci_generic: Disable runtime PM for QDU100 bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640 dt-bindings: interconnect: qcom,msm8998-bwmon: Allow 'nonposted-mmio' ...
43 hoursMerge tag 'libcrypto-tests-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library test updates from Eric Biggers: "Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512 library functions. These are the first KUnit tests for lib/crypto/. So in addition to being useful tests for these specific algorithms, they also establish some conventions for lib/crypto/ testing going forwards. The new tests are fairly comprehensive: more comprehensive than the generic crypto infrastructure's tests. They use a variety of techniques to check for the types of implementation bugs that tend to occur in the real world, rather than just naively checking some test vectors. (Interestingly, poly1305_kunit found a bug in QEMU) The core test logic is shared by all six algorithms, rather than being duplicated for each algorithm. Each algorithm's test suite also optionally includes a benchmark" * tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: tests: Annotate worker to be on stack lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1 lib/crypto: tests: Add KUnit tests for Poly1305 lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512 lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256 lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
43 hoursMerge tag 'libcrypto-updates-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library updates from Eric Biggers: "This is the main crypto library pull request for 6.17. The main focus this cycle is on reorganizing the SHA-1 and SHA-2 code, providing high-quality library APIs for SHA-1 and SHA-2 including HMAC support, and establishing conventions for lib/crypto/ going forward: - Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares most of the SHA-512 code) into lib/crypto/. This includes both the generic and architecture-optimized code. Greatly simplify how the architecture-optimized code is integrated. Add an easy-to-use library API for each SHA variant, including HMAC support. Finally, reimplement the crypto_shash support on top of the library API. - Apply the same reorganization to the SHA-256 code (and also SHA-224 which shares most of the SHA-256 code). This is a somewhat smaller change, due to my earlier work on SHA-256. But this brings in all the same additional improvements that I made for SHA-1 and SHA-512. There are also some smaller changes: - Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For these algorithms it's just a move, not a full reorganization yet. - Fix the MIPS chacha-core.S to build with the clang assembler. - Fix the Poly1305 functions to work in all contexts. - Fix a performance regression in the x86_64 Poly1305 code. - Clean up the x86_64 SHA-NI optimized SHA-1 assembly code. Note that since the new organization of the SHA code is much simpler, the diffstat of this pull request is negative, despite the addition of new fully-documented library APIs for multiple SHA and HMAC-SHA variants. These APIs will allow further simplifications across the kernel as users start using them instead of the old-school crypto API. (I've already written a lot of such conversion patches, removing over 1000 more lines of code. But most of those will target 6.18 or later)" * tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits) lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils lib/crypto: x86/sha1-ni: Convert to use rounds macros lib/crypto: x86/sha1-ni: Minor optimizations and cleanup crypto: sha1 - Remove sha1_base.h lib/crypto: x86/sha1: Migrate optimized code into library lib/crypto: sparc/sha1: Migrate optimized code into library lib/crypto: s390/sha1: Migrate optimized code into library lib/crypto: powerpc/sha1: Migrate optimized code into library lib/crypto: mips/sha1: Migrate optimized code into library lib/crypto: arm64/sha1: Migrate optimized code into library lib/crypto: arm/sha1: Migrate optimized code into library crypto: sha1 - Use same state format as legacy drivers crypto: sha1 - Wrap library and add HMAC support lib/crypto: sha1: Add HMAC support lib/crypto: sha1: Add SHA-1 library functions lib/crypto: sha1: Rename sha1_init() to sha1_init_raw() crypto: x86/sha1 - Rename conflicting symbol lib/crypto: sha2: Add hmac_sha*_init_usingrawkey() lib/crypto: arm/poly1305: Remove unneeded empty weak function lib/crypto: x86/poly1305: Fix performance regression on short messages ...
43 hoursMerge tag 'crc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: - Reorganize the architecture-optimized CRC code It now lives in lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/, and it is no longer artificially split into separate generic and arch modules. This allows better inlining and dead code elimination The generic CRC code is also no longer exported, simplifying the API. (This mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/, which can be found in the "Crypto library updates" pull request) - Improve crc32c() performance on newer x86_64 CPUs on long messages by enabling the VPCLMULQDQ optimized code - Simplify the crypto_shash wrappers for crc32_le() and crc32c() Register just one shash algorithm for each that uses the (fully optimized) library functions, instead of unnecessarily providing direct access to the generic CRC code - Remove unused and obsolete drivers for hardware CRC engines - Remove CRC-32 combination functions that are no longer used - Add kerneldoc for crc32_le(), crc32_be(), and crc32c() - Convert the crc32() macro to an inline function * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (26 commits) lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial lib/crc: x86: Reorganize crc-pclmul static_call initialization lib/crc: crc64: Add include/linux/crc64.h to kernel-api.rst lib/crc: crc32: Change crc32() from macro to inline function and remove cast nvmem: layouts: Switch from crc32() to crc32_le() lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c() lib/crc: Explicitly include <linux/export.h> lib/crc: Remove ARCH_HAS_* kconfig symbols lib/crc: x86: Migrate optimized CRC code into lib/crc/ lib/crc: sparc: Migrate optimized CRC code into lib/crc/ lib/crc: s390: Migrate optimized CRC code into lib/crc/ lib/crc: riscv: Migrate optimized CRC code into lib/crc/ lib/crc: powerpc: Migrate optimized CRC code into lib/crc/ lib/crc: mips: Migrate optimized CRC code into lib/crc/ lib/crc: loongarch: Migrate optimized CRC code into lib/crc/ lib/crc: arm64: Migrate optimized CRC code into lib/crc/ lib/crc: arm: Migrate optimized CRC code into lib/crc/ lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/ lib/crc: Move files into lib/crc/ lib/crc32: Remove unused combination support ...
44 hoursMerge tag 'hardening-v6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - Introduce and start using TRAILING_OVERLAP() helper for fixing embedded flex array instances (Gustavo A. R. Silva) - mux: Convert mux_control_ops to a flex array member in mux_chip (Thorsten Blum) - string: Group str_has_prefix() and strstarts() (Andy Shevchenko) - Remove KCOV instrumentation from __init and __head (Ritesh Harjani, Kees Cook) - Refactor and rename stackleak feature to support Clang - Add KUnit test for seq_buf API - Fix KUnit fortify test under LTO * tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits) sched/task_stack: Add missing const qualifier to end_of_stack() kstack_erase: Support Clang stack depth tracking kstack_erase: Add -mgeneral-regs-only to silence Clang warnings init.h: Disable sanitizer coverage for __init and __head kstack_erase: Disable kstack_erase for all of arm compressed boot code x86: Handle KCOV __init vs inline mismatches arm64: Handle KCOV __init vs inline mismatches s390: Handle KCOV __init vs inline mismatches arm: Handle KCOV __init vs inline mismatches mips: Handle KCOV __init vs inline mismatch powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON configs/hardening: Enable CONFIG_KSTACK_ERASE stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth stackleak: Rename STACKLEAK to KSTACK_ERASE seq_buf: Introduce KUnit tests string: Group str_has_prefix() and strstarts() kunit/fortify: Add back "volatile" for sizeof() constants acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings ...
44 hoursMerge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - MD pull request via Yu: - call del_gendisk synchronously (Xiao) - cleanup unused variable (John) - cleanup workqueue flags (Ryo) - fix faulty rdev can't be removed during resync (Qixing) - NVMe pull request via Christoph: - try PCIe function level reset on init failure (Keith Busch) - log TLS handshake failures at error level (Maurizio Lombardi) - pci-epf: do not complete commands twice if nvmet_req_init() fails (Rick Wertenbroek) - misc cleanups (Alok Tiwari) - Removal of the pktcdvd driver This has been more than a decade coming at this point, and some recently revealed breakages that had it causing issues even for cases where it isn't required made me re-pull the trigger on this one. It's known broken and nobody has stepped up to maintain the code - Series for ublk supporting batch commands, enabling the use of multishot where appropriate - Speed up ublk exit handling - Fix for the two-stage elevator fixing which could leak data - Convert NVMe to use the new IOVA based API - Increase default max transfer size to something more reasonable - Series fixing write operations on zoned DM devices - Add tracepoints for zoned block device operations - Prep series working towards improving blk-mq queue management in the presence of isolated CPUs - Don't allow updating of the block size of a loop device that is currently under exclusively ownership/open - Set chunk sectors from stacked device stripe size and use it for the atomic write size limit - Switch to folios in bcache read_super() - Fix for CD-ROM MRW exit flush handling - Various tweaks, fixes, and cleanups * tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits) block: restore two stage elevator switch while running nr_hw_queue update cdrom: Call cdrom_mrw_exit from cdrom_release function sunvdc: Balance device refcount in vdc_port_mpgroup_check nvme-pci: try function level reset on init failure dm: split write BIOs on zone boundaries when zone append is not emulated block: use chunk_sectors when evaluating stacked atomic write limits dm-stripe: limit chunk_sectors to the stripe size md/raid10: set chunk_sectors limit md/raid0: set chunk_sectors limit block: sanitize chunk_sectors for atomic write limits ilog2: add max_pow_of_two_factor() nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails nvme-tcp: log TLS handshake failures at error level docs: nvme: fix grammar in nvme-pci-endpoint-target.rst nvme: fix typo in status code constant for self-test in progress nvmet: remove redundant assignment of error code in nvmet_ns_enable() nvme: fix incorrect variable in io cqes error message nvme: fix multiple spelling and grammar issues in host drivers block: fix blk_zone_append_update_request_bio() kernel-doc md/raid10: fix set but not used variable in sync_request_write() ...
7 daysuevent: mv uevent_helper into kobject_uevent.cJoel Granados
Move both uevent_helper table into lib/kobject_uevent.c. Place the registration early in the initcall order with postcore_initcall. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
9 daysstackleak: Rename STACKLEAK to KSTACK_ERASEKees Cook
In preparation for adding Clang sanitizer coverage stack depth tracking that can support stack depth callbacks: - Add the new top-level CONFIG_KSTACK_ERASE option which will be implemented either with the stackleak GCC plugin, or with the Clang stack depth callback support. - Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE, but keep it for anything specific to the GCC plugin itself. - Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named for what it does rather than what it protects against), but leave as many of the internals alone as possible to avoid even more churn. While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS, since that's the only place it is referenced from. Suggested-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
9 dayslib/crypto: tests: Annotate worker to be on stackGuenter Roeck
The following warning traceback is seen if object debugging is enabled with the new crypto test code. ODEBUG: object 9000000106237c50 is on stack 9000000106234000, but NOT annotated. ------------[ cut here ]------------ WARNING: lib/debugobjects.c:655 at lookup_object_or_alloc.part.0+0x19c/0x1f4, CPU#0: kunit_try_catch/468 ... This also results in a boot stall when running the code in qemu:loongarch. Initializing the worker with INIT_WORK_ONSTACK() fixes the problem. Fixes: 950a81224e8b ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250721231917.3182029-1-linux@roeck-us.net Signed-off-by: Eric Biggers <ebiggers@kernel.org>
10 dayslib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutilsEric Biggers
Now that the oldest supported binutils version is 2.30, the macros that emit the SHA-512 instructions as '.inst' words are no longer needed. So drop them. No change in the generated machine code. Changed from the original patch by Ard Biesheuvel: (https://lore.kernel.org/r/20250515142702.2592942-2-ardb+git@google.com): - Reduced scope to just SHA-512 - Added comment that explains why "sha3" is used instead of "sha2" Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250718220706.475240-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
10 dayslib/crypto: x86/sha1-ni: Convert to use rounds macrosEric Biggers
The assembly code that does all 80 rounds of SHA-1 is highly repetitive. Replace it with 20 expansions of a macro that does 4 rounds, using the macro arguments and .if directives to handle the slight variations between rounds. This reduces the length of sha1-ni-asm.S by 129 lines while still producing the exact same object file. This mirrors sha256-ni-asm.S which uses this same strategy. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250718191900.42877-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
10 dayslib/crypto: x86/sha1-ni: Minor optimizations and cleanupEric Biggers
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the stack. There are plenty of unused XMM registers here, so there is no reason to spill to the stack. (While 32-bit code is limited to %xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.) - Remove the unnecessary check for nblocks == 0. sha1_ni_transform() is always passed a positive nblocks. - To get an XMM register with 'e' in the high dword and the rest zeroes, just zeroize the register using pxor, then load 'e'. Previously the code loaded 'e', then zeroized the lower dwords by AND-ing with a constant, which was slightly less efficient. - Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when DATA_PTR reaches that value, instead just decrement NBLOCKS on each iteration and stop when it reaches 0. This is fewer instructions. - Rename DIGEST_PTR to STATE_PTR. It points to the SHA-1 internal state, not a SHA-1 digest value. This commit shrinks the code size of sha1_ni_transform() from 624 bytes to 589 bytes and also shrinks rodata by 16 bytes. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
10 dayslib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficialEric Biggers
Improve crc32c() performance on lengths >= 512 bytes by using crc32_lsb_vpclmul_avx512() instead of crc32c_x86_3way(), when the CPU supports VPCLMULQDQ and has a "good" implementation of AVX-512. For now that means AMD Zen 4 and later, and Intel Sapphire Rapids and later. Pass crc32_lsb_vpclmul_avx512() the table of constants needed to make it use the CRC-32C polynomial. Rationale: VPCLMULQDQ performance has improved on newer CPUs, making crc32_lsb_vpclmul_avx512() faster than crc32c_x86_3way(), even though crc32_lsb_vpclmul_avx512() is designed for generic 32-bit CRCs and does not utilize x86_64's dedicated CRC-32C instructions. Performance results for len=4096 using crc_kunit: CPU Before (MB/s) After (MB/s) ====================== ============= ============ AMD Zen 4 (Genoa) 19868 28618 AMD Zen 5 (Ryzen AI 9 365) 24080 46940 AMD Zen 5 (Turin) 29566 58468 Intel Sapphire Rapids 22340 73794 Intel Emerald Rapids 24696 78666 Performance results for len=512 using crc_kunit: CPU Before (MB/s) After (MB/s) ====================== ============= ============ AMD Zen 4 (Genoa) 7251 7758 AMD Zen 5 (Ryzen AI 9 365) 17481 19135 AMD Zen 5 (Turin) 21332 25424 Intel Sapphire Rapids 18886 29312 Intel Emerald Rapids 19675 29045 That being said, in the above benchmarks the ZMM registers are "warm", so they don't quite tell the whole story. While significantly improved from older Intel CPUs, Intel still has ~2000 ns of ZMM warm-up time where 512-bit instructions execute 4 times more slowly than they normally do. In contrast, AMD does better and has virtually zero ZMM warm-up time (at most ~60 ns). Thus, while this change is always beneficial on AMD, strictly speaking there are cases in which it is not beneficial on Intel, e.g. a small number of 512-byte messages with "cold" ZMM registers. But typically, it is beneficial even on Intel. Note that on AMD Zen 3--5, crc32c() performance could be further improved with implementations that interleave crc32q and VPCLMULQDQ instructions. Unfortunately, it appears that a different such implementation would be optimal on *each* of these microarchitectures. Such improvements are left for future work. This commit just improves the way that we choose the implementations we already have. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250719224938.126512-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
10 dayslib/crc: x86: Reorganize crc-pclmul static_call initializationEric Biggers
Reorganize the crc-pclmul static_call initialization to place more of the logic in the *_mod_init_arch() functions instead of in the INIT_CRC_PCLMUL macro. This provides the flexibility to do more than a single static_call update for each CPU feature check. Right away, optimize crc64_mod_init_arch() to check the CPU features just once instead of twice, doing both the crc64_msb and crc64_lsb static_call updates together. A later commit will also use this to initialize an additional static_key when crc32_lsb_vpclmul_avx512() is enabled. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250719224938.126512-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
11 daysseq_buf: Introduce KUnit testsKees Cook
Add KUnit tests for the seq_buf API to ensure its correctness and prevent future regressions, covering the following functions: - seq_buf_init() - DECLARE_SEQ_BUF() - seq_buf_clear() - seq_buf_puts() - seq_buf_putc() - seq_buf_printf() - seq_buf_get_buf() - seq_buf_commit() $ tools/testing/kunit/kunit.py run seq_buf =================== seq_buf (9 subtests) =================== [PASSED] seq_buf_init_test [PASSED] seq_buf_declare_test [PASSED] seq_buf_clear_test [PASSED] seq_buf_puts_test [PASSED] seq_buf_puts_overflow_test [PASSED] seq_buf_putc_test [PASSED] seq_buf_printf_test [PASSED] seq_buf_printf_overflow_test [PASSED] seq_buf_get_buf_commit_test ===================== [PASSED] seq_buf ===================== Link: https://lore.kernel.org/r/20250717085156.work.363-kees@kernel.org Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Kees Cook <kees@kernel.org>
12 daysvdso/gettimeofday: Add support for auxiliary clocksThomas Weißschuh
Expose the auxiliary clocks through the vDSO. Architectures not using the generic vDSO time framework, namely SPARC64, are not supported. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-12-df7d9f87b9b8@linutronix.de
12 daysvdso/gettimeofday: Introduce vdso_get_timestamp()Thomas Weißschuh
This code is duplicated and with the introduction of auxiliary clocks will be duplicated even more. Introduce a helper. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-9-df7d9f87b9b8@linutronix.de
12 daysvdso/gettimeofday: Introduce vdso_set_timespec()Thomas Weißschuh
This code is duplicated and with the introduction of auxiliary clocks will be duplicated even more. Introduce a helper. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-8-df7d9f87b9b8@linutronix.de
12 daysvdso/gettimeofday: Introduce vdso_clockid_valid()Thomas Weißschuh
Move the clock ID validation check into a common helper. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-7-df7d9f87b9b8@linutronix.de
12 daysvdso/gettimeofday: Return bool from clock_gettime() helpersThomas Weißschuh
The internal helpers are effectively using boolean results, while pretending to use error numbers. Switch the return type to bool for more clarity. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-6-df7d9f87b9b8@linutronix.de
2025-07-16kunit: test: Export kunit_attach_mm()Tiffany Yang
Tests can allocate from virtual memory using kunit_vm_mmap(), which transparently creates and attaches an mm_struct to the test runner if one is not already attached. This is suitable for most cases, except for when the code under test must access a task's mm before performing an mmap. Expose kunit_attach_mm() as part of the interface for those cases. This does not change the existing behavior. Cc: David Gow <davidgow@google.com> Signed-off-by: Tiffany Yang <ynaffit@google.com> Reviewed-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20250714185321.2417234-4-ynaffit@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-14kunit/fortify: Add back "volatile" for sizeof() constantsKees Cook
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant is coming from sizeof. Adding "volatile" back to these variables solves this false positive without reintroducing the issues that originally led to switching to OPTIMIZER_HIDE_VAR in the first place[1]. Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1] Cc: Jannik Glückert <jannik.glueckert@gmail.com> Suggested-by: Nathan Chancellor <nathan@kernel.org> Fixes: 6ee149f61bcc ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()") Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-14lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1Eric Biggers
Add a KUnit test suite for the SHA-1 library functions, including the corresponding HMAC support. The core test logic is in the previously-added hash-test-template.h. This commit just adds the actual KUnit suite, and it adds the generated test vectors to the tree so that gen-hash-testvecs.py won't have to be run at build time. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-16-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: tests: Add KUnit tests for Poly1305Eric Biggers
Add a KUnit test suite for the Poly1305 functions. Most of its test cases are instantiated from hash-test-template.h, which is also used by the SHA-2 tests. A couple additional test cases are also included to test edge cases specific to Poly1305. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250709200112.258500-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512Eric Biggers
Add KUnit test suites for the SHA-384 and SHA-512 library functions, including the corresponding HMAC support. The core test logic is in the previously-added hash-test-template.h. This commit just adds the actual KUnit suites, and it adds the generated test vectors to the tree so that gen-hash-testvecs.py won't have to be run at build time. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250709200112.258500-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256Eric Biggers
Add KUnit test suites for the SHA-224 and SHA-256 library functions, including the corresponding HMAC support. The core test logic is in the previously-added hash-test-template.h. This commit just adds the actual KUnit suites, and it adds the generated test vectors to the tree so that gen-hash-testvecs.py won't have to be run at build time. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.pyEric Biggers
Add hash-test-template.h which generates the following KUnit test cases for hash functions: test_hash_test_vectors test_hash_all_lens_up_to_4096 test_hash_incremental_updates test_hash_buffer_overruns test_hash_overlaps test_hash_alignment_consistency test_hash_ctx_zeroization test_hash_interrupt_context_1 test_hash_interrupt_context_2 test_hmac (when HMAC is supported) benchmark_hash (when CONFIG_CRYPTO_LIB_BENCHMARK=y) The initial use cases for this will be sha224_kunit, sha256_kunit, sha384_kunit, sha512_kunit, and poly1305_kunit. Add a Python script gen-hash-testvecs.py which generates the test vectors required by test_hash_test_vectors, test_hash_all_lens_up_to_4096, and test_hmac. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250709200112.258500-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: x86/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the x86-optimized SHA-1 code via x86-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be x86-optimized, and it fixes the longstanding issue where the x86-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. To match sha1_blocks(), change the type of the nblocks parameter of the assembly functions from int to size_t. The assembly functions actually already treated it as size_t. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: sparc/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be sparc-optimized, and it fixes the longstanding issue where the sparc-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. Note: to see the diff from arch/sparc/crypto/sha1_glue.c to lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: s390/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the s390-optimized SHA-1 code via s390-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be s390-optimized, and it fixes the longstanding issue where the s390-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: powerpc/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the powerpc-optimized SHA-1 code via powerpc-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be powerpc-optimized, and it fixes the longstanding issue where the powerpc-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: mips/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the mips-optimized SHA-1 code via mips-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be mips-optimized, and it fixes the longstanding issue where the mips-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: arm64/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be arm64-optimized, and it fixes the longstanding issue where the arm64-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. Remove support for SHA-1 finalization from assembly code, since the library does not yet support architecture-specific overrides of the finalization. (Support for that has been omitted for now, for simplicity and because usually it isn't performance-critical.) To match sha1_blocks(), change the type of the nblocks parameter and the return value of __sha1_ce_transform() from int to size_t. Update the assembly code accordingly. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: arm/sha1: Migrate optimized code into libraryEric Biggers
Instead of exposing the arm-optimized SHA-1 code via arm-specific crypto_shash algorithms, instead just implement the sha1_blocks() library function. This is much simpler, it makes the SHA-1 library functions be arm-optimized, and it fixes the longstanding issue where the arm-optimized SHA-1 code was disabled by default. SHA-1 still remains available through crypto_shash, but individual architectures no longer need to handle it. To match sha1_blocks(), change the type of the nblocks parameter of the assembly functions from int to size_t. The assembly functions actually already treated it as size_t. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: sha1: Add HMAC supportEric Biggers
Add HMAC support to the SHA-1 library, again following what was done for SHA-2. Besides providing the basis for a more streamlined "hmac(sha1)" shash, this will also be useful for multiple in-kernel users such as net/sctp/auth.c, net/ipv6/seg6_hmac.c, and security/keys/trusted-keys/trusted_tpm1.c. Those are currently using crypto_shash, but using the library functions would be much simpler. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: sha1: Add SHA-1 library functionsEric Biggers
Add a library interface for SHA-1, following the SHA-2 one. As was the case with SHA-2, this will be useful for various in-kernel users. The crypto_shash interface will be reimplemented on top of it as well. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()Eric Biggers
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts with the upcoming library function. This will later be removed, but this keeps the kernel building for the introduction of the library. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()Eric Biggers
While the HMAC library functions support both incremental and one-shot computation and both prepared and raw keys, the combination of raw key + incremental was missing. It turns out that several potential users of the HMAC library functions (tpm2-sessions.c, smb2transport.c, trusted_tpm1.c) want exactly that. Therefore, add the missing functions hmac_sha*_init_usingrawkey(). Implement them in an optimized way that directly initializes the HMAC context without a separate key preparation step. Reimplement the one-shot raw key functions hmac_sha*_usingrawkey() on top of the new functions, which makes them a bit more efficient. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250711215844.41715-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14lib/crypto: arm/poly1305: Remove unneeded empty weak functionEric Biggers
Fix poly1305-armv4.pl to not do '.globl poly1305_blocks_neon' when poly1305_blocks_neon() is not defined. Then, remove the empty __weak definition of poly1305_blocks_neon(), which was still needed only because of that unnecessary globl statement. (It also used to be needed because the compiler could generate calls to it when CONFIG_KERNEL_MODE_NEON=n, but that has been fixed.) Thanks to Arnd Bergmann for reporting that the globl statement in the asm file was still depending on the weak symbol. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250711212822.6372-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14Merge branch 'tip/sched/urgent'Peter Zijlstra
Avoid merge conflicts Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2025-07-11lib/crypto: x86/poly1305: Fix performance regression on short messagesEric Biggers
Restore the len >= 288 condition on using the AVX implementation, which was incidentally removed by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface"). This check took into account the overhead in key power computation, kernel-mode "FPU", and tail handling associated with the AVX code. Indeed, restoring this check slightly improves performance for len < 256 as measured using poly1305_kunit on an "AMD Ryzen AI 9 365" (Zen 5) CPU: Length Before After ====== ========== ========== 1 30 MB/s 36 MB/s 16 516 MB/s 598 MB/s 64 1700 MB/s 1882 MB/s 127 2265 MB/s 2651 MB/s 128 2457 MB/s 2827 MB/s 200 2702 MB/s 3238 MB/s 256 3841 MB/s 3768 MB/s 511 4580 MB/s 4585 MB/s 512 5430 MB/s 5398 MB/s 1024 7268 MB/s 7305 MB/s 3173 8999 MB/s 8948 MB/s 4096 9942 MB/s 9921 MB/s 16384 10557 MB/s 10545 MB/s While the optimal threshold for this CPU might be slightly lower than 288 (see the len == 256 case), other CPUs would need to be tested too, and these sorts of benchmarks can underestimate the true cost of kernel-mode "FPU". Therefore, for now just restore the 288 threshold. Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface") Cc: stable@vger.kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11lib/crypto: x86/poly1305: Fix register corruption in no-SIMD contextsEric Biggers
Restore the SIMD usability check and base conversion that were removed by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface"). This safety check is cheap and is well worth eliminating a footgun. While the Poly1305 functions should not be called when SIMD registers are unusable, if they are anyway, they should just do the right thing instead of corrupting random tasks' registers and/or computing incorrect MACs. Fixing this is also needed for poly1305_kunit to pass. Just use irq_fpu_usable() instead of the original crypto_simd_usable(), since poly1305_kunit won't rely on crypto_simd_disabled_for_test. Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface") Cc: stable@vger.kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11lib/crypto: arm64/poly1305: Fix register corruption in no-SIMD contextsEric Biggers
Restore the SIMD usability check that was removed by commit a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface"). This safety check is cheap and is well worth eliminating a footgun. While the Poly1305 functions should not be called when SIMD registers are unusable, if they are anyway, they should just do the right thing instead of corrupting random tasks' registers and/or computing incorrect MACs. Fixing this is also needed for poly1305_kunit to pass. Just use may_use_simd() instead of the original crypto_simd_usable(), since poly1305_kunit won't rely on crypto_simd_disabled_for_test. Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface") Cc: stable@vger.kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-11lib/crypto: arm/poly1305: Fix register corruption in no-SIMD contextsEric Biggers
Restore the SIMD usability check that was removed by commit 773426f4771b ("crypto: arm/poly1305 - Add block-only interface"). This safety check is cheap and is well worth eliminating a footgun. While the Poly1305 functions should not be called when SIMD registers are unusable, if they are anyway, they should just do the right thing instead of corrupting random tasks' registers and/or computing incorrect MACs. Fixing this is also needed for poly1305_kunit to pass. Just use may_use_simd() instead of the original crypto_simd_usable(), since poly1305_kunit won't rely on crypto_simd_disabled_for_test. Fixes: 773426f4771b ("crypto: arm/poly1305 - Add block-only interface") Cc: stable@vger.kernel.org Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250706231100.176113-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>