summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-26init.h: Disable sanitizer coverage for __init and __headKees Cook
While __noinstr already contained __no_sanitize_coverage, it needs to be added to __init and __head section markings to support the Clang implementation of CONFIG_KSTACK_ERASE. This is to make sure the stack depth tracking callback is not executed in unsupported contexts. The other sanitizer coverage options (trace-pc and trace-cmp) aren't needed in __head nor __init either ("We are interested in code coverage as a function of a syscall inputs"[1]), so this is fine to disable for them as well. Link: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcov.c?h=v6.14#n179 [1] Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20250724055029.3623499-3-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26kstack_erase: Disable kstack_erase for all of arm compressed boot codeKees Cook
When building with CONFIG_KSTACK_ERASE=y and CONFIG_ARM_ATAG_DTB_COMPAT=y, the compressed boot environment encounters an undefined symbol error: ld.lld: error: undefined symbol: __sanitizer_cov_stack_depth >>> referenced by atags_to_fdt.c:135 This occurs because the compiler instruments the atags_to_fdt() function with sanitizer coverage calls, but the minimal compressed boot environment lacks access to sanitizer runtime support. The compressed boot environment already disables stack protector with -fno-stack-protector. Similarly disable sanitizer coverage by adding $(DISABLE_KSTACK_ERASE) to the general compiler flags (and remove it from the one place it was noticed before), which contains the appropriate flags to prevent sanitizer instrumentation. This follows the same pattern used in other early boot contexts where sanitizer runtime support is unavailable. Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYtBk8qnpWvoaFwymCx5s5i-5KXtPGpmf=_+UKJddCOnLA@mail.gmail.com Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162 Suggested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-24x86: Handle KCOV __init vs inline mismatchesKees Cook
GCC appears to have kind of fragile inlining heuristics, in the sense that it can change whether or not it inlines something based on optimizations. It looks like the kcov instrumentation being added (or in this case, removed) from a function changes the optimization results, and some functions marked "inline" are _not_ inlined. In that case, we end up with __init code calling a function not marked __init, and we get the build warnings I'm trying to eliminate in the coming patch that adds __no_sanitize_coverage to __init functions: WARNING: modpost: vmlinux: section mismatch in reference: xbc_exit+0x8 (section: .text.unlikely) -> _xbc_exit (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: real_mode_size_needed+0x15 (section: .text.unlikely) -> real_mode_blob_end (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: __set_percpu_decrypted+0x16 (section: .text.unlikely) -> early_set_memory_decrypted (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: memblock_alloc_from+0x26 (section: .text.unlikely) -> memblock_alloc_try_nid (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_set_root_pointer+0xc (section: .text.unlikely) -> x86_init (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_get_root_pointer+0x8 (section: .text.unlikely) -> x86_init (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: efi_config_table_is_usable+0x16 (section: .text.unlikely) -> xen_efi_config_table_is_usable (section: .init.text) This problem is somewhat fragile (though using either __always_inline or __init will deterministically solve it), but we've tripped over this before with GCC and the solution has usually been to just use __always_inline and move on. For x86 this means forcing several functions to be inline with __always_inline. Link: https://lore.kernel.org/r/20250724055029.3623499-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-24arm64: Handle KCOV __init vs inline mismatchesKees Cook
GCC appears to have kind of fragile inlining heuristics, in the sense that it can change whether or not it inlines something based on optimizations. It looks like the kcov instrumentation being added (or in this case, removed) from a function changes the optimization results, and some functions marked "inline" are _not_ inlined. In that case, we end up with __init code calling a function not marked __init, and we get the build warnings I'm trying to eliminate in the coming patch that adds __no_sanitize_coverage to __init functions: WARNING: modpost: vmlinux: section mismatch in reference: acpi_get_enable_method+0x1c (section: .text.unlikely) -> acpi_psci_present (section: .init.text) This problem is somewhat fragile (though using either __always_inline or __init will deterministically solve it), but we've tripped over this before with GCC and the solution has usually been to just use __always_inline and move on. For arm64 this requires forcing one ACPI function to be inlined with __always_inline. Link: https://lore.kernel.org/r/20250724055029.3623499-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21s390: Handle KCOV __init vs inline mismatchesKees Cook
When KCOV is enabled all functions get instrumented, unless the __no_sanitize_coverage attribute is used. To prepare for __no_sanitize_coverage being applied to __init functions, we have to handle differences in how GCC's inline optimizations get resolved. For s390 this exposed a place where the __init annotation was missing but ended up being "accidentally correct". Fix this cases and force a couple functions to be inline with __always_inline. Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20250717232519.2984886-7-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21arm: Handle KCOV __init vs inline mismatchesKees Cook
When KCOV is enabled all functions get instrumented, unless the __no_sanitize_coverage attribute is used. To prepare for __no_sanitize_coverage being applied to __init functions, we have to handle differences in how GCC's inline optimizations get resolved. For arm this exposed several places where __init annotations were missing but ended up being "accidentally correct". Fix these cases and force several functions to be inline with __always_inline. Acked-by: Nishanth Menon <nm@ti.com> Acked-by: Lee Jones <lee@kernel.org> Reviewed-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/r/20250717232519.2984886-5-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21mips: Handle KCOV __init vs inline mismatchKees Cook
When KCOV is enabled all functions get instrumented, unless the __no_sanitize_coverage attribute is used. To prepare for __no_sanitize_coverage being applied to __init functions, we have to handle differences in how GCC's inline optimizations get resolved. For mips this requires adding the __init annotation on init_mips_clocksource(). Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20250717232519.2984886-9-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init ↵Ritesh Harjani (IBM)
section Move a few kfence and debug_pagealloc related functions in hash_utils.c and radix_pgtable.c to __init sections since these are only invoked once by an __init function during system initialization. i.e. - hash_debug_pagealloc_alloc_slots() - hash_kfence_alloc_pool() - hash_kfence_map_pool() The above 3 functions only gets called by __init htab_initialize(). - alloc_kfence_pool() - map_kfence_pool() The above 2 functions only gets called by __init radix_init_pgtable() This should also help fix warning msgs like: >> WARNING: modpost: vmlinux: section mismatch in reference: hash_debug_pagealloc_alloc_slots+0xb0 (section: .text) -> memblock_alloc_try_nid (section: .init.text) Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504190552.mnFGs5sj-lkp@intel.com/ Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20250717232519.2984886-8-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ONKees Cook
To reduce stale data lifetimes, enable CONFIG_INIT_ON_FREE_DEFAULT_ON as well. This matches the addition of CONFIG_STACKLEAK=y, which is doing similar for stack memory. Link: https://lore.kernel.org/r/20250717232519.2984886-13-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21configs/hardening: Enable CONFIG_KSTACK_ERASEKees Cook
Since we can wipe the stack with both Clang and GCC plugins, enable this for the "hardening.config" for wider testing. Link: https://lore.kernel.org/r/20250717232519.2984886-12-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGSKees Cook
In preparation for Clang stack depth tracking for KSTACK_ERASE, split the stackleak-specific cflags out of GCC_PLUGINS_CFLAGS into KSTACK_ERASE_CFLAGS. Link: https://lore.kernel.org/r/20250717232519.2984886-3-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depthKees Cook
The Clang stack depth tracking implementation has a fixed name for the stack depth tracking callback, "__sanitizer_cov_stack_depth", so rename the GCC plugin function to match since the plugin has no external dependencies on naming. Link: https://lore.kernel.org/r/20250717232519.2984886-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21stackleak: Rename STACKLEAK to KSTACK_ERASEKees Cook
In preparation for adding Clang sanitizer coverage stack depth tracking that can support stack depth callbacks: - Add the new top-level CONFIG_KSTACK_ERASE option which will be implemented either with the stackleak GCC plugin, or with the Clang stack depth callback support. - Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE, but keep it for anything specific to the GCC plugin itself. - Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named for what it does rather than what it protects against), but leave as many of the internals alone as possible to avoid even more churn. While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS, since that's the only place it is referenced from. Suggested-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-19seq_buf: Introduce KUnit testsKees Cook
Add KUnit tests for the seq_buf API to ensure its correctness and prevent future regressions, covering the following functions: - seq_buf_init() - DECLARE_SEQ_BUF() - seq_buf_clear() - seq_buf_puts() - seq_buf_putc() - seq_buf_printf() - seq_buf_get_buf() - seq_buf_commit() $ tools/testing/kunit/kunit.py run seq_buf =================== seq_buf (9 subtests) =================== [PASSED] seq_buf_init_test [PASSED] seq_buf_declare_test [PASSED] seq_buf_clear_test [PASSED] seq_buf_puts_test [PASSED] seq_buf_puts_overflow_test [PASSED] seq_buf_putc_test [PASSED] seq_buf_printf_test [PASSED] seq_buf_printf_overflow_test [PASSED] seq_buf_get_buf_commit_test ===================== [PASSED] seq_buf ===================== Link: https://lore.kernel.org/r/20250717085156.work.363-kees@kernel.org Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-17string: Group str_has_prefix() and strstarts()Andy Shevchenko
The two str_has_prefix() and strstarts() are about the same with a slight difference on what they return. Group them in the header. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250711085514.1294428-1-andriy.shevchenko@linux.intel.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-14kunit/fortify: Add back "volatile" for sizeof() constantsKees Cook
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant is coming from sizeof. Adding "volatile" back to these variables solves this false positive without reintroducing the issues that originally led to switching to OPTIMIZER_HIDE_VAR in the first place[1]. Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1] Cc: Jannik Glückert <jannik.glueckert@gmail.com> Suggested-by: Nathan Chancellor <nathan@kernel.org> Fixes: 6ee149f61bcc ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()") Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-27acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warningsGustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the new TRAILING_OVERLAP() helper to fix a dozen instances of the following type of warning: drivers/acpi/nfit/intel.c:692:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Acked-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/aF7pF4kej8VQapyR@kspp Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-18stddef: Introduce TRAILING_OVERLAP() helper macroGustavo A. R. Silva
Add new TRAILING_OVERLAP() helper macro to create a union between a flexible-array member (FAM) and a set of members that would otherwise follow it. This overlays the trailing members onto the FAM while preserving the original memory layout. Co-developed-by: Kees Cook <kees@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/aFG8gEwKXAWWIvX0@kspp Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-18mux: Convert mux_control_ops to a flex array member in mux_chipThorsten Blum
Convert mux_control_ops to a flexible array member at the end of the mux_chip struct and add the __counted_by() compiler attribute to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Use struct_size() to calculate the number of bytes to allocate for a new mux chip and to remove the following Coccinelle/coccicheck warning: WARNING: Use struct_size Use size_add() to safely add any extra bytes. No functional changes intended. Link: https://github.com/KSPP/linux/issues/83 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250610104106.1948-2-thorsten.blum@linux.dev Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-15Linux 6.16-rc2Linus Torvalds
2025-06-15Merge tag 'kbuild-fixes-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Move warnings about linux/export.h from W=1 to W=2 - Fix structure type overrides in gendwarfksyms * tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: gendwarfksyms: Fix structure type overrides kbuild: move warnings about linux/export.h from W=1 to W=2
2025-06-16gendwarfksyms: Fix structure type overridesSami Tolvanen
As we always iterate through the entire die_map when expanding type strings, recursively processing referenced types in type_expand_child() is not actually necessary. Furthermore, the type_string kABI rule added in commit c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") can fail to override type strings for structures due to a missing kabi_get_type_string() check in this function. Fix the issue by dropping the unnecessary recursion and moving the override check to type_expand(). Note that symbol versions are otherwise unchanged with this patch. Fixes: c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") Reported-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-16kbuild: move warnings about linux/export.h from W=1 to W=2Masahiro Yamada
This hides excessive warnings, as nobody builds with W=2. Fixes: a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") Fixes: 7d95680d64ac ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com>
2025-06-14Merge tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - SMB3.1.1 POSIX extensions fix for char remapping - Fix for repeated directory listings when directory leases enabled - deferred close handle reuse fix * tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: improve directory cache reuse for readdir operations smb: client: fix perf regression with deferred closes smb: client: disable path remapping with POSIX extensions
2025-06-14Merge tag 'iommu-fixes-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fix from Joerg Roedel: - Fix PTE size calculation for NVidia Tegra * tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/tegra: Fix incorrect size calculation
2025-06-14Merge tag 'block-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for a deadlock on queue freeze with zoned writes - Fix for zoned append emulation - Two bio folio fixes, for sparsemem and for very large folios - Fix for a performance regression introduced in 6.13 when plug insertion was changed - Fix for NVMe passthrough handling for polled IO - Document the ublk auto registration feature - loop lockdep warning fix * tag 'block-6.16-20250614' of git://git.kernel.dk/linux: nvme: always punt polled uring_cmd end_io work to task_work Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists block: Fix bvec_set_folio() for very large folios bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP block: use plug request list tail for one-shot backmerge attempt block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completion ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG) loop: move lo_set_size() out of queue freeze
2025-06-14Merge tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix for a race between SQPOLL exit and fdinfo reading. It's slim and I was only able to reproduce this with an artificial delay in the kernel. Followup sparse fix as well to unify the access to ->thread. - Fix for multiple buffer peeking, avoiding truncation if possible. - Run local task_work for IOPOLL reaping when the ring is exiting. This currently isn't done due to an assumption that polled IO will never need task_work, but a fix on the block side is going to change that. * tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linux: io_uring: run local task_work from ring exit IOPOLL reaping io_uring/kbuf: don't truncate end buffer for multiple buffer peeks io_uring: consistently use rcu semantics with sqpoll thread io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-14Merge tag 'rust-fixes-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust fix from Miguel Ojeda: - 'hrtimer': fix future compile error when the 'impl_has_hr_timer!' macro starts to get called * tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: time: Fix compile error in impl_has_hr_timer macro
2025-06-14Merge tag 'mm-hotfixes-stable-2025-06-13-21-56' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "9 hotfixes. 3 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. Only 4 are for MM" * tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: add mmap_prepare() compatibility layer for nested file systems init: fix build warnings about export.h MAINTAINERS: add Barry as a THP reviewer drivers/rapidio/rio_cm.c: prevent possible heap overwrite mm: close theoretical race where stale TLB entries could linger mm/vma: reset VMA iterator on commit_merge() OOM failure docs: proc: update VmFlags documentation in smaps scatterlist: fix extraneous '@'-sign kernel-doc notation selftests/mm: skip failed memfd setups in gup_longterm
2025-06-13Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "All fixes for drivers. The core change in the error handler is simply to translate an ALUA specific sense code into a retry the ALUA components can handle and won't impact any other devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: error: alua: I/O errors for ALUA state transitions scsi: storvsc: Increase the timeouts to storvsc_timeout scsi: s390: zfcp: Ensure synchronous unit_add scsi: iscsi: Fix incorrect error path labels for flashnode operations scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registers scsi: core: ufs: Fix a hang in the error handler
2025-06-13Merge tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Quiet week, only two pull requests came my way, xe has a couple of fixes and then a bunch of fixes across the board, vc4 probably fixes the biggest problem: vc4: - Fix infinite EPROBE_DEFER loop in vc4 probing amdxdna: - Fix amdxdna firmware size meson: - modesetting fixes sitronix: - Kconfig fix for st7171-i2c dma-buf: - Fix -EBUSY WARN_ON_ONCE in dma-buf udmabuf: - Use dma_sync_sgtable_for_cpu in udmabuf xe: - Fix regression disallowing 64K SVM migration - Use a bounce buffer for WA BB" * tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/lrc: Use a temporary buffer for WA BB udmabuf: use sgtable-based scatterlist wrappers dma-buf: fix compare in WARN_ON_ONCE drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERS drm/meson: fix more rounding issues with 59.94Hz modes drm/meson: use vclk_freq instead of pixel_freq in debug print drm/meson: fix debug log statement when setting the HDMI clocks drm/vc4: fix infinite EPROBE_DEFER loop drm/xe/svm: Fix regression disallowing 64K SVM migration accel/amdxdna: Fix incorrect PSP firmware size
2025-06-13io_uring: run local task_work from ring exit IOPOLL reapingJens Axboe
In preparation for needing to shift NVMe passthrough to always use task_work for polled IO completions, ensure that those are suitably run at exit time. See commit: 9ce6c9875f3e ("nvme: always punt polled uring_cmd end_io work to task_work") for details on why that is necessary. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13nvme: always punt polled uring_cmd end_io work to task_workJens Axboe
Currently NVMe uring_cmd completions will complete locally, if they are polled. This is done because those completions are always invoked from task context. And while that is true, there's no guarantee that it's invoked under the right ring context, or even task. If someone does NVMe passthrough via multiple threads and with a limited number of poll queues, then ringA may find completions from ringB. For that case, completing the request may not be sound. Always just punt the passthrough completions via task_work, which will redirect the completion, if needed. Cc: stable@vger.kernel.org Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'acpi-6.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device, address an EC driver issue related to invalid ECDT tables, clean up the usage of mwait_idle_with_hints() in the ACPI PAD driver, add a new IRQ override quirk, and fix a NULL pointer dereference related to nosmp: - Update the faux device handling code in the driver core and address an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device on top of that (Dan Williams) - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak) - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui) - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf) - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan)" * tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: Use IRQ override on MACHENIKE 16P ACPI: EC: Ignore ECDT tables with an invalid ID string ACPI: CPPC: Fix NULL pointer dereference when nosmp is used ACPI: PAD: Update arguments of mwait_idle_with_hints() ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure driver core: faux: Quiet probe failures driver core: faux: Suppress bind attributes
2025-06-13Merge tag 'pm-6.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the cpupower utility installation, fix up the recently added Rust abstractions for cpufreq and OPP, restore the x86 update eliminating mwait_play_dead_cpuid_hint() that has been reverted during the 6.16 merge window along with preventing the failure caused by it from happening, and clean up mwait_idle_with_hints() usage in intel_idle: - Implement CpuId Rust abstraction and use it to fix doctest failure related to the recently introduced cpumask abstraction (Viresh Kumar) - Do minor cleanups in the `# Safety` sections for cpufreq abstractions added recently (Viresh Kumar) - Unbreak cpupower systemd service units installation on some systems by adding a unitdir variable for specifying the location to install them (Francesco Poli) - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the 6.16 merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki) - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak)" * tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: rust: cpu: Add CpuId::current() to retrieve current CPU ID rust: Use CpuId in place of raw CPU numbers rust: cpu: Introduce CpuId abstraction intel_idle: Update arguments of mwait_idle_with_hints() cpufreq: Convert `/// SAFETY` lines to `# Safety` sections cpupower: split unitdir from libdir in Makefile Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branches 'acpi-pad', 'acpi-cppc', 'acpi-ec' and 'acpi-resource'Rafael J. Wysocki
Merge assorted ACPI updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak). - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui). - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf). - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan). * acpi-pad: ACPI: PAD: Update arguments of mwait_idle_with_hints() * acpi-cppc: ACPI: CPPC: Fix NULL pointer dereference when nosmp is used * acpi-ec: ACPI: EC: Ignore ECDT tables with an invalid ID string * acpi-resource: ACPI: resource: Use IRQ override on MACHENIKE 16P
2025-06-13Merge branch 'pm-cpuidle'Rafael J. Wysocki
Merge cpuidle updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak). - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki). * pm-cpuidle: intel_idle: Update arguments of mwait_idle_with_hints() Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branch 'pm-tools'Rafael J. Wysocki
Merge a cpupower utility fix for 6.16-rc2 that unbreaks systemd service units installation on some sysems (Francesco Poli). * pm-tools: cpupower: split unitdir from libdir in Makefile
2025-06-13Merge tag 'spi-fix-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A collection of driver specific fixes, most minor apart from the OMAP ones which disable some recent performance optimisations in some non-standard cases where we could start driving the bus incorrectly. The change to the stm32-ospi driver to use the newer reset APIs is a fix for interactions with other IP sharing the same reset line in some SoCs" * tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine spi: stm32-ospi: clean up on error in probe() spi: stm32-ospi: Make usage of reset_control_acquire/release() API spi: offload: check offload ops existence before disabling the trigger spi: spi-pci1xxxx: Fix error code in probe spi: loongson: Fix build warnings about export.h spi: omap2-mcspi: Disable multi-mode when the previous message kept CS asserted spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after message
2025-06-13Merge tag 'regulator-fix-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One minor fix for a leak in the DT parsing code in the max20086 driver" * tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: max20086: Fix refcount leak in max20086_parse_regulators_dt()
2025-06-13posix-cpu-timers: fix race between handle_posix_cpu_timers() and ↵Oleg Nesterov
posix_cpu_timer_del() If an exiting non-autoreaping task has already passed exit_notify() and calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent or debugger right after unlock_task_sighand(). If a concurrent posix_cpu_timer_del() runs at that moment, it won't be able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or lock_task_sighand() will fail. Add the tsk->exit_state check into run_posix_cpu_timers() to fix this. This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because exit_task_work() is called before exit_notify(). But the check still makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail anyway in this case. Cc: stable@vger.kernel.org Reported-by: Benoît Sevens <bsevens@google.com> Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-13Merge tag 'trace-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: - Do not free "head" variable in filter_free_subsystem_filters() The first error path jumps to "free_now" label but first frees the newly allocated "head" variable. But the "free_now" code checks this variable, and if it is not NULL, it will iterate the list. As this list variable was already initialized, the "free_now" code will not do anything as it is empty. But freeing it will cause a UAF bug. The error path should simply jump to the "free_now" label and leave the "head" variable alone. * tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Do not free "head" on error path of filter_free_subsystem_filters()
2025-06-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Rework of system register accessors for system registers that are directly writen to memory, so that sanitisation of the in-memory value happens at the correct time (after the read, or before the write). For convenience, RMW-style accessors are also provided. - Multiple fixes for the so-called "arch-timer-edge-cases' selftest, which was always broken. x86: - Make KVM_PRE_FAULT_MEMORY stricter for TDX, allowing userspace to pass only the "untouched" addresses and flipping the shared/private bit in the implementation. - Disable SEV-SNP support on initialization failure * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORY KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORY KVM: SEV: Disable SEV-SNP support on initialization failure KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases KVM: arm64: selftests: Fix help text for arch_timer_edge_cases KVM: arm64: Make __vcpu_sys_reg() a pure rvalue operand KVM: arm64: Don't use __vcpu_sys_reg() to get the address of a sysreg KVM: arm64: Add RMW specific sysreg accessor KVM: arm64: Add assignment-specific sysreg accessor
2025-06-13io_uring/kbuf: don't truncate end buffer for multiple buffer peeksJens Axboe
If peeking a bunch of buffers, normally io_ring_buffers_peek() will truncate the end buffer. This isn't optimal as presumably more data will be arriving later, and hence it's better to stop with the last full buffer rather than truncate the end buffer. Cc: stable@vger.kernel.org Fixes: 35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers") Reported-by: Christian Mazakas <christian.mazakas@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'v6.16-p4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a broken self-test in hkdf (new regression)" * tag 'v6.16-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hkdf - move to late_initcall
2025-06-13Merge tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "As usual, highlighting the ones users have been noticing: - Fix a small issue with has_case_insensitive not being propagated on snapshot creation; this led to fsck errors, which we're harmless because we're not using this flag yet (it's for overlayfs + casefolding). - Log the error being corrected in the journal when we're doing fsck repair: this was one of the "lessons learned" from the i_nlink 0 -> subvolume deletion bug, where reconstructing what had happened by analyzing the journal was a bit more difficult than it needed to be. - Don't schedule btree node scan to run in the superblock: this fixes a regression from the 6.16 recovery passes rework, and let to it running unnecessarily. The real issue here is that we don't have online, "self healing" style topology repair yet: topology repair currently has to run before we go RW, which means that we may schedule it unnecessarily after a transient error. This will be fixed in the future. - We now track, in btree node flags, the reason it was scheduled to be rewritten. We discovered a deadlock in recovery when many btree nodes need to be rewritten because they're degraded: fully fixing this will take some work but it's now easier to see what's going on. For the bug report where this came up, a device had been kicked RO due to transient errors: manually setting it back to RW was sufficient to allow recovery to succeed. - Mark a few more fsck errors as autofix: as a reminder to users, please do keep reporting cases where something needs to be repaired and is not repaired automatically (i.e. cases where -o fix_errors or fsck -y is required). - rcu_pending.c now works with PREEMPT_RT - 'bcachefs device add', then umount, then remount wasn't working - we now emit a uevent so that the new device's new superblock is correctly picked up - Assorted repair fixes: btree node scan will no longer incorrectly update sb->version_min, - Assorted syzbot fixes" * tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefs: (23 commits) bcachefs: Don't trace should_be_locked unless changing bcachefs: Ensure that snapshot creation propagates has_case_insensitive bcachefs: Print devices we're mounting on multi device filesystems bcachefs: Don't trust sb->nr_devices in members_to_text() bcachefs: Fix version checks in validate_bset() bcachefs: ioctl: avoid stack overflow warning bcachefs: Don't pass trans to fsck_err() in gc_accounting_done bcachefs: Fix leak in bch2_fs_recovery() error path bcachefs: Fix rcu_pending for PREEMPT_RT bcachefs: Fix downgrade_table_extra() bcachefs: Don't put rhashtable on stack bcachefs: Make sure opts.read_only gets propagated back to VFS bcachefs: Fix possible console lock involved deadlock bcachefs: mark more errors autofix bcachefs: Don't persistently run scan_for_btree_nodes bcachefs: Read error message now prints if self healing bcachefs: Only run 'increase_depth' for keys from btree node csan bcachefs: Mark need_discard_freespace_key_bad autofix bcachefs: Update /dev/disk/by-uuid on device add bcachefs: Add more flags to btree nodes for rewrite reason ...
2025-06-13Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublistsBagas Sanjaya
Stephen Rothwell reports htmldocs warning on ublk docs: Documentation/block/ublk.rst:414: ERROR: Unexpected indentation. [docutils] Fix the warning by separating sublists of auto buffer registration fallback behavior from their appropriate parent list item. Fixes: ff20c516485e ("ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250612132638.193de386@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250613023857.15971-1-bagasdotme@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13iommu/tegra: Fix incorrect size calculationJason Gunthorpe
This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(*pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32* in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-13block: Fix bvec_set_folio() for very large foliosMatthew Wilcox (Oracle)
Similarly to 26064d3e2b4d ("block: fix adding folio to bio"), if we attempt to add a folio that is larger than 4GB, we'll silently truncate the offset and len. Widen the parameters to size_t, assert that the length is less than 4GB and set the first page that contains the interesting data rather than the first page of the folio. Fixes: 26db5ee15851 (block: add a bvec_set_folio helper) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAPMatthew Wilcox (Oracle)
It is possible for physically contiguous folios to have discontiguous struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not. This is correctly handled by folio_page_idx(), so remove this open-coded implementation. Fixes: 640d1930bef4 (block: Add bio_for_each_folio_all()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>