summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2025-07-26umd: Remove usermode driver frameworkThomas Weißschuh
The code is unused since 98e20e5e13d2 ("bpfilter: remove bpfilter"), therefore remove it. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/bpf/20250721-remove-usermode-driver-v1-2-0d0083334382@linutronix.de
2025-07-26bpf/preload: Don't select USERMODE_DRIVERThomas Weißschuh
The usermode driver framework is not used anymore by the BPF preload code. Fixes: cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton.") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/bpf/20250721-remove-usermode-driver-v1-1-0d0083334382@linutronix.de
2025-07-25rv: Remove struct rv_monitor::reactingNam Cao
The field 'reacting' in struct rv_monitor is set but never used. Delete it. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/a6c16f845d2f1a09c4d0934ab83f3cb14478a71d.1753378331.git.namcao@linutronix.de Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-25rv: Remove rv_reactor's reference counterNam Cao
rv_reactor has a reference counter to ensure it is not removed while monitors are still using it. However, this is futile, as __exit functions are not expected to fail and will proceed normally despite rv_unregister_reactor() returning an error. At the moment, reactors do not support being built as modules, therefore they are never removed and the reference counters are not necessary. If we support building RV reactors as modules in the future, kernel module's centralized facilities such as try_module_get(), module_put() or MODULE_SOFTDEP should be used instead of this custom implementation. Remove this reference counter. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/bb946398436a5e17fb0f5b842ef3313c02291852.1753378331.git.namcao@linutronix.de Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-25rv: Merge struct rv_reactor_def into struct rv_reactorNam Cao
Each struct rv_reactor has a unique struct rv_reactor_def associated with it. struct rv_reactor is statically allocated, while struct rv_reactor_def is dynamically allocated. This makes the code more complicated than it should be: - Lookup is required to get the associated rv_reactor_def from rv_reactor - Dynamic memory allocation is required for rv_reactor_def. This is harder to get right compared to static memory. For instance, there is an existing mistake: rv_unregister_reactor() does not free the memory allocated by rv_register_reactor(). This is fortunately not a real memory leak problem as rv_unregister_reactor() is never called. Simplify and merge rv_reactor_def into rv_reactor. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/71cb91c86cd40df5b8c492b788787f2a73c3eaa3.1753378331.git.namcao@linutronix.de Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-25rv: Merge struct rv_monitor_def into struct rv_monitorNam Cao
Each struct rv_monitor has a unique struct rv_monitor_def associated with it. struct rv_monitor is statically allocated, while struct rv_monitor_def is dynamically allocated. This makes the code more complicated than it should be: - Lookup is required to get the associated rv_monitor_def from rv_monitor - Dynamic memory allocation is required for rv_monitor_def. This is harder to get right compared to static memory. For instance, there is an existing mistake: rv_unregister_monitor() does not free the memory allocated by rv_register_monitor(). This is fortunately not a real memory leak problem, as rv_unregister_monitor() is never called. Simplify and merge rv_monitor_def into rv_monitor. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/194449c00f87945c207aab4c96920c75796a4f53.1753378331.git.namcao@linutronix.de Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-25rv: Remove unused field in struct rv_monitor_defNam Cao
rv_monitor_def::task_monitor is not used. Delete it. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/502d94f2696435690a2b1fdbe80a9e56c96fcabf.1753378331.git.namcao@linutronix.de Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24Merge tag 'mm-hotfixes-stable-2025-07-24-18-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 9 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. 7 are for MM" * tag 'mm-hotfixes-stable-2025-07-24-18-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: sprintf.h requires stdarg.h resource: fix false warning in __request_region() mm/damon/core: commit damos_quota_goal->nid kasan: use vmalloc_dump_obj() for vmalloc error reports mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show() mm: update MAINTAINERS entry for HMM nilfs2: reject invalid file types when reading inodes selftests/mm: fix split_huge_page_test for folio_split() tests mailmap: add entry for Senozhatsky mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
2025-07-24net: Create separate gro_flush_normal functionSamiullah Khawaja
Move multiple copies of same code snippet doing `gro_flush` and `gro_normal_list` into separate helper function. Signed-off-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250723013031.2911384-2-skhawaja@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24tracing: Call trace_ftrace_test_filter() for the eventSteven Rostedt
The trace event filter bootup self test tests a bunch of filter logic against the ftrace_test_filter event, but does not actually call the event. Work is being done to cause a warning if an event is defined but not used. To quiet the warning call the trace event under an if statement where it is disabled so it doesn't get optimized out. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicolas Schier <nicolas.schier@linux.dev> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20250723194212.274458858@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-07-24 We've added 3 non-merge commits during the last 3 day(s) which contain a total of 4 files changed, 40 insertions(+), 15 deletions(-). The main changes are: 1) Improved verifier error message for incorrect narrower load from pointer field in ctx, from Paul Chaignon. 2) Disabled migration in nf_hook_run_bpf to address a syzbot report, from Kuniyuki Iwashima. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Test invalid narrower ctx load bpf: Reject narrower access to pointer ctx fields bpf: Disable migration in nf_hook_run_bpf(). ==================== Link: https://patch.msgid.link/20250724173306.3578483-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24resource: fix false warning in __request_region()Akinobu Mita
A warning is raised when __request_region() detects a conflict with a resource whose resource.desc is IORES_DESC_DEVICE_PRIVATE_MEMORY. But this warning is only valid for iomem_resources. The hmem device resource uses resource.desc as the numa node id, which can cause spurious warnings. This warning appeared on a machine with multiple cxl memory expanders. One of the NUMA node id is 6, which is the same as the value of IORES_DESC_DEVICE_PRIVATE_MEMORY. In this environment it was just a spurious warning, but when I saw the warning I suspected a real problem so it's better to fix it. This change fixes this by restricting the warning to only iomem_resource. This also adds a missing new line to the warning message. Link: https://lkml.kernel.org/r/20250719112604.25500-1-akinobu.mita@gmail.com Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-24x86: Handle KCOV __init vs inline mismatchesKees Cook
GCC appears to have kind of fragile inlining heuristics, in the sense that it can change whether or not it inlines something based on optimizations. It looks like the kcov instrumentation being added (or in this case, removed) from a function changes the optimization results, and some functions marked "inline" are _not_ inlined. In that case, we end up with __init code calling a function not marked __init, and we get the build warnings I'm trying to eliminate in the coming patch that adds __no_sanitize_coverage to __init functions: WARNING: modpost: vmlinux: section mismatch in reference: xbc_exit+0x8 (section: .text.unlikely) -> _xbc_exit (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: real_mode_size_needed+0x15 (section: .text.unlikely) -> real_mode_blob_end (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: __set_percpu_decrypted+0x16 (section: .text.unlikely) -> early_set_memory_decrypted (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: memblock_alloc_from+0x26 (section: .text.unlikely) -> memblock_alloc_try_nid (section: .init.text) WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_set_root_pointer+0xc (section: .text.unlikely) -> x86_init (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_get_root_pointer+0x8 (section: .text.unlikely) -> x86_init (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: efi_config_table_is_usable+0x16 (section: .text.unlikely) -> xen_efi_config_table_is_usable (section: .init.text) This problem is somewhat fragile (though using either __always_inline or __init will deterministically solve it), but we've tripped over this before with GCC and the solution has usually been to just use __always_inline and move on. For x86 this means forcing several functions to be inline with __always_inline. Link: https://lore.kernel.org/r/20250724055029.3623499-2-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc8). Conflicts: drivers/net/ethernet/microsoft/mana/gdma_main.c 9669ddda18fb ("net: mana: Fix warnings for missing export.h header inclusion") 755391121038 ("net: mana: Allocate MSI-X vectors dynamically") https://lore.kernel.org/20250711130752.23023d98@canb.auug.org.au Adjacent changes: drivers/net/ethernet/ti/icssg/icssg_prueth.h 6e86fb73de0f ("net: ti: icssg-prueth: Fix buffer allocation for ICSSG") ffe8a4909176 ("net: ti: icssg-prueth: Read firmware-names from device tree") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24rv: Return init error when registering monitorsGabriele Monaco
Monitors generated with dot2k have their registration function (the one called during monitor initialisation) return always 0, even if the registration failed on RV side. This can hide potential errors. Return the value returned by the RV register function. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Juri Lelli <jlelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Link: https://lore.kernel.org/20250723161240.194860-6-gmonaco@redhat.com Reviewed-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24verification/rvgen: Organise Kconfig entries for nested monitorsGabriele Monaco
The current behaviour of rvgen when running with the -a option is to append the necessary lines at the end of the configuration for Kconfig, Makefile and tracepoints. This is not always the desired behaviour in case of nested monitors: while tracepoints are not affected by nesting and the Makefile's only requirement is that the parent monitor is built before its children, in the Kconfig it is better to have children defined right after their parent, otherwise the result has wrong indentation: [*] foo_parent monitor [*] foo_child1 monitor [*] foo_child2 monitor [*] bar_parent monitor [*] bar_child1 monitor [*] bar_child2 monitor [*] foo_child3 monitor [*] foo_child4 monitor Adapt rvgen to look for a different marker for nested monitors in the Kconfig file and append the line right after the last sibling, instead of the last monitor. Also add the marker when creating a new parent monitor. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Juri Lelli <jlelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Link: https://lore.kernel.org/20250723161240.194860-5-gmonaco@redhat.com Reviewed-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tools/dot2c: Fix generated files going over 100 column limitGabriele Monaco
The dot2c.py script generates all states in a single line. This breaks the 100 column limit when the state machines are non-trivial. Change dot2c.py to generate the states in separate lines in case the generated line is going to be too long. Also adapt existing monitors with line length over the limit. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Juri Lelli <jlelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Link: https://lore.kernel.org/20250723161240.194860-4-gmonaco@redhat.com Suggested-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: Have eprobes handle arraysSteven Rostedt
eprobes are dynamic events that can read other events using their fields to create new events. Currently it doesn't work with arrays. When the new event field is attached to the old event field, it looks at the size of the field to determine what type of field the new field should be. For 1 byte fields it's a char, for 2 bytes, it's a short and for 4 bytes it's an integer. For all other sizes it just defaults to "long". This also reads the contents of the field for such cases. For arrays that are bigger than the size of long, return the value of the address of the content itself. This will allow eprobes to read other values in the array of the old event. This is useful when raw_syscalls is enabled but the syscall events are not. The syscall events are created from the raw_syscalls as they have an array of "args" that holds the 6 long words passed to the syscall entry point. To read the value of "filename" from sys_openat, the eprobe could attach to the raw_syscall and read the second value. It can then even be passed to a synthetic event and converted back to another eprobe to get the value of "filename" after it has been read by the kernel during the system call: [ Create an eprobe called "sys" and attach it to sys_enter. Read the id of the system call and the second argument ] # echo 'e:sys raw_syscalls.sys_enter nr=$id:u32 arg2=+8($args):u64' >> /sys/kernel/tracing/dynamic_events [ Create a synthetic event "path" that will hold the address of the sys_openat filename. This is on a 64bit machine, so make it 64 bits ] # echo 's:path u64 file;' >> /sys/kernel/tracing/dynamic_events [ Add a histogram to the eprobe/sys which tiggers if the "nr" field is 257 (sys_openat), and save the filename in the "file" variable. ] # echo 'hist:keys=common_pid:file=arg2 if nr == 257' > /sys/kernel/tracing/events/eprobes/sys/trigger [ Attach a histogram to sys_exit event that triggers the "path" synthetic event and records the "filename" that was passed from the sys eprobe. ] # echo 'hist:keys=common_pid:f=$file:onmatch(eprobes.sys).trace(path,$f)' >> /sys/kernel/tracing/events/raw_syscalls/sys_exit/trigger [ Create another eprobe that dereferences the "file" field as a user space string and displays it. ] # echo 'e:open synthetic.path file=+0($file):ustring' >> /sys/kernel/tracing/dynamic_events # echo 1 > /sys/kernel/tracing/events/eprobes/open/enable # cat trace_pipe cat-1142 [003] ...5. 799.521912: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.521934: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522065: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522080: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522296: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" cat-1142 [003] ...5. 799.522319: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" less-1143 [005] ...5. 799.522327: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522333: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" cat-1142 [003] ...5. 799.522348: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" less-1143 [005] ...5. 799.522349: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522363: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" less-1143 [005] ...5. 799.522477: open: (synthetic.path) file="/etc/ld.so.cache" cat-1142 [003] ...5. 799.522489: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" less-1143 [005] ...5. 799.522492: open: (synthetic.path) file="/etc/ld.so.cache" less-1143 [005] ...5. 799.522720: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libtinfo.so.6" less-1143 [005] ...5. 799.522744: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libtinfo.so.6" less-1143 [005] ...5. 799.522759: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libtinfo.so.6" cat-1142 [003] ...5. 799.522850: open: (synthetic.path) file="/lib/x86_64-linux-gnu/libc.so.6" Link: https://lore.kernel.org/all/20250723124202.4f7475be@batman.local.home/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-07-23bpf: Reject narrower access to pointer ctx fieldsPaul Chaignon
The following BPF program, simplified from a syzkaller repro, causes a kernel warning: r0 = *(u8 *)(r1 + 169); exit; With pointer field sk being at offset 168 in __sk_buff. This access is detected as a narrower read in bpf_skb_is_valid_access because it doesn't match offsetof(struct __sk_buff, sk). It is therefore allowed and later proceeds to bpf_convert_ctx_access. Note that for the "is_narrower_load" case in the convert_ctx_accesses(), the insn->off is aligned, so the cnt may not be 0 because it matches the offsetof(struct __sk_buff, sk) in the bpf_convert_ctx_access. However, the target_size stays 0 and the verifier errors with a kernel warning: verifier bug: error during ctx access conversion(1) This patch fixes that to return a proper "invalid bpf_context access off=X size=Y" error on the load instruction. The same issue affects multiple other fields in context structures that allow narrow access. Some other non-affected fields (for sk_msg, sk_lookup, and sockopt) were also changed to use bpf_ctx_range_ptr for consistency. Note this syzkaller crash was reported in the "Closes" link below, which used to be about a different bug, fixed in commit fce7bd8e385a ("bpf/verifier: Handle BPF_LOAD_ACQ instructions in insn_def_regno()"). Because syzbot somehow confused the two bugs, the new crash and repro didn't get reported to the mailing list. Fixes: f96da09473b52 ("bpf: simplify narrower ctx access") Fixes: 0df1a55afa832 ("bpf: Warn on internal verifier errors") Reported-by: syzbot+0ef84a7bdf5301d4cbec@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0ef84a7bdf5301d4cbec Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/3b8dcee67ff4296903351a974ddd9c4dca768b64.1753194596.git.paul.chaignon@gmail.com
2025-07-23tracing: arm: arm64: Hide trace events ipi_raise, ipi_entry and ipi_exitSteven Rostedt
The ipi tracepoints are mostly generic, but the tracepoints ipi_raise, ipi_entry and ipi_exit are only used by arm and arm64. This means these trace events are wasting memory in all the other architectures that do not use them. Add CONFIG_HAVE_EXTRA_IPI_TRACEPOINTS and have arm and arm64 select it to enable these trace events. The config makes it easy if other architectures decide to trace these as well. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Will Deacon <will@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/20250722103714.64eba013@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2025-07-23Merge branches 'rcu-exp.23.07.2025', 'rcu.22.07.2025', ↵Neeraj Upadhyay (AMD)
'torture-scripts.16.07.2025', 'srcu.19.07.2025', 'rcu.nocb.18.07.2025' and 'refscale.07.07.2025' into rcu.merge.23.07.2025
2025-07-24tracing: probes: Add a kerneldoc for traceprobe_parse_event_name()Masami Hiramatsu (Google)
Since traceprobe_parse_event_name() is a bit complicated, add a kerneldoc for explaining the behavior. Link: https://lore.kernel.org/all/175323430565.57270.2602609519355112748.stgit@devnote2/ Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: uprobe-event: Allocate string buffers from heapMasami Hiramatsu (Google)
Allocate temporary string buffers for parsing uprobe-events from heap instead of stack. Link: https://lore.kernel.org/all/175323429593.57270.12369235525923902341.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: eprobe-event: Allocate string buffers from heapMasami Hiramatsu (Google)
Allocate temporary string buffers for parsing eprobe-events from heap instead of stack. Link: https://lore.kernel.org/all/175323428599.57270.988038042425748956.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: kprobe-event: Allocate string buffers from heapMasami Hiramatsu (Google)
Allocate temporary string buffers for parsing kprobe-events from heap instead of stack. Link: https://lore.kernel.org/all/175323427627.57270.5105357260879695051.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: fprobe-event: Allocate string buffers from heapMasami Hiramatsu (Google)
Allocate temporary string buffers for fprobe-event from heap instead of stack. This fixes the stack frame exceed limit error. Link: https://lore.kernel.org/all/175323426643.57270.6657152008331160704.stgit@devnote2/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506240416.nZIhDXoO-lkp@intel.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: probe: Allocate traceprobe_parse_context from heapMasami Hiramatsu (Google)
Instead of allocating traceprobe_parse_context on stack, allocate it dynamically from heap (slab). This change is likely intended to prevent potential stack overflow issues, which can be a concern in the kernel environment where stack space is limited. Link: https://lore.kernel.org/all/175323425650.57270.280750740753792504.stgit@devnote2/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506240416.nZIhDXoO-lkp@intel.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24tracing: probes: Sort #include alphabeticallyMasami Hiramatsu (Google)
Sort the #include directives in trace_probe* files alphabetically for easier maintenance and avoid double includes. This also groups headers as linux-generic, asm-generic, and local headers. Link: https://lore.kernel.org/all/175323424678.57270.11975372127870059007.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2025-07-23tracing: Deprecate auto-mounting tracefs in debugfsSteven Rostedt
In January 2015, tracefs was created to allow access to the tracing infrastructure without needing to compile in debugfs. When tracefs is configured, the directory /sys/kernel/tracing will exist and tooling is expected to use that path to access the tracing infrastructure. To allow backward compatibility, when debugfs is mounted, it would automount tracefs in its "tracing" directory so that tooling that had hard coded /sys/kernel/debug/tracing would still work. It has been over 10 years since the new interface was introduced, and all tooling should now be using it. Start the process of deprecating the old path so that it doesn't need to be maintained anymore. A new config is added to allow distributions to disable automounting of tracefs on debugfs. If /sys/kernel/debug/tracing is accessed, a pr_warn() will trigger stating: "NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030" Expect to remove this feature in 5 years (2030). Cc: <linux-trace-users@vger.kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/20250722170806.40c068c6@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-23sysctl: rename kern_table -> sysctl_subsys_tableJoel Granados
Renamed sysctl table from kern_table to sysctl_subsys_table and grouped the two arch specific ctls to the end of the array. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.cJoel Granados
Moved ctl_tables elements for overflowuid and overflowgid into in kernel/sys.c. Create a register function that keeps them under "kernel" and run it after core with postcore_initcall. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23uevent: mv uevent_helper into kobject_uevent.cJoel Granados
Move both uevent_helper table into lib/kobject_uevent.c. Place the registration early in the initcall order with postcore_initcall. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23sysctl: Remove superfluous includes from kernel/sysctl.cJoel Granados
Remove the following headers from the include list in sysctl.c. * These are removed as the related variables are no longer there. =================== ==================== Include Related Var =================== ==================== linux/kmod.h usermodehelper asm/nmi.h nmi_watchdoc_enabled asm/io.h io_delay_type linux/pid.h pid_max_{,min,max} linux/sched/sysctl.h sysctl_{sched_*,numa_*,timer_*} linux/mount.h sysctl_mount_max linux/reboot.h poweroff_cmd linux/ratelimit.h {,printk_}ratelimit_state linux/printk.h kptr_restrict linux/security.h CONFIG_SECURITY_CAPABILITIES linux/net.h net_table linux/key.h key_sysctls linux/nvs_fs.h acpi_video_flags linux/acpi.h acpi_video_flags linux/fs.h proc_nr_files * These are no longer needed as intermediate includes ============== Include ============== linux/filter.h linux/binfmts.h Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23sysctl: Remove (very) old file changelogJoel Granados
These comments are older than 2003 and therefore do not bare any relevance on the current state of the sysctl.c file. Remove them as they confuse more than clarify. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.cJoel Granados
This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23sysctl: move cad_pid into kernel/pid.cJoel Granados
Move cad_pid as well as supporting function proc_do_cad_pid into kernel/pic.c. Replaced call to __do_proc_dointvec with proc_dointvec inside proc_do_cad_pid which requires the copy of the ctl_table to handle the temp value. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23sysctl: Move tainted ctl_table into kernel/panic.cJoel Granados
Move the ctl_table with the "tainted" proc_name into kernel/panic.c. With it moves the proc_tainted helper function. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23Input: sysrq: mv sysrq into drivers/tty/sysrq.cJoel Granados
Move both sysrq ctl_table and supported sysrq_sysctl_handler helper function into drivers/tty/sysrq.c. Replaced the __do_proc_dointvec in helper function with do_proc_dointvec_minmax as the former is local to kernel/sysctl.c. Here we use the minmax version of do_proc_dointvec because do_proc_dointvec is static and calling do_proc_dointvec_minmax with a NULL min and max is the same as calling do_proc_dointvec. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23fork: mv threads-max into kernel/fork.cJoel Granados
make sysctl_max_threads static as it no longer needs to be exported into sysctl.c. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23parisc/power: Move soft-power into power.cJoel Granados
Move the soft-power ctl table into parisc/power.c. As a consequence the pwrsw_enabled var is made static. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23mm: move randomize_va_space into memory.cJoel Granados
Move the randomize_va_space variable together with all its sysctl table elements into memory.c. Register it to the "kernel" directory by adding it to the subsys initialization calls This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23rcu: Move rcu_stall related sysctls into rcu/tree_stall.hJoel Granados
Move sysctl_panic_on_rcu_stall and sysctl_max_rcu_stall_to_panic into the kernel/rcu subdirectory. Make these static in tree_stall.h and removed them as extern from panic.h as their scope is now confined into one file. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23locking/rtmutex: Move max_lock_depth into rtmutex.cJoel Granados
Move the max_lock_depth sysctl table element into rtmutex_api.c. Removed the rtmutex.h include from sysctl.c. Chose to move into rtmutex_api.c to avoid multiple registrations every time rtmutex.c is included in other files. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23module: Move modprobe_path and modules_disabled ctl_tables into the module ↵Joel Granados
subsys Move module sysctl (modprobe_path and modules_disabled) out of sysctl.c and into the modules subsystem. Make modules_disabled static as it no longer needs to be exported. Remove module.h from the includes in sysctl as it no longer uses any module exported variables. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-23kcsan: test: Initialize dummy variableMarco Elver
Newer compiler versions rightfully point out: kernel/kcsan/kcsan_test.c:591:41: error: variable 'dummy' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer] 591 | KCSAN_EXPECT_READ_BARRIER(atomic_read(&dummy), false); | ^~~~~ 1 error generated. Although this particular test does not care about the value stored in the dummy atomic variable, let's silence the warning. Link: https://lkml.kernel.org/r/CA+G9fYu8JY=k-r0hnBRSkQQrFJ1Bz+ShdXNwC1TNeMt0eXaxeA@mail.gmail.com Fixes: 8bc32b348178 ("kcsan: test: Add test cases for memory barrier instrumentation") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marco Elver <elver@google.com>
2025-07-22tracing: Fix comment in trace_module_remove_events()Steven Rostedt
Fix typo "allocade" -> "allocated". Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250710095628.42ed6b06@batman.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-22tracing: Remove redundant config HAVE_FTRACE_MCOUNT_RECORDSteven Rostedt
Ftrace is tightly coupled with architecture specific code because it requires the use of trampolines written in assembly. This means that when a new feature or optimization is made, it must be done for all architectures. To simplify the approach, CONFIG_HAVE_FTRACE_* configs are added to denote which architecture has the new enhancement so that other architectures can still function until they too have been updated. The CONFIG_HAVE_FTRACE_MCOUNT was added to help simplify the DYNAMIC_FTRACE work, but now every architecture that implements DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT set too, making it redundant with the HAVE_DYNAMIC_FTRACE. Remove the HAVE_FTRACE_MCOUNT config and use DYNAMIC_FTRACE directly where applicable. Link: https://lore.kernel.org/all/20250703154916.48e3ada7@gandalf.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20250704104838.27a18690@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-22tracing: Remove EVENT_FILE_FL_SOFT_MODE flagSteven Rostedt
When soft disabling of trace events was first created, it needed to have a way to know if a file had a user that was using it with soft disabled (for triggers that need to enable or disable events from a context that can not really enable or disable the event, it would set SOFT_DISABLED to state it is disabled). The flag SOFT_MODE was used to denote that an event had a user that would enable or disable it via the SOFT_DISABLED flag. Commit 1cf4c0732db3c ("tracing: Modify soft-mode only if there's no other referrer") fixed a bug where if two users were using the SOFT_DISABLED flag the accounting would get messed up as the SOFT_MODE flag could only handle one user. That commit added the sm_ref counter which kept track of how many users were using the event in "soft mode". This made the SOFT_MODE flag redundant as it should only be set if the sm_ref counter is non zero. Remove the SOFT_MODE flag and just use the sm_ref counter to know the event is in soft mode or not. This makes the code a bit simpler. Link: https://lore.kernel.org/all/20250702111908.03759998@batman.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Link: https://lore.kernel.org/20250702143657.18dd1882@batman.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-22tracing: Remove pointless memory barriersNam Cao
Memory barriers are useful to ensure memory accesses from one CPU appear in the original order as seen by other CPUs. Some smp_rmb() and smp_wmb() are used, but they are not ordering multiple memory accesses. Remove them. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626151940.1756398-1-namcao@linutronix.de Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-22ftrace: Make DYNAMIC_FTRACE always enabled for architectures that support itSteven Rostedt
ftrace has two flavors: 1) static: Where every function always calls the ftrace trampoline 2) dynamic: Where each function has nops that can be changed on demand to jump to the ftrace trampoline when needed. The static flavor has very high performance overhead and was only created to make it easier for architectures to implement the dynamic flavor. An architecture developer can first implement the static ftrace to make sure the trampolines work before working on the more complicated dynamic aspect of ftrace. Once the architecture can support dynamic ftrace, there's no reason to continue to support the static flavor. In fact, the static flavor tends to bitrot and bugs start to appear in them. Remove the prompt to pick DYNAMIC_FTRACE and simply enable it if the architecture supports it. Link: https://lore.kernel.org/all/f7e12c6d-892e-4ca3-9ef0-fbb524d04a48@ghiti.fr/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: ChenMiao <chenmiao.ku@gmail.com> Link: https://lore.kernel.org/20250703115222.2d7c8cd5@batman.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>