linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-02-26	selftests/x86/xstate: Clarify supported xstates	Chang S. Bae
	The established xstate test code is designed to be generic, but certain xstates require special handling and cannot be tested without additional adjustments. Clarify which xstates are currently supported, and enforce testing only for them. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-9-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Consolidate test invocations into a single entry	Chang S. Bae
	Currently, each of the three xstate tests runs as a separate invocation, requiring the xstate number to be passed and state information to be reconstructed repeatedly. This approach arose from their individual and isolated development, but now it makes sense to unify them. Introduce a wrapper function that first verifies feature availability from the kernel and constructs the necessary state information once. The wrapper then sequentially invokes all tests to ensure consistent execution. Update the AMX test to use this unified invocation. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-8-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Introduce signal ABI test	Chang S. Bae
	With the refactored test cases, another xstate exposure to userspace is through signal delivery. While amx.c includes signal-related scenarios, its primary focus is on xstate permission management, which is largely specific to dynamic states. The remaining gap is testing xstate preservation and restoration across signal delivery. The kernel defines an ABI for presenting xstate in the signal frame, closely resembling the hardware XSAVE format, where xstate modification is also possible. Introduce a new test case to verify xstate preservation across signal delivery and return, that is ensuring ABI compatibility by: - Loading xstate before raising a signal. - Verifying correct exposure in the signal frame - Modifying xstate in the signal frame before returning. - Checking the state restoration upon signal return. Integrate this test into the AMX test suite as an initial usage site. Expected output: $ amx_64 ... [RUN] AMX Tile data: load xstate and raise SIGUSR1 [OK] 'magic1' is valid [OK] 'xfeatures' in SW reserved area is valid [OK] 'xfeatures' in XSAVE header is valid [OK] xstate delivery was successful [OK] 'magic2' is valid [RUN] AMX Tile data: load new xstate from sighandler and check it after sigreturn [OK] xstate was restored correctly Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-7-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Refactor ptrace ABI test	Chang S. Bae
	Following the refactoring of the context switching test, the ptrace test is another component reusable for other xstate features. As part of this restructuring, add a missing check to validate the user_xstateregs->xstate_fx_sw field in the ABI. Also, replace err() and fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: inject xstate via ptrace(). [OK] 'xfeatures' in SW reserved area was correctly written [OK] xstate was correctly updated. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-6-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Refactor context switching test	Chang S. Bae
	The existing context switching and ptrace tests in amx.c are not specific to dynamic states, making them reusable for general xstate testing. As a first step, move the context switching test to xstate.c. Refactor the test code to allow specifying which xstate component being tested. To decouple the test from dynamic states, remove the permission request code. In fact, The permission request inside the test wrapper was redundant. Additionally, replace fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: check context switches, 10 iterations, 5 threads. [OK] No incorrect case was found. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-5-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Enumerate and name xstate components	Chang S. Bae
	After moving essential helpers from amx.c, the code remains neutral regarding which xstate components it handles. However, explicitly listing known components helps users identify which features are ready for testing. Enumerate xstate components to facilitate identification. Extend struct xstate_info to include a name field, providing a human-readable identifier. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-4-chang.seok.bae@intel.com
2025-02-26	selftests/x86/xstate: Refactor XSAVE helpers for general use	Chang S. Bae
	The AMX test introduced several XSAVE-related helper functions, but so far, it has been the only user of them. These helpers can be generalized for broader test of multiple xstate features. Move most XSAVE-related code into xsave.h, making it shareable. The restructuring includes: * Establishing low-level XSAVE helpers for saving and restoring register states, as well as handling XSAVE buffers. * Generalizing state data manipuldations: set_rand_data() * Introducing a generic feature query helper: get_xstate_info() While doing so, remove unused defines in amx.c. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-3-chang.seok.bae@intel.com
2025-02-26	selftests/x86: Consolidate redundant signal helper functions	Chang S. Bae
	The x86 selftests frequently register and clean up signal handlers, but the sethandler() and clearhandler() functions have been redundantly copied across multiple .c files. Move these functions to helpers.h to enable reuse across tests, eliminating around 250 lines of duplicate code. Converge the error handling by using ksft_exit_fail_msg(), which is functionally equivalent with err() within the selftest framework. This change is a prerequisite for the upcoming xstate selftest, which requires signal handling for registering and cleaning up handlers. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-2-chang.seok.bae@intel.com
2025-02-26	KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR	Sebastian Ott
	Assert that MIDR_EL1, REVIDR_EL1, AIDR_EL1 are writable from userspace, that the changed values are visible to guests, and that they are preserved across a vCPU reset. Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20250225005401.679536-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-02-25	selftests/bpf: Test gen_pro/epilogue that generate kfuncs	Amery Hung
	Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program. The main bpf program and return value checks are identical to pro_epilogue.c introduced in commit 47e69431b57a ("selftests/bpf: Test gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops detects a program name with prefix "test_kfunc_", it generates slightly different prologue and epilogue: They still add 1000 to args->a in prologue, add 10000 to args->a and set r0 to 2 * args->a in epilogue, but involve kfuncs. At high level, the alternative version of prologue and epilogue look like this: cgrp = bpf_cgroup_from_id(0); if (cgrp) bpf_cgroup_release(cgrp); else /* Perform what original bpf_testmod_st_ops prologue or * epilogue does */ Since 0 is never a valid cgroup id, the original prologue or epilogue logic will be performed. As a result, the __retval check should expect the exact same return value. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250225233545.285481-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-25	selftests: drv-net-hw: Add a test for symmetric RSS hash	Gal Pressman
	Add a selftest that verifies symmetric RSS hash is working as intended. The test runs iterations of traffic, swapping the src/dst UDP ports, and verifies that the same RX queue is receiving the traffic in both cases. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-5-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25	selftests: drv-net: Make rand_port() get a port more reliably	Gal Pressman
	Instead of guessing a port and checking whether it's available, get an available port from the OS. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25	selftests/net: ensure mptcp is enabled in netns	Hangbin Liu
	Some distributions may not enable MPTCP by default. All other MPTCP tests source mptcp_lib.sh to ensure MPTCP is enabled before testing. However, the ip_local_port_range test is the only one that does not include this step. Let's also ensure MPTCP is enabled in netns for ip_local_port_range so that it passes on all distributions. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224094013.13159-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25	Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure
2025-02-25	tools/sched_ext: Provide consistent access to scx flags	Andrea Righi
	Make all the SCX_OPS_* and SCX_PICK_IDLE_* flags available to the user-space part of the schedulers via the compat interface. This allows schedulers / selftests to set all the ops flags in user-space, rather than having them split between BPF and user-space. Signed-off-by: Andrea Righi <arighi@nvidia.com> Acked-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-25	tools/memory-model: glossary.txt: Fix indents	Akira Yokosawa
	There are a couple of inconsistent indents around code/literal blocks. Adjust them to make this file easier to parse. Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2025-02-25	tools/memory-model/README: Fix typo	Akira Yokosawa
	Fix a trivial typo. Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2025-02-25	tools/memory-model: Distinguish between syntactic and semantic tags	Jonas Oberhauser
	Not all annotated accesses provide the semantics their syntactic tags would imply. For example, an 'acquire tag on a write does not imply that the write is finally in the Acquire set and provides acquire ordering. To distinguish in those cases between the syntactic tags and actual sets, we capitalize the former, so 'ACQUIRE tags may be present on both reads and writes, but only reads will appear in the Acquire set. For tags where the two concepts are the same we do not use specific capitalization to make this distinction. Reported-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Akira Yokosawa <akiyks@gmail.com> # herdtools7.7.58
2025-02-25	tools/memory-model: Switch to softcoded herd7 tags	Jonas Oberhauser
	A new version of herd7 provides a -lkmmv2 switch which overrides the old herd7 behavior of simply ignoring any softcoded tags in the .def and .bell files. We port LKMM to this version of herd7 by providing the switch in linux-kernel.cfg and reporting an error if the LKMM is used without this switch. To preserve the semantics of LKMM, we also softcode the Noreturn tag on atomic RMW which do not return a value and define atomic_add_unless with an Mb tag in linux-kernel.def. We update the herd-representation.txt accordingly and clarify some of the resulting combinations. Co-developed-by: Hernan Ponce de Leon <hernan.poncedeleon@huaweicloud.com> Signed-off-by: Hernan Ponce de Leon <hernan.poncedeleon@huaweicloud.com> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Akira Yokosawa <akiyks@gmail.com> # herdtools7.7.58
2025-02-25	objtool: Add bch2_trans_unlocked_or_in_restart_error() to bcachefs noreturns	Youling Tang
	Fix the following objtool warning during build time: fs/bcachefs/btree_cache.o: warning: objtool: btree_node_lock.constprop.0() falls through to next function bch2_recalc_btree_reserve() fs/bcachefs/btree_update.o: warning: objtool: bch2_trans_update_get_key_cache() falls through to next function need_whiteout_for_snapshot() bch2_trans_unlocked_or_in_restart_error() is an Obviously Correct (tm) panic() wrapper, add it to the list of known noreturns. Fixes: b318882022a8 ("bcachefs: bch2_trans_verify_not_unlocked_or_in_restart()") Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://lore.kernel.org/r/20250218064230.219997-1-youling.tang@linux.dev Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-02-25	objtool: Fix C jump table annotations for Clang	Ard Biesheuvel
	A C jump table (such as the one used by the BPF interpreter) is a const global array of absolute code addresses, and this means that the actual values in the table may not be known until the kernel is booted (e.g., when using KASLR or when the kernel VA space is sized dynamically). When using PIE codegen, the compiler will default to placing such const global objects in .data.rel.ro (which is annotated as writable), rather than .rodata (which is annotated as read-only). As C jump tables are explicitly emitted into .rodata, this used to result in warnings for LoongArch builds (which uses PIE codegen for the entire kernel) like Warning: setting incorrect section attributes for .rodata..c_jump_table due to the fact that the explicitly specified .rodata section inherited the read-write annotation that the compiler uses for such objects when using PIE codegen. This warning was suppressed by explicitly adding the read-only annotation to the __attribute__((section(""))) string, by commit c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table") Unfortunately, this hack does not work on Clang's integrated assembler, which happily interprets the appended section type and permission specifiers as part of the section name, which therefore no longer matches the hard-coded pattern '.rodata..c_jump_table' that objtool expects, causing it to emit a warning kernel/bpf/core.o: warning: objtool: ___bpf_prog_run+0x20: sibling call from callable instruction with modified stack frame Work around this, by emitting C jump tables into .data.rel.ro instead, which is treated as .rodata by the linker script for all builds, not just PIE based ones. Fixes: c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table") Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> # on LoongArch Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250221135704.431269-6-ardb+git@google.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2025-02-25	selftests/x86/lam: Fix minor memory in do_uring()	liuye
	Exception branch returns without freeing 'fi'. Signed-off-by: liuye <liuye@kylinos.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250114082650.113105-1-liuye@kylinos.cn
2025-02-25	tools: virtio/linux/module.h add MODULE_DESCRIPTION() define.	Yufeng Wang
	when we build tools/virtio, meet below error information. cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h -mfunction-return=thunk -fcf-protection=none -mindirect-branch-register -pthread -c -o virtio_ring.o ../../drivers/virtio/virtio_ring.c ../../drivers/virtio/virtio_ring.c:3276:20: error：expected declaration specifiers or ‘...’ before string constant 3276 \| MODULE_DESCRIPTION("Virtio ring implementation"); \| to fix, add MODULE_DESCRIPTION() define for virtio test. Fixes: ab0727f3ddb8 ("virtio: add missing MODULE_DESCRIPTION() macros") Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn> Message-Id: <20250114064726.33079-1-wangyufeng@kylinos.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-02-25	tools: virtio/linux/compiler.h: Add data_race() define.	Yufeng Wang
	Port over the definition of data_race() so we can build tools/virtio. cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h -mfunction-return=thunk -fcf-protection=none -mindirect-branch-register -pthread -c -o virtio_ring.o ../../drivers/virtio/virtio_ring.c ../../drivers/virtio/virtio_ring.c: in function'vring_interrupt': ../../drivers/virtio/virtio_ring.c:2711:17: error：Implicit declaration function'data_race' [-Wimplicit-function-declaration] 2711 \| data_race(vq->event_triggered = true); \| ^~~~~~~~~ Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn> Message-Id: <20250114033635.20623-1-wangyufeng@kylinos.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-02-25	tools/virtio: Add DMA_MAPPING_ERROR and sg_dma_len api define for virtio test	Yufeng Wang
	when we build tools/virtio, meet below error information. cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h -mfunction-return=thunk -fcf-protection=none -mindirect-branch-register -pthread -c -o virtio_ring.o ../../drivers/virtio/virtio_ring.c ../../drivers/virtio/virtio_ring.c: in function 'vring_need_unmap_buffer': ../../drivers/virtio/virtio_ring.c:294:54: error:'DMA_MAPPING_ERROR'Undeclared (first use within this function) 294 \| return vring->use_dma_api && (extra->addr != DMA_MAPPING_ERROR); \| ^~~~~~~~~~~~~~~~~ ../../drivers/virtio/virtio_ring.c:294:54: Note: Each undeclared identifier is only reported once within the function it appears in ../../drivers/virtio/virtio_ring.c: in function 'vring_map_one_sg': ../../drivers/virtio/virtio_ring.c:369:24: error：Implicit declaration function'sg_dma_len' [-Wimplicit-function-declaration] 369 \| *len = sg_dma_len(sg); \| ^~~~~~~~~~ ../../drivers/virtio/virtio_ring.c: in function'virtqueue_add_desc_split': ../../drivers/virtio/virtio_ring.c:518:37: error:'DMA_MAPPING_ERROR'Undeclared (first use within this function) 518 \| extra[i].addr = premapped ? DMA_MAPPING_ERROR : addr; \| ^~~~~~~~~~~~~~~~~ ../../drivers/virtio/virtio_ring.c: in function'virtqueue_add_indirect_packed': ../../drivers/virtio/virtio_ring.c:1370:61: error: 'DMA_MAPPING_ERROR'Undeclared (first use within this function) 1370 \| extra[i].addr = premapped ? DMA_MAPPING_ERROR : addr; \| ^~~~~~~~~~~~~~~~~ ../../drivers/virtio/virtio_ring.c: in function'virtqueue_add_packed': ../../drivers/virtio/virtio_ring.c:1535:41: error:'DMA_MAPPING_ERROR'Undeclared (first use within this function) 1535 \| DMA_MAPPING_ERROR : addr; \| ^~~~~~~~~~~~~~~~~ to fix, add DMA_MAPPING_ERROR define for virtio test. Fixes: c7e1b422afac ("virtio_ring: perform premapped operations based on per-buffer") Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn> Message-Id: <20250113100300.174382-1-wangyufeng@kylinos.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-02-24	Merge tag 'riscv-for-linus-6.14-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for cacheinfo DT probing to avoid reading non-boolean properties as booleans. - A fix for cpufeature to use bitmap_equal() instead of memcmp(), so unused bits are ignored. - Fixes for cmpxchg and futex cmpxchg that properly encode the sign extension requirements on inline asm, which results in spurious successes. This manifests in at least inode_set_ctime_current, but is likely just a disaster waiting to happen. - A fix for the rseq selftests, which was using an invalid constraint. - A pair of fixes for signal frame size handling: - We were reserving space for an extra empty extension context header on systems with extended signal context, thus resulting in unnecessarily large allocations. - We weren't properly checking for available extensions before calculating the signal stack size, which resulted in undersized stack allocations on some systems (at least those with T-Head custom vectors). Also, we've added Alex as a reviewer. He's been helping out a ton lately, thanks! * tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: MAINTAINERS: Add myself as a riscv reviewer riscv: signal: fix signal_minsigstksz riscv: signal: fix signal frame size rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm riscv/futex: sign extend compare value in atomic cmpxchg riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg riscv: cpufeature: use bitmap_equal() instead of memcmp() riscv: cacheinfo: Use of_property_present() for non-boolean properties
2025-02-24	Merge patch series "Initial support for RK3576 UFS controller"	Martin K. Petersen
	Shawn Lin <shawn.lin@rock-chips.com> says: This patchset adds initial UFS controller supprt for RK3576 SoC. Patch 1 is the dt-bindings. Patch 2-4 deal with rpm and spm support in advanced suggested by Ulf. Patch 5 exports two new APIs for host driver. Patch 6 and 7 are the host driver and dtsi support. Link: https://lore.kernel.org/r/1738736156-119203-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-24	perf report: Fix sample number stats for branch entry mode	Thomas Falcon
	Currently, stats->nr_samples is incremented per entry in the branch stack instead of per sample taken. As a result, statistics of samples taken during perf record in --branch-filter or --branch-any mode does not seem correct. Instead call hists__inc_nr_samples() for each sample taken instead of for each entry in the branch stack. Before: $ ./perf record -e cycles:u -b -c 10000000000 ./tchain_edit [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (2 samples) ] $ perf report -D \| tail -n 16 Aggregated stats: TOTAL events: 16 COMM events: 2 (12.5%) EXIT events: 1 ( 6.2%) SAMPLE events: 2 (12.5%) MMAP2 events: 2 (12.5%) KSYMBOL events: 1 ( 6.2%) FINISHED_ROUND events: 1 ( 6.2%) ID_INDEX events: 1 ( 6.2%) THREAD_MAP events: 1 ( 6.2%) CPU_MAP events: 1 ( 6.2%) EVENT_UPDATE events: 2 (12.5%) TIME_CONV events: 1 ( 6.2%) FINISHED_INIT events: 1 ( 6.2%) cpu_core/cycles/u stats: SAMPLE events: 64 After: $ ./perf report -D \| tail -n 16 Aggregated stats: TOTAL events: 16 COMM events: 2 (12.5%) EXIT events: 1 ( 6.2%) SAMPLE events: 2 (12.5%) MMAP2 events: 2 (12.5%) KSYMBOL events: 1 ( 6.2%) FINISHED_ROUND events: 1 ( 6.2%) ID_INDEX events: 1 ( 6.2%) THREAD_MAP events: 1 ( 6.2%) CPU_MAP events: 1 ( 6.2%) EVENT_UPDATE events: 2 (12.5%) TIME_CONV events: 1 ( 6.2%) FINISHED_INIT events: 1 ( 6.2%) cpu_core/cycles/u stats: SAMPLE events: 2 Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250220045942.114965-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf machine: Reuse module path buffer	Ian Rogers
	Rather than copying the path and appending the directory entry in a fresh path buffer, append to the path at the end of where it is for the recursion level. This saves a PATH_MAX buffer per recursion level and some unnecessary copying. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf hwmon_pmu: Switch event discovery to io_dir__readdir	Ian Rogers
	Avoid DIR allocations when scanning sysfs by using io_dir for the readdir implementation, that allocates about 1kb on the stack. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf parse-events: Switch tracepoints to io_dir__readdir	Ian Rogers
	Avoid DIR allocations when scanning sysfs by using io_dir for the readdir implementation, that allocates about 1kb on the stack. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf events: Remove scandir in thread synthesis	Ian Rogers
	This avoids scanddir reading the directory into memory that's allocated and instead allocates on the stack. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250222061015.303622-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf header: Switch mem topology to io_dir__readdir	Ian Rogers
	Switch memory_node__read and build_mem_topology from opendir/readdir to io_dir__readdir, with smaller stack allocations. Reduces peak memory consumption of perf record by 10kb. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf pmu: Switch to io_dir__readdir	Ian Rogers
	Avoid DIR allocations when scanning sysfs by using io_dir for the readdir implementation, that allocates about 1kb on the stack. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	perf maps: Switch modules tree walk to io_dir__readdir	Ian Rogers
	Compared to glibc's opendir/readdir this lowers the max RSS of perf record by 1.8MB on a Debian machine. Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	tools lib api: Add io_dir an allocation free readdir alternative	Ian Rogers
	glibc's opendir allocates a minimum of 32kb, when called recursively for a directory tree the memory consumption can add up - nearly 300kb during perf start-up when processing modules. Add a stack allocated variant of readdir sized a little more than 1kb. As getdents64 may be missing from libc, add support using syscall. As the system call number maybe missing, add #defines for those. Note, an earlier version of this patch had a feature test for getdents64 but there were problems on certains distros where getdents64 would be #define renamed to getdents breaking the code. The syscall use was made uncondtional to work around this. There is context in: https://lore.kernel.org/lkml/20231207050433.1426834-1-irogers@google.com/ Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250222061015.303622-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24	selftests/user_events: Fix failures caused by test code	Yiqian Xun
	In parse_abi function,the dyn_test fails because the enable_file isn’t closed after successfully registering an event. By adding wait_for_delete(), the dyn_test now passes as expected. Link: https://lore.kernel.org/r/20250221033555.326716-1-realxxyq@163.com Signed-off-by: Yiqian Xun <xunyiqian@kylinos.cn> Acked-by: Beau Belgrave <beaub@linux.microsoft.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-02-24	selftests: drv-net: test XDP, HDS auto and the ioctl path	Jakub Kicinski
	Test XDP and HDS interaction. While at it add a test for using the IOCTL, as that turned out to be the real culprit. Testing bnxt: # NETIF=eth0 ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by the device ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0 and netdevsim: # ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0 Netdevsim needs a sane default for tx/rx ring size. ethtool 6.11 is needed for the --disable-netlink option. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250221025141.1132944-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24	libbpf: Fix out-of-bound read	Nandakumar Edamana
	In `set_kcfg_value_str`, an untrusted string is accessed with the assumption that it will be at least two characters long due to the presence of checks for opening and closing quotes. But the check for the closing quote (value[len - 1] != '"') misses the fact that it could be checking the opening quote itself in case of an invalid input that consists of just the opening quote. This commit adds an explicit check to make sure the string is at least two characters long. Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250221210110.3182084-1-nandakumar@nandakumar.co.in
2025-02-24	io_uring/zcrx: add selftest case for recvzc with read limit	David Wei
	Add a selftest case to iou-zcrx where the sender sends 4x4K = 16K and the receiver does 4x4K recvzc requests. Validate that the requests complete successfully and that the data is not corrupted. Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20250224041319.2389785-3-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24	ASoC: dapm-graph: set fill colour of turned on nodes	Nicolas Frattaroli
	Some tools like KGraphViewer interpret the "ON" nodes not having an explicitly set fill colour as them being entirely black, which obscures the text on them and looks funny. In fact, I thought they were off for the longest time. Comparing to the output of the `dot` tool, I assume they are supposed to be white. Instead of speclawyering over who's in the wrong and must immediately atone for their wickedness at the altar of RFC2119, just be explicit about it, set the fillcolor to white, and nobody gets confused. Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20250221-dapm-graph-node-colour-v1-1-514ed0aa7069@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-24	sched_ext: idle: Introduce scx_bpf_nr_node_ids()	Andrea Righi
	Similarly to scx_bpf_nr_cpu_ids(), introduce a new kfunc scx_bpf_nr_node_ids() to expose the maximum number of NUMA nodes in the system. BPF schedulers can use this information together with the new node-aware kfuncs, for example to create per-node DSQs, validate node IDs, etc. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-24	KVM: selftests: Add a nested (forced) emulation intercept test for x86	Sean Christopherson
	Add a rudimentary test for validating KVM's handling of L1 hypervisor intercepts during instruction emulation on behalf of L2. To minimize complexity and avoid overlap with other tests, only validate KVM's handling of instructions that L1 wants to intercept, i.e. that generate a nested VM-Exit. Full testing of emulation on behalf of L2 is better achieved by running existing (forced) emulation tests in a VM, (although on VMX, getting L0 to emulate on #UD requires modifying either L1 KVM to not intercept #UD, or modifying L0 KVM to prioritize L0's exception intercepts over L1's intercepts, as is done by KVM for SVM). Since emulation should never be successful, i.e. L2 always exits to L1, dynamically generate the L2 code stream instead of adding a helper for each instruction. Doing so requires hand coding instruction opcodes, but makes it significantly easier for the test to compute the expected "next RIP" and instruction length. Link: https://lore.kernel.org/r/20250201015518.689704-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-22	Merge tag 'ftrace-v6.14-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Function graph accounting fixes: - Fix the manage ops hashes The function graph registers a "manager ops" and "sub-ops" to ftrace. The manager ops does not have any callback but calls the sub-ops callbacks. The manage ops hashes (what is used to tell ftrace what functions to attach to) is built on the sub-ops it manages. There was an error in the way it built the hash. An empty hash means to attach to all functions. When the manager ops had one sub-ops it properly copied its hash. But when the manager ops had more than one sub-ops, it went into a loop to make a set of all functions it needed to add to the hash. If any of the subops hashes was empty, that would mean to attach to all functions. The error was that the first iteration of the loop passed in an empty hash to start with in order to add the other hashes. That starting hash was mistaken as to attach to all functions. This made the manage ops attach to all functions whenever it had two or more sub-ops, even if each sub-op was attached to only a single function. - Do not add duplicate entries to the manager ops hash If two or more subops hashes trace the same function, an entry for that function will be added to the manager ops for each subops. This causes waste and extra overhead. Fprobe accounting fixes: - Remove last function from fprobe hash Fprobes has a ftrace hash to manage which functions an fprobe is attached to. It also has a counter of how many fprobes are attached. When the last fprobe is removed, it unregisters the fprobe from ftrace but does not remove the functions the last fprobe was attached to from the hash. This leaves the old functions attached. When a new fprobe is added, the fprobe infrastructure attaches to not only the functions of the new fprobe, but also to the functions of the last fprobe. - Fix accounting of the fprobe counter When a fprobe is added, it updates a counter. If the counter goes from zero to one, it attaches its ops to ftrace. When an fprobe is removed, the counter is decremented. If the counter goes from 1 to zero, it removes the fprobes ops from ftrace. There was an issue where if two fprobes trace the same function, the addition of each fprobe would increment the counter. But when removing the first of the fprobes, it would notice that another fprobe is still attached to one of its functions no it does not remove the functions from the ftrace ops. But it also did not decrement the counter, so when the last fprobe is removed, the counter is still one. This leaves the fprobes callback still registered with ftrace and it being called by the functions defined by the fprobes ops hash. Worse yet, because all the functions from the fprobe ops hash have been removed, that tells ftrace that it wants to trace all functions. Thus, this puts the state of the system where every function is calling the fprobe callback handler (which does nothing as there are no registered fprobes), but this causes a good 13% slow down of the entire system. Other updates: - Add a selftest to test the above issues to prevent regressions. - Fix preempt count accounting in function tracing Better recursion protection was added to function tracing which added another layer of preempt disable. As the preempt_count gets traced in the event, it needs to subtract the amount of preempt disabling the tracer does to record what the preempt_count was when the trace was triggered. - Fix memory leak in output of set_event A variable is passed by the seq_file functions in the location that is set by the return of the next() function. The start() function allocates it and the stop() function frees it. But when the last item is found, the next() returns NULL which leaks the data that was allocated in start(). The m->private is used for something else, so have next() free the data when it returns NULL, as stop() will then just receive NULL in that case" * tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak when reading set_event file ftrace: Correct preemption accounting for function tracing. selftests/ftrace: Update fprobe test to check enabled_functions file fprobe: Fix accounting of when to unregister from function graph fprobe: Always unregister fgraph function from ops ftrace: Do not add duplicate entries in subops manager ops ftrace: Fix accounting of adding subops to a manager ops
2025-02-22	selftests: remove reference to prime_numbers.sh	Tamir Duberstein
	Remove a leftover shell script reference from commit 313b38a6ecb4 ("lib/prime_numbers: convert self-test to KUnit"). Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202502171110.708d965a-lkp@intel.com Fixes: 313b38a6ecb4 ("lib/prime_numbers: convert self-test to KUnit") Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250217-fix-prime-numbers-v1-1-eb0ca7235e60@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-22	selftests/rseq: Add rseq syscall errors test	Michael Jeanson
	This test adds coverage of expected errors during rseq registration and unregistration, it disables glibc integration and will thus always exercise the rseq syscall explictly. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250121213402.1754762-1-mjeanson@efficios.com
2025-02-22	selftests/lam: Test get_user() LAM pointer handling	Maciej Wieczor-Retman
	Recent change in how get_user() handles pointers: https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundation.org/ has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/1624d9d1b9502517053a056652d50dc5d26884ac.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-22	selftests/lam: Skip test if LAM is disabled	Maciej Wieczor-Retman
	Until LASS is merged into the kernel: https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com/ LAM is left disabled in the config file. Running the LAM selftest with disabled LAM only results in unhelpful output. Use one of LAM syscalls() to determine whether the kernel was compiled with LAM support (CONFIG_ADDRESS_MASKING) or not. Skip running the tests in the latter case. Merge CPUID checking function with the one mentioned above to achieve a single function that shows LAM's availability from both CPU and the kernel. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/251d0f45f6a768030115e8d04bc85458910cb0dc.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-22	selftests/lam: Move cpu_has_la57() to use cpuinfo flag	Maciej Wieczor-Retman
	In current form cpu_has_la57() reports platform's support for LA57 through reading the output of cpuid. A much more useful information is whether 5-level paging is actually enabled on the running system. Check whether 5-level paging is enabled by trying to map a page in the high linear address space. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/8b1ca51b13e6d94b5a42b6930d81b692cbb0bcbb.1737990375.git.maciej.wieczor-retman@intel.com
2025-02-21	selftests: fib_nexthops: do not mark skipped tests as failed	Hangbin Liu
	The current test marks all unexpected return values as failed and sets ret to 1. If a test is skipped, the entire test also returns 1, incorrectly indicating failure. To fix this, add a skipped variable and set ret to 4 if it was previously 0. Otherwise, keep ret set to 1. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>