linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-02-12	KVM: selftests: Only validate counts for hardware-supported arch events	Sean Christopherson
	In the Intel PMU counters test, only validate the counts for architectural events that are supported in hardware. If an arch event isn't supported, the event selector may enable a completely different event, and thus the logic for the expected count is bogus. This fixes test failures on pre-Icelake systems due to the encoding for the architectural Top-Down Slots event corresponding to something else (at least on the Skylake family of CPUs). Note, validation relies on hardware support, not KVM support and not guest support. Architectural events are all about enumerating the event selector encoding; lack of enumeration for an architectural event doesn't mean the event itself is unsupported, i.e. the event should still count as expected even if KVM and/or guest CPUID doesn't enumerate the event as being "architectural". Note #2, it's desirable to _program_ the architectural event encoding even if hardware doesn't support the event. The count can't be validated when the event is fully enabled, but KVM should still let the guest program the event selector, and the PMC shouldn't count if the event is disabled. Fixes: 4f1bd6b16074 ("KVM: selftests: Test Intel PMU architectural events on gp counters") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202501141009.30c629b4-lkp@intel.com Debugged-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20250117234204.2600624-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12	KVM: selftests: Make Intel arch events globally available in PMU counters test	Sean Christopherson
	Wrap PMU counter test's array of Intel architectrual in a helper function so that the events can be queried in multiple locations. Add a comment to explain the need for a wrapper. No functional change intended. Link: https://lore.kernel.org/r/20250117234204.2600624-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-12	selftests: add tests for using detached mount with overlayfs	Christian Brauner
	Test that it is possible to use detached mounts as overlayfs layers. Link: https://lore.kernel.org/r/20250123-erstbesteigung-angeeignet-1d30e64b7df2@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12	selftests/overlayfs: test specifying layers as O_PATH file descriptors	Christian Brauner
	Verify that userspace can specify layers via O_PATH file descriptors. Link: https://lore.kernel.org/r/20250210-work-overlayfs-v2-2-ed2a949b674b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-11	selftests/bpf: Select NUMA_NO_NODE to create map	Saket Kumar Bhaskar
	On powerpc, a CPU does not necessarily originate from NUMA node 0. This contrasts with architectures like x86, where CPU 0 is not hot-pluggable, making NUMA node 0 a consistently valid node. This discrepancy can lead to failures when creating a map on NUMA node 0, which is initialized by default, if no CPUs are allocated from NUMA node 0. This patch fixes the issue by setting NUMA_NO_NODE (-1) for map creation for this selftest. Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation") Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/cf1f61468b47425ecf3728689bc9636ddd1d910e.1738302337.git.skb99@linux.ibm.com
2025-02-11	selftests/bpf: Define SYS_PREFIX for powerpc	Saket Kumar Bhaskar
	Since commit 7e92e01b7245 ("powerpc: Provide syscall wrapper") landed in v6.1, syscall wrapper is enabled on powerpc. Commit 94746890202c ("powerpc: Don't add __powerpc_ prefix to syscall entry points") , that drops the prefix to syscall entry points, also landed in the same release. So, add the missing empty SYS_PREFIX prefix definition for powerpc, to fix some fentry and kprobe selftests. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/7192d6aa9501115dc242435970df82b3d190f257.1738302337.git.skb99@linux.ibm.com
2025-02-11	blackhole_dev: convert self-test to KUnit	Tamir Duberstein
	Convert this very simple smoke test to a KUnit test. Add a missing `htons` call that was spotted[0] by kernel test robot <lkp@intel.com> after initial conversion to KUnit. Link: https://lore.kernel.org/oe-kbuild-all/202502090223.qCYMBjWT-lkp@intel.com/ [0] Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://patch.msgid.link/20250208-blackholedev-kunit-convert-v2-1-182db9bd56ec@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11	selftests/net: Add selftest for IPv4 RTM_GETMULTICAST support	Yuyang Huang
	This change introduces a new selftest case to verify the functionality of dumping IPv4 multicast addresses using the RTM_GETMULTICAST netlink message. The test utilizes the ynl library to interact with the netlink interface and validate that the kernel correctly reports the joined IPv4 multicast addresses. To run the test, execute the following command: $ vng -v --user root --cpus 16 -- \ make -C tools/testing/selftests TARGETS=net \ TEST_PROGS=rtnetlink.py TEST_GEN_PROGS="" run_tests Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Yuyang Huang <yuyanghuang@google.com> Link: https://patch.msgid.link/20250207110836.2407224-2-yuyanghuang@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11	selftests/powerpc/pmu: Update comment with details to understand ↵	Athira Rajeev
	auxv_generic_compat_pmu() utility function auxv_generic_compat_pmu() utility function is to detect whether the system is having generic compat PMU. The check is based on base platform value from /proc/self/auxv. Update the comment with details on how auxv is used to detect the platform. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-5-atrajeev@linux.vnet.ibm.com
2025-02-11	selftests/powerpc/pmu: Add interface test for extended reg support	Kajol Jain
	The testcase uses check_extended_regs_support and perf_get_platform_reg_mask function to check if the platform has extended reg support. This will help to check if sampling pmu selftest is enabled or not for a given platform. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-4-atrajeev@linux.vnet.ibm.com
2025-02-11	tools/testing/selftests/powerpc/pmu: Update comment description to mention ↵	Athira Rajeev
	ISA v3.1 for power10 and above Updated the comments in the pmu selftests to include power11/ISA v3.1 where ever required. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-3-atrajeev@linux.vnet.ibm.com
2025-02-11	tools/testing/selftests/powerpc: Add check for power11 pvr for pmu selfests	Athira Rajeev
	Some of the tests depends on pvr value to choose the event. Example: - event_alternatives_tests_p10: alternative event depends on registered PMU driver which is based on pvr - generic_events_valid_test varies based on platform - bhrb_filter_map_test: again its dependent on pmu to decide which bhrb filter to use - reserved_bits_mmcra_sample_elig_mode: randome sampling mode reserved bits is also varies based on platform Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-2-atrajeev@linux.vnet.ibm.com
2025-02-11	tools/testing/selftests/powerpc: Enable pmu selftests for power11	Athira Rajeev
	Add check for power11 pvr in the selftest utility functions. Selftests uses pvr value to check for platform support inorder to run the tests. pvr is also used to send the extended mask value to capture sampling registers. Update some of the utility functions to use hwcap2 inorder to return platform specific bits from sampling registers. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250113075858.45137-1-atrajeev@linux.vnet.ibm.com
2025-02-10	selftests: drv-net: add helper for path resolution	Jakub Kicinski
	Refering to C binaries from Python code is going to be a common need. Add a helper to convert from path in relation to the test. Meaning, if the test is in the same directory as the binary, the call would be simply: cfg.rpath("binary"). The helper name "rpath" is not great. I can't think of a better name that would be accurate yet concise. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250207184140.1730466-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10	selftests: drv-net: factor out a DrvEnv base class	Jakub Kicinski
	We have separate Env classes for local tests and tests with a remote endpoint. Make it easier to share the code by creating a base class. Make env loading a method of this class. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250207184140.1730466-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10	selftests: drv-net: remove an unnecessary libmnl include	Jakub Kicinski
	ncdevmem doesn't need libmnl, remove the unnecessary include. Since YNL doesn't depend on libmnl either, any more, it's actually possible to build selftests without having libmnl installed. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250207183119.1721424-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10	bpftool: Using the right format specifiers	Jiayuan Chen
	Fixed some formatting specifiers errors, such as using %d for int and %u for unsigned int, as well as other byte-length types. Signed-off-by: Jiayuan Chen <mrpre@163.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20250207123706.727928-2-mrpre@163.com
2025-02-10	perf tools: Add skip check in tool_pmu__event_to_str()	Kan Liang
	Some topdown related metrics may fail on hybrid machines. $ perf stat -M tma_frontend_bound Cannot resolve IDs for tma_frontend_bound: cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@) In the find_tool_events(), the tool_pmu__event_to_str() is used to compare the tool_events. It only checks the event name, no PMU or arch. So the tool_events[TOOL_PMU__EVENT_SLOTS] is set to true, because the p-core Topdown metrics has "slots" event. The tool_events is shared. So when parsing the e-core metrics, the "slots" is automatically added. The "slots" event as a tool event should only be available on arm64. It has a different meaning on X86. The tool_pmu__skip_event() intends handle the case. Apply it for tool_pmu__event_to_str() as well. There is a lack of sanity check in the expr__get_id(). Add the check. Closes: https://lore.kernel.org/lkml/608077bc-4139-4a97-8dc4-7997177d95c4@linux.intel.com/ Fixes: 069057239a67 ("perf tool_pmu: Move expr literals to tool_pmu") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: thomas.falcon@intel.com Link: https://lore.kernel.org/r/20250207152844.302167-1-kan.liang@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-10	perf tools: Deadcode removal	Dr. David Alan Gilbert
	The last use of machine__fprintf_vmlinux_path() was removed in 2011 by commit ab81f3fd350c ("perf top: Reuse the 'report' hist_entry/hists classes") mmap_cpu_mask__duplicate() was added in 2021 by commit 6bd006c6eb7f ("perf mmap: Introduce mmap_cpu_mask__duplicate()") but hasn't been used since. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250204220545.456435-1-linux@treblig.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-10	selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64	Kees Cook
	Since headers don't always follow the selftests around correct, explicitly include the __NR_uretprobe syscall for better test coverage. Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10	tools/sched_ext: Update enum_defs.autogen.h	Changwoo Min
	Add where the script is located to the comment lines of the header file. This helps anyone re-generate the header file if required. Note that this is a sync from the PR [1] in the scx repo. [1] https://github.com/sched-ext/scx/pull/1322 Signed-off-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-10	selftests: drv-net: rss_ctx: skip tests which need multiple contexts cleanly	Jakub Kicinski
	There's no good API to check how many contexts device supports. But initial tests sense the context count already, so just store that number and skip tests which we know need more. Link: https://patch.msgid.link/20250206235334.1425329-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10	selftests: net-drv: test adding flow rule to invalid RSS context	Jakub Kicinski
	Check that adding Rx flow steering rules pointing to an RSS context which does not exist is prevented. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250206235334.1425329-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10	netconsole: selftest: test for sysdata CPU	Breno Leitao
	Add a new selftest to verify that the netconsole module correctly handles CPU runtime data in sysdata. The test validates three scenarios: 1. Basic CPU sysdata functionality - verifies that cpu=X is appended to messages 2. CPU sysdata with userdata - ensures CPU data works alongside userdata 3. Disabled CPU sysdata - confirms no CPU data is included when disabled The test uses taskset to control which CPU sends messages and verifies the reported CPU matches the one used. This helps ensure that netconsole accurately tracks and reports the originating CPU of messages. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-09	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations - Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 - Stop pretending that we can run the protected mode without a GICv3 being present on the host - Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts - Always evaluate the need to arm a background timer for fully emulated guest timers - Fix the emulation of EL1 timers in the absence of FEAT_ECV - Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0 s390: - move some of the guest page table (gmap) logic into KVM itself, inching towards the final goal of completely removing gmap from the non-kvm memory management code. As an initial set of cleanups, move some code from mm/gmap into kvm and start using __kvm_faultin_pfn() to fault-in pages as needed; but especially stop abusing page->index and page->lru to aid in the pgdesc conversion. x86: - Add missing check in the fix to defer starting the huge page recovery vhost_task - SRSO_USER_KERNEL_NO does not need SYNTHESIZED_F" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (31 commits) KVM: x86/mmu: Ensure NX huge page recovery thread is alive before waking KVM: remove kvm_arch_post_init_vm KVM: selftests: Fix spelling mistake "initally" -> "initially" kvm: x86: SRSO_USER_KERNEL_NO is not synthesized KVM: arm64: timer: Don't adjust the EL2 virtual timer offset KVM: arm64: timer: Correctly handle EL1 timer emulation when !FEAT_ECV KVM: arm64: timer: Always evaluate the need for a soft timer KVM: arm64: Fix nested S2 MMU structures reallocation KVM: arm64: Fail protected mode init if no vgic hardware is present KVM: arm64: Flush/sync debug state in protected mode KVM: s390: selftests: Streamline uc_skey test to issue iske after sske KVM: s390: remove the last user of page->index KVM: s390: move PGSTE softbits KVM: s390: remove useless page->index usage KVM: s390: move gmap_shadow_pgt_lookup() into kvm KVM: s390: stop using lists to keep track of used dat tables KVM: s390: stop using page->index for non-shadow gmaps KVM: s390: move some gmap shadowing functions away from mm/gmap.c KVM: s390: get rid of gmap_translate() KVM: s390: get rid of gmap_fault() ...
2025-02-09	tools/power turbostat: Fix names matching	Artem Bityutskiy
	Fix the 'find_msrp_by_name()' function which returns incorrect matches for cases like this: s1 = "C1-"; find_msrp_by_name(head, s1); Inside 'find_msrp_by_name()': ... s2 = "C1" if !(strcnmp(s1, s2, len(s2))) // Incorrect match! return mp; Full strings should be match istead. Switch to 'strcmp()' to fix the problem. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-02-09	tools/nolibc: add support for directory access	Thomas Weißschuh
	Add an implementation for directory access operations. To keep nolibc itself allocation-free, a "DIR " does not point to any data, but directly encodes a filedescriptor number, equivalent to "FILE ". Without any per-directory storage it is not possible to implement readdir() POSIX confirming. Instead only readdir_r() is provided. While readdir_r() is deprecated in glibc, the reasons for that are not applicable to nolibc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20250209-nolibc-dir-v2-2-57cc1da8558b@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-09	tools/nolibc: add support for sys_llseek()	Thomas Weißschuh
	Not all architectures have the old sys_lseek(), notably riscv32. Implement lseek() in terms of sys_llseek() in that case. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20250209-nolibc-dir-v2-1-57cc1da8558b@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-02-08	tools/sched_ext: Compatible testing of SCX_ENQ_CPU_SELECTED	Changwoo Min
	This provides compatible testing of SCX_ENQ_CPU_SELECTED. More specifically, it handles two cases: 1. a BPF scheduler is compiled against vmlinux.h where SCX_ENQ_CPU_SELECTED is defined, but it runs on a kernel that does not have SCX_ENQ_CPU_SELECTED. In this case, the test result of 'enq_flags & SCX_ENQ_CPU_SELECTED' will always be false. That test result is semantically incorrect because the kernel before SCX_ENQ_CPU_SELECTED has never skipped select_task_rq_scx(), so the result should be true. 2. a BPF scheduler is compiling against vmlinux.h where SCX_ENQ_CPU_SELECTED is not defined. In this case, directly using SCX_ENQ_CPU_SELECTED causes compilation errors. To hide such complexity, introduce __COMPAT_is_enq_cpu_selected(), which checks if SCX_ENQ_CPU_SELECTED exists in runtime using BPF CO-RE. This consists of three parts: 1. Add enum_defs.autogen.h, which has macros (HAVE_{enum name}) denoting whether SCX enums are defined in the vmlinux.h or not. 2. Implement __COMPAT_is_enq_cpu_selected(), which provide the test of SCX_ENQ_CPU_SELECTED in a compatible way. 3. Use __COMPAT_is_enq_cpu_selected() in scx_qmap. Note that this is a sync of the relevant PR [1] in the scx repo. [1] https://github.com/sched-ext/scx/pull/1314 Signed-off-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-08	tool/sched_ext: Event counter dumping updates	Tejun Heo
	- There's no need to dump event counters from both scx_qmap and scx_central. Drop counter dumping from scx_central. - bpf_printk() implies a trailing new line and the explicit new line leads to double new lines. Drop the explicit new lines. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Changwoo Min <changwoo@igalia.com>
2025-02-08	Merge branch 'for-6.14-fixes' into for-6.15	Tejun Heo
	Pull to receive: - 2fa0fbeb69ed ("sched_ext: Implement auto local dispatching of migration disabled tasks") - 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches") as planned for-6.15 changes depend on them (e.g. adding event counter for implicit migration disabled task handling).
2025-02-08	crypto: crct10dif - remove from crypto API	Eric Biggers
	Remove the "crct10dif" shash algorithm from the crypto API. It has no known user now that the lib is no longer built on top of it. It has no remaining references in kernel code. The only other potential users would be the usual components that allow specifying arbitrary hash algorithms by name, namely AF_ALG and dm-integrity. However there are no indications that "crct10dif" is being used with these components. Debian Code Search and web searches don't find anything relevant, and explicitly grepping the source code of the usual suspects (cryptsetup, libell, iwd) finds no matches either. "crc32" and "crc32c" are used in a few more places, but that doesn't seem to be the case for "crct10dif". crc_t10dif_update() is also tested by crc_kunit now, so the test coverage provided via the crypto self-tests is no longer needed. Also note that the "crct10dif" shash algorithm was inconsistent with the rest of the shash API in that it wrote the digest in CPU endianness, making the resulting byte array differ on little endian vs. big endian platforms. This means it was effectively just built for use by the lib functions, and it was not actually correct to treat it as "just another hash function" that could be dropped in via the shash API. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250206173857.39794-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08	tools/power turbostat: Allow Zero return value for some RAPL registers	Zhang Rui
	turbostat aborted with below messages on a dual-package system, turbostat: turbostat.c:3744: rapl_counter_accumulate: Assertion `dst->unit == src->unit' failed. Aborted This is because 1. the MSR_DRAM_PERF_STATUS returns Zero for one package, and non-Zero for another package 2. probe_msr() treats Zero return value as a failure so this feature is enabled on one package, and disabled for another package. 3. turbostat aborts because the feature is invalid on some package Unlike the RAPL energy counter registers, MSR_DRAM_PERF_STATUS can return Zero value, and this should not be treated as a failure. Fix the problem by allowing Zero return value for RAPL registers other than the energy counters. Fixes: 7c6fee25bdf5 ("tools/power turbostat: Check for non-zero value when MSR probing") Reported-by: Artem Bityutskiy <artem.bityutskiy@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-02-08	Merge tag 'seccomp-v6.14-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fix from Kees Cook: "This is really a work-around for x86_64 having grown a syscall to implement uretprobe, which has caused problems since v6.11. This may change in the future, but for now, this fixes the unintended seccomp filtering when uretprobe switched away from traps, and does so with something that should be easy to backport. - Allow uretprobe on x86_64 to avoid behavioral complications (Eyal Birger)" * tag 'seccomp-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: validate uretprobe syscall passes through seccomp seccomp: passthrough uretprobe systemcall without filtering
2025-02-08	iio: introduce the FAULT event type	Guillaume Ranquet
	Add a new event type to describe an hardware failure. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Guillaume Ranquet <granquet@baylibre.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250127-ad4111_openwire-v5-1-ef2db05c384f@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-08	objtool: Move dodgy linker warn to verbose	Peter Zijlstra
	The lld.ld borkage is fixed in the latest llvm release (?) but will not be backported, meaning we're stuck with broken linker for a fair while. Lets not spam all clang build logs and move warning to verbose. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-02-08	objtool: Ignore dangling jump table entries	Josh Poimboeuf
	Clang sometimes leaves dangling unused jump table entries which point to the end of the function. Ignore them. Closes: https://lore.kernel.org/20250113235835.vqgvb7cdspksy5dn@jpoimboe Reported-by: Klaus Kusche <klaus.kusche@computerix.info> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/ee25c0b7e80113e950bd1d4c208b671d35774ff4.1736891751.git.jpoimboe@kernel.org
2025-02-07	selftests/bpf: Remove with_addr.sh and with_tunnels.sh	Bastien Curutchet (eBPF Foundation)
	Those two scripts were used by test_flow_dissector.sh to setup/cleanup the network topology before/after the tests. test_flow_dissector.sh have been deleted by commit 63b37657c5fd ("selftests/bpf: remove test_flow_dissector.sh") so they aren't used anywhere now. Remove the two unused scripts and their Makefile entries. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250204-with-v1-1-387a42118cd4@bootlin.com
2025-02-07	netdevsim: allow normal queue reset while down	Jakub Kicinski
	Resetting queues while the device is down should be legal. Allow it, test it. Ideally we'd test this with a real device supporting devmem but I don't have access to such devices. Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250206225638.1387810-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07	bpf: selftests: Test constant key extraction on irrelevant maps	Daniel Xu
	Test that very high constant map keys are not interpreted as an error value by the verifier. This would previously fail. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/c0590b62eb9303f389b2f52c0c7e9cf22a358a30.1738689872.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-07	sched_ext: Print an event, SCX_EV_ENQ_SLICE_DFL, in scx_qmap/central	Changwoo Min
	Modify the scx_qmap and scx_celtral schedulers to print the SCX_EV_ENQ_SLICE_DFL event every second. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-07	tools/power turbostat: Clustered Uncore MHz counters should honor show/hide ↵	Len Brown
	options The clustered uncore frequency counters, UMHz. should honor the --show and --hide options. All non-specified counters should be implicityly hidden. But when --show was used, UMHz. showed up anyway: $ sudo turbostat -q -S --show Busy% Busy% UMHz0.0 UMHz1.0 UMHz2.0 UMHz3.0 UMHz4.0 Indeed, there was no string that can be used to explicitly show or hide clustered uncore counters. Even through they are dynamically probed and added, group the clustered UMHz. counters with the legacy built-in-counter "UncMHz" for show/hide. turbostat --show Busy% does not show UMHz.. turbostat --show UncMHz shows either UncMHz or UMHz., if present turbostat --hide UncMHz hides either UncMHz or UMHz., if present Reported-by: Artem Bityutskiy <artem.bityutskiy@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com>
2025-02-07	Merge tag 'vfs-6.14-rc2.fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix fsnotify FMODE_NONOTIFY* handling. This also disables fsnotify on all pseudo files by default apart from very select exceptions. This carries a regression risk so we need to watch out and adapt accordingly. However, it is overall a significant improvement over the current status quo where every rando file can get fsnotify enabled. - Cleanup and simplify lockref_init() after recent lockref changes. - Fix vboxfs build with gcc-15. - Add an assert into inode_set_cached_link() to catch corrupt links. - Allow users to also use an empty string check to detect whether a given mount option string was empty or not. - Fix how security options were appended to statmount()'s ->mnt_opt field. - Fix statmount() selftests to always check the returned mask. - Fix uninitialized value in vfs_statx_path(). - Fix pidfs_ioctl() sanity checks to guard against ioctl() overloading and preserve extensibility. * tag 'vfs-6.14-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: sanity check the length passed to inode_set_cached_link() pidfs: improve ioctl handling fsnotify: disable pre-content and permission events by default selftests: always check mask returned by statmount(2) fsnotify: disable notification by default for all pseudo files fs: fix adding security options to statmount.mnt_opt fsnotify: use accessor to set FMODE_NONOTIFY_* lockref: remove count argument of lockref_init gfs2: switch to lockref_init(..., 1) gfs2: use lockref_init for gl_lockref statmount: let unset strings be empty vboxsf: fix building with GCC 15 fs/stat.c: avoid harmless garbage value problem in vfs_statx_path()
2025-02-07	selftests: always check mask returned by statmount(2)	Miklos Szeredi
	STATMOUNT_MNT_OPTS can actually be missing if there are no options. This is a change of behavior since 75ead69a7173 ("fs: don't let statmount return empty strings"). The other checks shouldn't actually trigger, but add them for correctness and for easier debugging if the test fails. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250129160641.35485-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-06	selftests/bpf: Correct the check of join cgroup	Jason Xing
	Use ASSERT_OK_FD to check the return value of join cgroup, or else this test will pass even if the fd < 0. ASSERT_OK_FD can print the error message to the console. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/all/6d62bd77-6733-40c7-b240-a1aeff55566c@linux.dev/ Link: https://patch.msgid.link/20250204051154.57655-1-kerneljasonxing@gmail.com
2025-02-06	Merge branch 'io_uring-zero-copy-rx'	Jakub Kicinski
	David Wei says: ==================== io_uring zero copy rx This patchset contains net/ patches needed by a new io_uring request implementing zero copy rx into userspace pages, eliminating a kernel to user copy. We configure a page pool that a driver uses to fill a hw rx queue to hand out user pages instead of kernel pages. Any data that ends up hitting this hw rx queue will thus be dma'd into userspace memory directly, without needing to be bounced through kernel memory. 'Reading' data out of a socket instead becomes a _notification_ mechanism, where the kernel tells userspace where the data is. The overall approach is similar to the devmem TCP proposal. This relies on hw header/data split, flow steering and RSS to ensure packet headers remain in kernel memory and only desired flows hit a hw rx queue configured for zero copy. Configuring this is outside of the scope of this patchset. We share netdev core infra with devmem TCP. The main difference is that io_uring is used for the uAPI and the lifetime of all objects are bound to an io_uring instance. Data is 'read' using a new io_uring request type. When done, data is returned via a new shared refill queue. A zero copy page pool refills a hw rx queue from this refill queue directly. Of course, the lifetime of these data buffers are managed by io_uring rather than the networking stack, with different refcounting rules. This patchset is the first step adding basic zero copy support. We will extend this iteratively with new features e.g. dynamically allocated zero copy areas, THP support, dmabuf support, improved copy fallback, general optimisations and more. In terms of netdev support, we're first targeting Broadcom bnxt. Patches aren't included since Taehee Yoo has already sent a more comprehensive patchset adding support in [1]. Google gve should already support this, and Mellanox mlx5 support is WIP pending driver changes. =========== Performance =========== Note: Comparison with epoll + TCP_ZEROCOPY_RECEIVE isn't done yet. Test setup: * AMD EPYC 9454 * Broadcom BCM957508 200G * Kernel v6.11 base [2] * liburing fork [3] * kperf fork [4] * 4K MTU * Single TCP flow With application thread + net rx softirq pinned to _different_ cores: +-------------------------------+ \| epoll \| io_uring \| \|-----------\|-------------------\| \| 82.2 Gbps \| 116.2 Gbps (+41%) \| +-------------------------------+ Pinned to _same_ core: +-------------------------------+ \| epoll \| io_uring \| \|-----------\|-------------------\| \| 62.6 Gbps \| 80.9 Gbps (+29%) \| +-------------------------------+ ===== Links ===== Broadcom bnxt support: [1]: https://lore.kernel.org/20241003160620.1521626-8-ap420073@gmail.com Linux kernel branch including io_uring bits: [2]: https://github.com/isilence/linux.git zcrx/v13 liburing for testing: [3]: https://github.com/isilence/liburing.git zcrx/next kperf for testing: [4]: https://git.kernel.dk/kperf.git ==================== Link: https://patch.msgid.link/20250204215622.695511-1-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06	netdev: add io_uring memory provider info	David Wei
	Add a nested attribute for io_uring memory provider info. For now it is empty and its presence indicates that a particular page pool or queue has an io_uring memory provider attached. $ ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get [{'id': 80, 'ifindex': 2, 'inflight': 64, 'inflight-mem': 262144, 'napi-id': 525}, {'id': 79, 'ifindex': 2, 'inflight': 320, 'inflight-mem': 1310720, 'io_uring': {}, 'napi-id': 525}, ... $ ./cli.py --spec netlink/specs/netdev.yaml --dump queue-get [{'id': 0, 'ifindex': 1, 'type': 'rx'}, {'id': 0, 'ifindex': 1, 'type': 'tx'}, {'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'rx'}, {'id': 1, 'ifindex': 2, 'napi-id': 514, 'type': 'rx'}, ... {'id': 12, 'ifindex': 2, 'io_uring': {}, 'napi-id': 525, 'type': 'rx'}, ... Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-6-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.14-rc2). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06	tools: ynl: add all headers to makefile deps	Jakub Kicinski
	The Makefile.deps lists uAPI headers to make the build work when system headers are older than in-tree headers. The problem doesn't occur for new headers, because system headers are not there at all. But out-of-tree YNL clone on GH also uses this header to identify header dependencies, and one day the system headers will exist, and will get out of date. So let's add the headers we missed. I don't think this is a fix, but FWIW the commits which added the missing headers are: commit 04e65df94b31 ("netlink: spec: add shaper YAML spec") commit 49922401c219 ("ethtool: separate definitions that are gonna be generated") Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250205173352.446704-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06	selftests/seccomp: validate uretprobe syscall passes through seccomp	Eyal Birger
	The uretprobe syscall is implemented as a performance enhancement on x86_64 by having the kernel inject a call to it on function exit; User programs cannot call this system call explicitly. As such, this syscall is considered a kernel implementation detail and should not be filtered by seccomp. Enhance the seccomp bpf test suite to check that uretprobes can be attached to processes without the killing the process regardless of seccomp policy. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20250202162921.335813-3-eyal.birger@gmail.com [kees: Skip archs without __NR_uretprobe] Signed-off-by: Kees Cook <kees@kernel.org>