linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-08-15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. Conflicts: Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml c25504a0ba36 ("dt-bindings: net: fsl,qoriq-mc-dpmac: add missed property phys") be034ee6c33d ("dt-bindings: net: fsl,qoriq-mc-dpmac: using unevaluatedProperties") https://lore.kernel.org/20240815110934.56ae623a@canb.auug.org.au drivers/net/dsa/vitesse-vsc73xx-core.c 5b9eebc2c7a5 ("net: dsa: vsc73xx: pass value in phy_write operation") fa63c6434b6f ("net: dsa: vsc73xx: check busy flag in MDIO operations") 2524d6c28bdc ("net: dsa: vsc73xx: use defined values in phy operations") https://lore.kernel.org/20240813104039.429b9fe6@canb.auug.org.au Resolve by using FIELD_PREP(), Stephen's resolution is simpler. Adjacent changes: net/vmw_vsock/af_vsock.c 69139d2919dd ("vsock: fix recursive ->recvmsg calls") 744500d81f81 ("vsock: add support for SIOCOUTQ ioctl") Link: https://patch.msgid.link/20240815141149.33862-1-pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-15	Merge tag 'perf-tools-fixes-for-v6.11-2024-08-15' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "The usual header file sync-ups and one more build fix: - Add README file to explain why we copy the headers - Sync UAPI and other header files with kernel source - Fix build on MIPS 32-bit" * tag 'perf-tools-fixes-for-v6.11-2024-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf daemon: Fix the build on 32-bit architectures tools/include: Sync arm64 headers with the kernel sources tools/include: Sync x86 headers with the kernel sources tools/include: Sync filesystem headers with the kernel sources tools/include: Sync network socket headers with the kernel sources tools/include: Sync uapi/asm-generic/unistd.h with the kernel sources tools/include: Sync uapi/sound/asound.h with the kernel sources tools/include: Sync uapi/linux/perf.h with the kernel sources tools/include: Sync uapi/linux/kvm.h with the kernel sources tools/include: Sync uapi/drm/i915_drm.h with the kernel sources perf tools: Add tools/include/uapi/README
2024-08-15	libbpf: Workaround -Wmaybe-uninitialized false positive	Sam James
	In `elf_close`, we get this with GCC 15 -O3 (at least): ``` In function ‘elf_close’, inlined from ‘elf_close’ at elf.c:53:6, inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2: elf.c:57:9: warning: ‘elf_fd.elf’ may be used uninitialized [-Wmaybe-uninitialized] 57 \| elf_end(elf_fd->elf); \| ^~~~~~~~~~~~~~~~~~~~ elf.c: In function ‘elf_find_func_offset_from_file’: elf.c:377:23: note: ‘elf_fd.elf’ was declared here 377 \| struct elf_fd elf_fd; \| ^~~~~~ In function ‘elf_close’, inlined from ‘elf_close’ at elf.c:53:6, inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2: elf.c:58:9: warning: ‘elf_fd.fd’ may be used uninitialized [-Wmaybe-uninitialized] 58 \| close(elf_fd->fd); \| ^~~~~~~~~~~~~~~~~ elf.c: In function ‘elf_find_func_offset_from_file’: elf.c:377:23: note: ‘elf_fd.fd’ was declared here 377 \| struct elf_fd elf_fd; \| ^~~~~~ ``` In reality, our use is fine, it's just that GCC doesn't model errno here (see linked GCC bug). Suppress -Wmaybe-uninitialized accordingly by initializing elf_fd.fd to -1 and elf_fd.elf to NULL. I've done this in two other functions as well given it could easily occur there too (same access/use pattern). Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://gcc.gnu.org/PR114952 Link: https://lore.kernel.org/bpf/14ec488a1cac02794c2fa2b83ae0cef1bce2cb36.1723578546.git.sam@gentoo.org
2024-08-15	tools build: Provide consistent build options for fixdep	Alexander Gordeev
	The fixdep binary is being compiled and linked in one step. While the host linker flags are passed to the compiler the host compiler flags are missed. That leads to build errors at least on x86_64, arm64 and s390 as result of the compiler vs linker flags inconsistency. For example, during RPM package build redhat-hardened-ld script is provided to gcc, while redhat-hardened-cc1 script is missed. Provide both KBUILD_HOSTCFLAGS and KBUILD_HOSTLDFLAGS to avoid that. Fixes: ea974028a049f2ce ("tools build: Avoid circular .fixdep-in.o.cmd issues") Closes: https://lore.kernel.org/lkml/99ae0d34-ed76-4ca0-a9fd-c337da33c9f9@leemhuis.info/ Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240815072046.1002837-1-agordeev@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-15	selftests/bpf: Monitor traffic for select_reuseport.	Kui-Feng Lee
	Enable traffic monitoring for the subtests of select_reuseport. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-7-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	selftests/bpf: Monitor traffic for sockmap_listen.	Kui-Feng Lee
	Enable traffic monitor for each subtest of sockmap_listen. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-6-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	selftests/bpf: Monitor traffic for tc_redirect.	Kui-Feng Lee
	Enable traffic monitoring for the test case tc_redirect. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-5-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	selftests/bpf: netns_new() and netns_free() helpers.	Kui-Feng Lee
	netns_new()/netns_free() create/delete network namespaces. They support the option '-m' of test_progs to start/stop traffic monitor for the network namespace being created for matched tests. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-4-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	selftests/bpf: Add the traffic monitor option to test_progs.	Kui-Feng Lee
	Add option '-m' to test_progs to accept names and patterns of test cases. This option will be used later to enable traffic monitor that capture network packets generated by test cases. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-3-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	selftests/bpf: Add traffic monitor functions.	Kui-Feng Lee
	Add functions that capture packets and print log in the background. They are supposed to be used for debugging flaky network test cases. A monitored test case should call traffic_monitor_start() to start a thread to capture packets in the background for a given namespace and call traffic_monitor_stop() to stop capturing. (Or, option '-m' implemented by the later patches.) lo In IPv4 127.0.0.1:40265 > 127.0.0.1:55907: TCP, length 68, SYN lo In IPv4 127.0.0.1:55907 > 127.0.0.1:40265: TCP, length 60, SYN, ACK lo In IPv4 127.0.0.1:40265 > 127.0.0.1:55907: TCP, length 60, ACK lo In IPv4 127.0.0.1:55907 > 127.0.0.1:40265: TCP, length 52, ACK lo In IPv4 127.0.0.1:40265 > 127.0.0.1:55907: TCP, length 52, FIN, ACK lo In IPv4 127.0.0.1:55907 > 127.0.0.1:40265: TCP, length 52, RST, ACK Packet file: packets-2173-86-select_reuseport:sockhash_IPv4_TCP_LOOPBACK_test_detach_bpf-test.log #280/87 select_reuseport/sockhash IPv4/TCP LOOPBACK test_detach_bpf:OK The above is the output of an example. It shows the packets of a connection and the name of the file that contains captured packets in the directory /tmp/tmon_pcap. The file can be loaded by tcpdump or wireshark. This feature only works if libpcap is available. (Could be found by pkg-config) Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240815053254.470944-2-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-15	Merge tag 'net-6.11-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from wireless and netfilter Current release - regressions: - udp: fall back to software USO if IPv6 extension headers are present - wifi: iwlwifi: correctly lookup DMA address in SG table Current release - new code bugs: - eth: mlx5e: fix queue stats access to non-existing channels splat Previous releases - regressions: - eth: mlx5e: take state lock during tx timeout reporter - eth: mlxbf_gige: disable RX filters until RX path initialized - eth: igc: fix reset adapter logics when tx mode change Previous releases - always broken: - tcp: update window clamping condition - netfilter: - nf_queue: drop packets with cloned unconfirmed conntracks - nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests - vsock: fix recursive ->recvmsg calls - dsa: vsc73xx: fix MDIO bus access and PHY opera - eth: gtp: pull network headers in gtp_dev_xmit() - eth: igc: fix packet still tx after gate close by reducing i226 MAC retry buffer - eth: mana: fix RX buf alloc_size alignment and atomic op panic - eth: hns3: fix a deadlock problem when config TC during resetting" * tag 'net-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits) net: hns3: use correct release function during uninitialization net: hns3: void array out of bound when loop tnl_num net: hns3: fix a deadlock problem when config TC during resetting net: hns3: use the user's cfg after reset net: hns3: fix wrong use of semaphore up selftests: net: lib: kill PIDs before del netns pse-core: Conditionally set current limit during PI regulator registration net: thunder_bgx: Fix netdev structure allocation net: ethtool: Allow write mechanism of LPL and both LPL and EPL vsock: fix recursive ->recvmsg calls selftest: af_unix: Fix kselftest compilation warnings netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests netfilter: nf_tables: Introduce nf_tables_getobj_single netfilter: nf_tables: Audit log dump reset after the fact selftests: netfilter: add test for br_netfilter+conntrack+queue combination netfilter: nf_queue: drop packets with cloned unconfirmed conntracks netfilter: flowtable: initialise extack before use netfilter: nfnetlink: Initialise extack before use in ACKs netfilter: allow ipv6 fragments to arrive on different devices tcp: Update window clamping condition ...
2024-08-15	perf hist: Update hist symbol when updating maps	Matt Fleming
	AddressSanitizer found a use-after-free bug in the symbol code which manifested as 'perf top' segfaulting. ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80 READ of size 1 at 0x60b00c48844b thread T193 #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310 #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286 #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614 #3 0x5650d804568f in __hists__add_entry util/hist.c:754 #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772 #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997 #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242 #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845 #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208 #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245 #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324 #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120 #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442 #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 When updating hist maps it's also necessary to update the hist symbol reference because the old one gets freed in map__put(). While this bug was probably introduced with 5c24b67aae72f54c ("perf tools: Replace map->referenced & maps->removed_maps with map->refcnt"), the symbol objects were leaked until c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so the bug was masked. Fixes: c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL") Reported-by: Yunzhao Li <yunzhao@cloudflare.com> Signed-off-by: Matt Fleming (Cloudflare) <matt@readmodwrite.com> Cc: Ian Rogers <irogers@google.com> Cc: kernel-team@cloudflare.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org # v5.13+ Link: https://lore.kernel.org/r/20240815142212.3834625-1-matt@readmodwrite.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-15	Merge tag 'nf-24-08-15' of ↵	Paolo Abeni
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Ignores ifindex for types other than mcast/linklocal in ipv6 frag reasm, from Tom Hughes. 2) Initialize extack for begin/end netlink message marker in batch, from Donald Hunter. 3) Initialize extack for flowtable offload support, also from Donald. 4) Dropped packets with cloned unconfirmed conntracks in nfqueue, later it should be possible to explore lookup after reinject but Florian prefers this approach at this stage. From Florian Westphal. 5) Add selftest for cloned unconfirmed conntracks in nfqueue for previous update. 6) Audit after filling netlink header successfully in object dump, from Phil Sutter. 7-8) Fix concurrent dump and reset which could result in underflow counter / quota objects. netfilter pull request 24-08-15 * tag 'nf-24-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests netfilter: nf_tables: Introduce nf_tables_getobj_single netfilter: nf_tables: Audit log dump reset after the fact selftests: netfilter: add test for br_netfilter+conntrack+queue combination netfilter: nf_queue: drop packets with cloned unconfirmed conntracks netfilter: flowtable: initialise extack before use netfilter: nfnetlink: Initialise extack before use in ACKs netfilter: allow ipv6 fragments to arrive on different devices ==================== Link: https://patch.msgid.link/20240814222042.150590-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-15	selftests: net: lib: kill PIDs before del netns	Matthieu Baerts (NGI0)
	When deleting netns, it is possible to still have some tasks running, e.g. background tasks like tcpdump running in the background, not stopped because the test has been interrupted. Before deleting the netns, it is then safer to kill all attached PIDs, if any. That should reduce some noises after the end of some tests, and help with the debugging of some issues. That's why this modification is seen as a "fix". Fixes: 25ae948b4478 ("selftests/net: add lib.sh") Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20240813-upstream-net-20240813-selftests-net-lib-kill-v1-1-27b689b248b8@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-14	selftest: af_unix: Fix kselftest compilation warnings	Abhinav Jain
	Change expected_buf from (const void ) to (const char ) in function __recvpair(). This change fixes the below warnings during test compilation: ``` In file included from msg_oob.c:14: msg_oob.c: In function ‘__recvpair’: ../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument of type ‘char ’,but argument 6 has type ‘const void ’ [-Wformat=] ../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’ msg_oob.c:235:17: note: in expansion of macro ‘TH_LOG’ ../../kselftest_harness.h:106:40: warning: format ‘%s’ expects argument of type ‘char ’,but argument 6 has type ‘const void ’ [-Wformat=] ../../kselftest_harness.h:101:17: note: in expansion of macro ‘__TH_LOG’ msg_oob.c:259:25: note: in expansion of macro ‘TH_LOG’ ``` Fixes: d098d77232c3 ("selftest: af_unix: Add msg_oob.c.") Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20240814080743.1156166-1-jain.abhinav177@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14	selftests/bpf: convert test_skb_cgroup_id_user to test_progs	Alexis Lothoré (eBPF Foundation)
	test_skb_cgroup_id_user allows testing skb cgroup id retrieval at different levels, but is not integrated in test_progs, so it is not run automatically in CI. The test overlaps a bit with cgroup_skb_sk_lookup_kern, which is integrated in test_progs and test extensively skb cgroup helpers, but there is still one major difference between the two tests which justifies the conversion: cgroup_skb_sk_lookup_kern deals with a BPF_PROG_TYPE_CGROUP_SKB (attached on a cgroup), while test_skb_cgroup_id_user deals with a BPF_PROG_TYPE_SCHED_CLS (attached on a qdisc) Convert test_skb_cgroup_id_user into test_progs framework in order to run it automatically in CI. The main differences with the original test are the following: - rename the test to make it shorter and more straightforward regarding tested feature - the wrapping shell script has been dropped since every setup step is now handled in the main C test file - the test has been renamed for a shorter name and reflecting the tested API - add dedicated assert log per level to ease test failure debugging - use global variables instead of maps to access bpf prog data Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20240813-convert_cgroup_tests-v4-4-a33c03458cf6@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-14	selftests/bpf: add proper section name to bpf prog and rename it	Alexis Lothoré (eBPF Foundation)
	test_skb_cgroup_id_kern.c is currently involved in a manual test. In its current form, it can not be used with the auto-generated skeleton APIs, because the section name is not valid to allow libbpf to deduce the program type. Update section name to allow skeleton APIs usage. Also rename the program name to make it shorter and more straighforward regarding the API it is testing. While doing so, make sure that test_skb_cgroup_id.sh passes to get a working reference before converting it to test_progs - update the obj name - fix loading issue (verifier rejecting the program when loaded through tc, because of map not found), by preloading the whole obj with bpftool Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20240813-convert_cgroup_tests-v4-3-a33c03458cf6@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-14	selftests/bpf: convert test_cgroup_storage to test_progs	Alexis Lothoré (eBPF Foundation)
	test_cgroup_storage is currently a standalone program which is not run when executing test_progs. Convert it to the test_progs framework so it can be automatically executed in CI. The conversion led to the following changes: - converted the raw bpf program in the userspace test file into a dedicated test program in progs/ dir - reduced the scope of cgroup_storage test: the content from this test overlaps with some other tests already present in test_progs, most notably netcnt and cgroup_storage_multi*. Those tests already check extensively local storage, per-cpu local storage, cgroups interaction, etc. So the new test only keep the part testing that the program return code (based on map content) properly leads to packet being passed or dropped. Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20240813-convert_cgroup_tests-v4-2-a33c03458cf6@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-14	selftests/bpf: convert get_current_cgroup_id_user to test_progs	Alexis Lothoré (eBPF Foundation)
	get_current_cgroup_id_user allows testing for bpf_get_current_cgroup_id() bpf API but is not integrated into test_progs, and so is not tested automatically in CI. Convert it to the test_progs framework to allow running it automatically. The most notable differences with the old test are the following: - the new test relies on autoattach instead of manually hooking/enabling the targeted tracepoint through perf_event, which reduces quite a lot the test code size - it also accesses bpf prog data through global variables instead of maps - sleep duration passed to nanosleep syscall has been reduced to its minimum to not impact overall CI duration (we only care about the syscall being properly triggered, not about the passed duration) Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20240813-convert_cgroup_tests-v4-1-a33c03458cf6@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-14	selftests: netfilter: add test for br_netfilter+conntrack+queue combination	Florian Westphal
	Trigger cloned skbs leaving softirq protection. This triggers splat without the preceeding change ("netfilter: nf_queue: drop packets with cloned unconfirmed conntracks"): WARNING: at net/netfilter/nf_conntrack_core.c:1198 __nf_conntrack_confirm.. because local delivery and forwarding will race for confirmation. Based on a reproducer script from Yi Chen. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-14	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "s390: - Fix failure to start guests with kvm.use_gisa=0 - Panic if (un)share fails to maintain security. ARM: - Use kvfree() for the kvmalloc'd nested MMUs array - Set of fixes to address warnings in W=1 builds - Make KVM depend on assembler support for ARMv8.4 - Fix for vgic-debug interface for VMs without LPIs - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest - Minor code / comment cleanups for configuring PAuth traps - Take kvm->arch.config_lock to prevent destruction / initialization race for a vCPU's CPUIF which may lead to a UAF x86: - Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX) - Fix smatch issues - Small cleanups - Make x2APIC ID 100% readonly - Fix typo in uapi constant Generic: - Use synchronize_srcu_expedited() on irqfd shutdown" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX) KVM: eventfd: Use synchronize_srcu_expedited() on shutdown KVM: selftests: Add a testcase to verify x2APIC is fully readonly KVM: x86: Make x2APIC ID 100% readonly KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) KVM: x86: hyper-v: Remove unused inline function kvm_hv_free_pa_page() KVM: SVM: Fix an error code in sev_gmem_post_populate() KVM: SVM: Fix uninitialized variable bug KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list KVM: arm64: Tidying up PAuth code in KVM KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain s390/uv: Panic for set and remove shared access UVC errors KVM: s390: fix validity interception issue when gisa is switched off docs: KVM: Fix register ID of SPSR_FIQ KVM: arm64: vgic: fix unexpected unlock sparse warnings KVM: arm64: fix kdoc warnings in W=1 builds KVM: arm64: fix override-init warnings in W=1 builds ...
2024-08-14	KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled	Yan Zhao
	Add a new user option to memslot_perf_test to allow testing memslot move with quirk KVM_X86_QUIRK_SLOT_ZAP_ALL disabled. Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Message-ID: <20240703021219.13939-1-yan.y.zhao@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14	KVM: selftests: Allow slot modification stress test with quirk disabled	Yan Zhao
	Add a new user option to memslot_modification_stress_test to allow testing with slot zap quirk KVM_X86_QUIRK_SLOT_ZAP_ALL disabled. Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Message-ID: <20240703021206.13923-1-yan.y.zhao@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14	KVM: selftests: Test slot move/delete with slot zap quirk enabled/disabled	Yan Zhao
	Update set_memory_region_test to make sure memslot move and deletion function correctly both when slot zap quirk KVM_X86_QUIRK_SLOT_ZAP_ALL is enabled and disabled. Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Message-ID: <20240703021119.13904-1-yan.y.zhao@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14	perf test record.sh: Raise limit of open file descriptors	Veronika Molnarova
	Subtest for system-wide record with '--threads=cpu' option fails due to a limit of open file descriptors on systems with 128 or more CPUs as the default limit is set to 1024. The number of open file descriptors should be slightly above nmb_eventsnmb_cpus + nmb_cpus(for perf.data.n) + 4nmb_cpus(for pipes), which equals 8nmb_cpus. Therefore, temporarily raise the limit to 16nmb_cpus for the test. Committer notes: Instead of disabling ShellCheck warnings all the uses of 'uname -n', i.e. those: In tests/shell/record.sh line 35: default_fd_limit=$(ulimit -Sn) ^-^ SC3045 (warning): In POSIX sh, ulimit -S is undefined. We can just switch from using '/bin/sh' to '/bin/bash' for this test, as bash _has_ 'ulimit -n', so ShellCheck will not emit that warning. There are dozens of 'perf test' shell tests that do just that, '/bin/bash' is a reasonable expectation for those tests. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Radostin Stoyanov <rstoyano@redhat.com> Link: https://lore.kernel.org/linux-perf-users/20240429085721.10122-1-vmolnaro@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf build: Fix up broken capstone feature detection fast path	Arnaldo Carvalho de Melo
	The capstone devel headers define 'struct bpf_insn' in a way that clashes with what is in the libbpf devel headers, so we so far need to avoid including both. This is happening on the tools/build/feature/test-all.c file, where we try building all the expected set of libraries to be normally available on a system: ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output In file included from test-bpf.c:3, from test-all.c:150: /home/acme/git/perf-tools-next/tools/include/uapi/linux/bpf.h:77:8: error: ‘bpf_insn’ defined as wrong kind of tag 77 \| struct bpf_insn { \| ^~~~~~~~ ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output When doing so there is a trick where we define main to be main_test_libcapstone, then include the individual tools/build/feture/test-libcapstone.c capability query test, and then we undef 'main' because we'll do it all over again with the next expected library to be tested (at this time 'lzma'). To complete this mechanism we need to, in test-all.c 'main' routine, to call main_test_libcapstone(), which isn't being done, so the effect of adding references to capstone in test-all.c are not achieved. The only thing that is happening is that test-all.c is failing to build and thus all the tests will have to be done individually, which nullifies the test-all.c single build speedup. So lets remove references to capstone from test-all.c to see if this makes it build again so that we get faster builds or go on fixing up whatever is preventing us to get that benefit. Nothing: after this fix we get a clean test-all.c build and get the build speedup back: ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all. test-all.bin test-all.d test-all.make.output ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/feature/test-all.bin linux-vdso.so.1 (0x00007f13277a1000) libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007f1326e00000) libm.so.6 => /lib64/libm.so.6 (0x00007f13274be000) libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1327496000) libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007f132746f000) libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f1326800000) libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f1327452000) libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f1327436000) liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f1327403000) libdw.so.1 => /lib64/libdw.so.1 (0x00007f1326d6f000) libz.so.1 => /lib64/libz.so.1 (0x00007f13273e2000) libelf.so.1 => /lib64/libelf.so.1 (0x00007f1326d53000) libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f13273d4000) libslang.so.2 => /lib64/libslang.so.2 (0x00007f1326400000) libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007f1326000000) libc.so.6 => /lib64/libc.so.6 (0x00007f1325e0f000) libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f1326741000) /lib64/ld-linux-x86-64.so.2 (0x00007f13277a3000) libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f1326d3f000) libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f1326d07000) ⬢[acme@toolbox perf-tools-next]$ And when having capstone-devel installed we get it detected and linked with perf, allowing us to benefit from the features that it enables: ⬢[acme@toolbox perf-tools-next]$ rpm -q capstone-devel capstone-devel-5.0.1-3.fc40.x86_64 ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/perf \| grep capstone libcapstone.so.5 => /lib64/libcapstone.so.5 (0x00007fe6a5c00000) ⬢[acme@toolbox perf-tools-next]$ /tmp/build/perf-tools-next/perf -vv \| grep cap libcapstone: [ on ] # HAVE_LIBCAPSTONE_SUPPORT ⬢[acme@toolbox perf-tools-next]$ Fixes: 8b767db3309595a2 ("perf: build: introduce the libcapstone") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zry0sepD5Ppa5YKP@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf test: Add new test cases for the branch counter feature	Kan Liang
	Enhance the test case for the branch counter feature. Now, the test verifies: - The new filter can be successfully applied on the supported platforms. - The counter value can be outputted via the perf report -D - The counter value and the abbr name can be outputted via the perf script (New) Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-10-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf script: Add branch counters	Kan Liang
	It's useful to print the branch counter information for each jump in the brstackinsn when it's available. Add a new field 'brcntr' to display the branch counter information. By default, the abbreviation will be used to indicate the branch counter. In the verbose mode, the real event name is shown. $ perf script -F +brstackinsn,+brcntr # Branch counter abbr list: # branch-instructions:ppp = A # branch-misses = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit) f3+31: 0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5] 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC 0000000000401766 insn: 8b 45 fc 0000000000401769 insn: 83 e0 01 000000000040176c insn: 85 c0 000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC 0000000000401776 insn: 83 45 fc 01 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC $ perf script -F +brstackinsn,+brcntr -v tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit) f3+31: 0000000000401774 insn: eb 04 br_cntr: branch-instructions:ppp 2 branch-misses 0 # PRED 5 cycles [5] 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [6] 2.00 IPC 0000000000401766 insn: 8b 45 fc 0000000000401769 insn: 83 e0 01 000000000040176c insn: 85 c0 000000000040176e insn: 74 06 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [7] 4.00 IPC 0000000000401776 insn: 83 45 fc 01 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 7 cycles [14] 0.43 IPC Originally-by: Tinghao Zhang <tinghao.zhang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-9-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf annotate: Display the branch counter histogram	Kan Liang
	Display the branch counter histogram in the annotation view. Press 'B' to display the branch counter's abbreviation list as well. Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }', 4000 Hz, Event count (approx.): f3 /home/sdp/test/tchain_edit [Percent: local period] Percent │ IPC Cycle Branch Counter (Average IPC: 1.39, IPC Coverage: 29.4%) │ 0000000000401755 <f3>: 0.00 0.00 │ endbr64 │ push %rbp │ mov %rsp,%rbp │ movl $0x0,-0x4(%rbp) 0.00 0.00 │1.33 3 \|A \|- \| ↓ jmp 25 11.03 11.03 │ 11: mov -0x4(%rbp),%eax │ and $0x1,%eax │ test %eax,%eax 17.13 17.13 │2.41 1 \|A \|- \| ↓ je 21 │ addl $0x1,-0x4(%rbp) 21.84 21.84 │2.22 2 \|AA \|- \| ↓ jmp 25 17.13 17.13 │ 21: addl $0x1,-0x4(%rbp) 21.84 21.84 │ 25: cmpl $0x270f,-0x4(%rbp) 11.03 11.03 │0.61 3 \|A \|- \| ↑ jle 11 │ nop │ pop %rbp 0.00 0.00 │0.24 20 \|AA \|B \| ← ret Originally-by: Tinghao Zhang <tinghao.zhang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-8-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf report: Display the branch counter histogram	Kan Liang
	Reusing the existing --total-cycles option to display the branch counters. Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display the logged branch counter events. They are shown right after all the cycle-related annotations. Extend the 'struct block_info' to store and pass the branch counter related information. The annotation_br_cntr_entry() is to print the histogram of each branch counter event. If the number of logged events is less than 4, the exact number of the abbr name is printed. Otherwise, using '+' to stands for more than 3 events. Assume the number of logged events is less than 4. The annotation_br_cntr_abbr_list() prints the branch counter's abbreviation list. Press 'B' to display the list in the TUI mode. $ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter $ perf report --total-cycles --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }' # Event count (approx.): 1610046 # # Branch counter abbr list: # branch-instructions:ppp = A # branch-misses = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter [Program Block Range] # ............... .............. ........... .......... .............. .................. # 57.55% 2.5M 0.00% 3 \|A \|- \| ... 25.27% 1.1M 0.00% 2 \|AA \|- \| ... 15.61% 667.2K 0.00% 1 \|A \|- \| ... 0.16% 6.9K 0.81% 575 \|A \|- \| ... 0.16% 6.8K 1.38% 977 \|AA \|- \| ... 0.16% 6.8K 0.04% 28 \|AA \|B \| ... 0.15% 6.6K 1.33% 946 \|A \|- \| ... 0.11% 4.5K 0.06% 46 \|AAA+\|- \| ... 0.10% 4.4K 0.88% 624 \|A \|- \| ... 0.09% 3.7K 0.74% 524 \|AAA+\|B \| ... With -v applied, # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter [Program Block Range] # ............... .............. ........... .......... .............. .................. # 57.55% 2.5M 0.00% 3 A=1 ,B=- ... 25.27% 1.1M 0.00% 2 A=2 ,B=- ... 15.61% 667.2K 0.00% 1 A=1 ,B=- ... 0.16% 6.9K 0.81% 575 A=1 ,B=- ... 0.16% 6.8K 1.38% 977 A=2 ,B=- ... 0.16% 6.8K 0.04% 28 A=2 ,B=1 ... 0.15% 6.6K 1.33% 946 A=1 ,B=- ... 0.11% 4.5K 0.06% 46 A=3+,B=- ... 0.10% 4.4K 0.88% 624 A=1 ,B=- ... 0.09% 3.7K 0.74% 524 A=3+,B=1 ... Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf evsel: Assign abbr name for the branch counter events	Kan Liang
	There could be several branch counter events. If perf tool output the result via the format "event name + a number", the line could be very long and hard to read. An abbreviation is introduced to replace the full event name in the display. The abbreviation starts from 'A' to 'Z9', which can support up to 286 events. The same abbreviation will be assigned if the same events are found in the evlist. The next patch will utilize the abbreviation name to show the branch counter events in the output. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-6-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf annotate: Save branch counters for each block	Kan Liang
	When annotating a basic block, it's useful to display the occurrences of other events in the block. The branch counter feature is only available for newer Intel platforms. So a dedicated option to display the branch counters is not introduced. Reuse the existing --total-cycles option, which triggers the annotation of a basic block and displays the cycle-related annotation. When the branch counters information is available, the branch counters are automatically appended after all the cycle-related annotation. Accounting the branch counters as well when accounting the cycles in hist__account_cycles(). In 'struct annotated_branch', introduce a br_cntr array to save the accumulation of each branch counter. In a sample, all the branch counters for a branch are saved in a u64 space. Because the saturation of a branch counter is small, e.g., for Intel Sierra Forest, the saturation is only 3. Add ANNOTATION__BR_CNTR_SATURATED_FLAG to indicate if a branch counter once saturated. That can be used to indicate a potential event lost because of the saturation. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-5-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf evlist: Save branch counters information	Kan Liang
	The branch counters logging (A.K.A LBR event logging) introduces a per-counter indication of precise event occurrences in LBRs. The kernel only dumps the number of occurrences into a record. The perf tool has to map the number to the corresponding event. Add evlist__update_br_cntr() to go through the evlist to pick the events that are configured to be logged. Assign a logical idx to track them, and add the total number of the events in the leader event. The total number will be used to allocate the space to save the branch counters for a block. The logical idx will be used to locate the corresponding event quickly in the following patches. It only needs to iterate the evlist once. The evsel__has_branch_counters() is also optimized. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-4-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf report: Remove the first overflow check for branch counters	Kan Liang
	A false overflow warning is triggered if a sample doesn't have any LBRs recorded and the branch counters feature is enabled. The current code does OVERFLOW_CHECK_u64() at the very beginning when reading the information of branch counters. It assumes that there is at least one LBR in the PEBS record. But it is a valid case that 0 LBR is recorded especially in a high context switch. Remove the OVERFLOW_CHECK_u64(). The later OVERFLOW_CHECK() should be good enough to check the overflow when reading the information of the branch counters. Fixes: 9fbb4b02302b0ae6 ("perf tools: Add branch counter knob") Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-3-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf report: Fix --total-cycles --stdio output error	Kan Liang
	The --total-cycles may output wrong information with the --stdio. For example: # perf record -e "{cycles,instructions}",cache-misses -b sleep 1 # perf report --total-cycles --stdio The total cycles output of {cycles,instructions} and cache-misses are almost the same. # Samples: 938 of events 'anon group { cycles, instructions }' # Event count (approx.): 938 # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] # ............... .............. ........... .......... ..................................................> # 11.19% 2.6K 0.10% 21 [perf_iterate_ctx+48 -> > 5.79% 1.4K 0.45% 97 [__intel_pmu_enable_all.constprop.0+80 -> __intel_> 5.11% 1.2K 0.33% 71 [native_write_msr+0 ->> # Samples: 293 of event 'cache-misses' # Event count (approx.): 293 # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] # ............... .............. ........... .......... ..................................................> # 11.19% 2.6K 0.13% 21 [perf_iterate_ctx+48 -> > 5.79% 1.4K 0.59% 97 [__intel_pmu_enable_all.constprop.0+80 -> __intel_> 5.11% 1.2K 0.43% 71 [native_write_msr+0 ->> With the symbol_conf.event_group, the 'perf report' should only report the block information of the leader event in a group. However, the current implementation retrieves the next event's block information, rather than the next group leader's block information. Make sure the index is updated even if the event is skipped. With the patch, # Samples: 293 of event 'cache-misses' # Event count (approx.): 293 # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] # ............... .............. ........... .......... ..................................................> # 37.98% 9.0K 4.05% 299 [perf_event_addr_filters_exec+0 -> perf_event_a> 11.19% 2.6K 0.28% 21 [perf_iterate_ctx+48 -> > 5.79% 1.4K 1.32% 97 [__intel_pmu_enable_all.constprop.0+80 -> __intel_> Fixes: 6f7164fa231a5f36 ("perf report: Sort by sampled cycles percent per block for stdio") Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-2-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf test annotate: Dump trapping test in trap handler	Ian Rogers
	Help to better identify the location of test failures but dumping the failing test in the trap handler. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240813040613.882075-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	perf disasm: Fix memory leak for locked operations	Ian Rogers
	lock__parse() calls disasm_line__parse() passing &ops->locked.ins.name that will use strdup() to populate it. Ensure ops->locked.ins.name is freed in lock__delete(). Found with address/leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240813040613.882075-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14	selftests/ftrace: Add required dependency for kprobe tests	Masami Hiramatsu (Google)
	kprobe_args_{char,string}.tc are using available_filter_functions file which is provided by function tracer. Thus if function tracer is disabled, these tests are failed on recent kernels because tracefs_create_dir is not raised events by adding a dynamic event. Add available_filter_functions to requires line. Fixes: 7c1130ea5cae ("test: ftrace: Fix kprobe test for eventfs") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-08-14	refscale: Add TINY scenario	Paul E. McKenney
	This commit adds a TINY scenario in order to support tests of Tiny RCU and Tiny SRCU. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-14	torture: Add torture.sh --guest-cpu-limit argument for limited hosts	Paul E. McKenney
	Some servers have limitations on the number of CPUs a given guest OS can use. In my earlier experience, such limitations have been at least half of the host's CPUs, but in a recent example, this limit is less than 40%. This commit therefore adds a --guest-cpu-limit argument that allows such low limits to be made known to torture.sh. Signed-off-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-13	selftests/bpf: Avoid subtraction after htons() in ipip tests	Asbjørn Sloth Tønnesen
	On little-endian systems, doing subtraction after htons() leads to interesting results: Given: MAGIC_BYTES = 123 = 0x007B aka. in big endian: 0x7B00 = 31488 sizeof(struct iphdr) = 20 Before this patch: __bpf_constant_htons(MAGIC_BYTES) - sizeof(struct iphdr) = 0x7AEC 0x7AEC = htons(0xEC7A) = htons(60538) So these were outer IP packets with a total length of 123 bytes, containing an inner IP packet with a total length of 60538 bytes. After this patch: __bpf_constant_htons(MAGIC_BYTES - sizeof(struct iphdr)) = htons(103) Now these packets are outer IP packets with a total length of 123 bytes, containing an inner IP packet with a total length of 103 bytes. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org> Link: https://lore.kernel.org/r/20240808075906.1849564-1-ast@fiberby.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-13	perf inject: Inject build ids for entire call chain	Ian Rogers
	The DSO build id is injected when the dso is first encountered but the checking for first encountered only looks at the sample->ip not the entire callchain. Use the callchain logic to ensure all build ids are inserted. Fixes: 454c407ec17a0c63 ("perf: add perf-inject builtin") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> Link: https://lore.kernel.org/r/20240812224119.744968-1-irogers@google.com [ Split from a larger patch that introduced the API and use it ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf callchain: Add a for_each callback style API	Ian Rogers
	Add a for_each callback style API to callchain with sample__for_each_callchain_node(). Possibly in the future such an API can avoid the overhead of constructing the call chain list. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <tzanussi@gmail.com> Link: https://lore.kernel.org/r/20240812224119.744968-1-irogers@google.com [ Split from a larger patch that introduced the API and use it ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf test: Add test for Intel TPEBS counting mode	Weilin Wang
	Intel TPEBS sampling mode is supported through perf record. The counting mode code uses perf record to capture retire_latency value and use it in metric calculation. This test checks the counting mode code on Intel platforms. Committer testing: root@x1:~# perf test tpebs 123: test Intel TPEBS counting mode : Ok root@x1:~# set -o vi root@x1:~# perf test tpebs 123: test Intel TPEBS counting mode : Ok root@x1:~# perf test -v tpebs 123: test Intel TPEBS counting mode : Ok root@x1:~# perf test -vvv tpebs 123: test Intel TPEBS counting mode: --- start --- test child forked, pid 16603 Testing without --record-tpebs Testing with --record-tpebs ---- end(0) ---- 123: test Intel TPEBS counting mode : Ok root@x1:~# Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-9-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents	Weilin Wang
	TPEBS (Timed PEBS(Precise Event-Based Sampling)) is a new feature Intel PMU from Granite Rapids microarchitecture. It will be used in new TMA (Top-Down Microarchitecture Analysis) releases. Add related introduction to documents while adding new code to support it in 'perf stat'. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-8-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf stat: Add command line option for enabling TPEBS recording	Weilin Wang
	With this command line option, TPEBS recording is turned off in 'perf stat' on default. It will only be turned on when this option is given in 'perf stat' command. Example with --record-tpebs: perf stat -M tma_split_loads -C1-4 --record-tpebs sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.044 MB - ] Performance counter stats for 'CPU(s) 1-4': 53,259,156,071 cpu_core/TOPDOWN.SLOTS/ # 1.6 % tma_split_loads (50.00%) 15,867,565,250 cpu_core/topdown-retiring/ (50.00%) 15,655,580,731 cpu_core/topdown-mem-bound/ (50.00%) 11,738,022,218 cpu_core/topdown-bad-spec/ (50.00%) 6,151,265,424 cpu_core/topdown-fe-bound/ (50.00%) 20,445,917,581 cpu_core/topdown-be-bound/ (50.00%) 6,925,098,013 cpu_core/L1D_PEND_MISS.PENDING/ (50.00%) 3,838,653,421 cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/ (50.00%) 4,797,059,783 cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/ (50.00%) 11,931,916,714 cpu_core/CPU_CLK_UNHALTED.THREAD/ (50.00%) 102,576,164 cpu_core/MEM_LOAD_COMPLETED.L1_MISS_ANY/ (50.00%) 64,071,854 cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/ (50.00%) 3 cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/R 1.003049679 seconds time elapsed Example without --record-tpebs: perf stat -M tma_contested_accesses -C1 sleep 1 Performance counter stats for 'CPU(s) 1': 50,203,891 cpu_core/TOPDOWN.SLOTS/ # 0.0 % tma_contested_accesses (63.60%) 10,040,777 cpu_core/topdown-retiring/ (63.60%) 6,890,729 cpu_core/topdown-mem-bound/ (63.60%) 2,756,463 cpu_core/topdown-bad-spec/ (63.60%) 10,828,288 cpu_core/topdown-fe-bound/ (63.60%) 28,350,432 cpu_core/topdown-be-bound/ (63.60%) 98 cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM/ (63.70%) 577,520 cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/ (54.62%) 313,339 cpu_core/MEMORY_ACTIVITY.STALLS_L3_MISS/ (54.62%) 14,155 cpu_core/MEM_LOAD_RETIRED.L1_MISS/ (45.54%) 0 cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD/ (36.30%) 8,468,077 cpu_core/CPU_CLK_UNHALTED.THREAD/ (45.38%) 198 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/ (45.38%) 8,324 cpu_core/MEM_LOAD_RETIRED.FB_HIT/ (45.38%) 3,388,031,520 TSC 23,226,785 cpu_core/CPU_CLK_UNHALTED.REF_TSC/ (54.46%) 80 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/ (54.46%) 0 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/R 0 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/R 1,006,816,667 ns duration_time 1.002537737 seconds time elapsed Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-7-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf vendor events intel: Add MTL metric JSON files	Weilin Wang
	Add MTL metric JSON file for TMA4.8. Some of the metrics' formulas use TPEBS retire_latency in MTL. This also includes lated E-Core TMA3.6 changes. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-6-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	perf stat: Fork and launch 'perf record' when 'perf stat' needs to get ↵	Weilin Wang
	retire latency value for a metric. When retire_latency value is used in a metric formula, evsel would fork a 'perf record' process with "-e" and "-W" options. 'perf record' will collect required retire_latency values in parallel while 'perf stat' is collecting counting values. At the point of time that 'perf stat' stops counting, evsel would stop 'perf record' by sending sigterm signal to 'perf record' process. Sampled data will be processed to get retire latency value. Another thread is required to synchronize between 'perf stat' and 'perf record' when we pass data through pipe. Retire_latency evsel is not opened for 'perf stat' so that there is no counter wasted on it. This commit includes code suggested by Namhyung to adjust reading size for groups that include retire_latency evsels. In current :R parsing implementation, the parser would recognize events with retire_latency modifier and insert them into the evlist like a normal event. Ideally, we need to avoid counting these events. In this commit, at the time when a retire_latency evsel is read, set the retire latency value processed from the sampled data to count value. This sampled retire latency value will be used for metric calculation and final event count print out. No special metric calculation and event print out code required for retire_latency events. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-4-weilin.wang@intel.com [ Squashed the 3rd and 4th commit in the series to keep it building patch by patch ] [ Constified the 'struct perf_tool' pointer in process_sample_event() ] [ Use perf_tool__init(&tool, false) to address a segfault I reported and Ian/Weilin diagnosed ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13	KVM: selftests: Add a testcase to verify x2APIC is fully readonly	Michal Luczaj
	Add a test to verify that userspace can't change a vCPU's x2APIC ID by abusing KVM_SET_LAPIC. KVM models the x2APIC ID (and x2APIC LDR) as readonly, and silently ignores userspace attempts to change the x2APIC ID for backwards compatibility. Signed-off-by: Michal Luczaj <mhal@rbox.co> [sean: write changelog, add to existing test] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240802202941.344889-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13	Merge tag 'kvmarm-fixes-6.11-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.11, round #1 - Use kvfree() for the kvmalloc'd nested MMUs array - Set of fixes to address warnings in W=1 builds - Make KVM depend on assembler support for ARMv8.4 - Fix for vgic-debug interface for VMs without LPIs - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest - Minor code / comment cleanups for configuring PAuth traps - Take kvm->arch.config_lock to prevent destruction / initialization race for a vCPU's CPUIF which may lead to a UAF