linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-11-25	selftests/bpf: Mix legacy (maps) and modern (vars) BPF in one test	Andrii Nakryiko
	Add selftest that combines two BPF programs within single BPF object file such that one of the programs is using global variables, but can be skipped at runtime on old kernels that don't support global data. Another BPF program is written with the goal to be runnable on very old kernels and only relies on explicitly accessed BPF maps. Such test, run against old kernels (e.g., libbpf CI will run it against 4.9 kernel that doesn't support global data), allows to test the approach and ensure that libbpf doesn't make unnecessary assumption about necessary kernel features. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211123200105.387855-2-andrii@kernel.org
2021-11-23	dccp/tcp: Remove an unused argument in inet_csk_listen_start().	Kuniyuki Iwashima
	The commit 1295e2cf3065 ("inet: minor optimization for backlog setting in listen(2)") added change so that sk_max_ack_backlog is initialised earlier in inet_dccp_listen() and inet_listen(). Since then, we no longer use backlog in inet_csk_listen_start(), so let's remove it. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23	selftests: add arp_ndisc_evict_nocarrier to Makefile	James Prestwood
	This was previously added in selftests but never added to the Makefile Signed-off-by: James Prestwood <prestwoj@gmail.com> Link: https://lore.kernel.org/r/20211122171806.3529401-1-prestwoj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-22	selftests/bpf: Fix trivial typo	Drew Fustini
	Fix trivial typo in comment from 'oveflow' to 'overflow'. Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Drew Fustini <dfustini@baylibre.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211122070528.837806-1-dfustini@baylibre.com
2021-11-22	selftests: net: fib_nexthops: add test for group refcount imbalance bug	Nikolay Aleksandrov
	The new selftest runs a sequence which causes circular refcount dependency between deleted objects which cannot be released and results in a netdevice refcount imbalance. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22	selftests/tc-testings: Be compatible with newer tc output	Li Zhijian
	old tc(iproute2-5.9.0) output: action order 1: bpf action.o:[action-ok] id 60 tag bcf7977d3b93787c jited default-action pipe newer tc(iproute2-5.14.0) output: action order 1: bpf action.o:[action-ok] id 64 name tag bcf7977d3b93787c jited default-action pipe It can fix below errors: # ok 260 f84a - Add cBPF action with invalid bytecode # not ok 261 e939 - Add eBPF action with valid object-file # Could not match regex pattern. Verify command output: # total acts 0 # # action order 1: bpf action.o:[action-ok] id 42 name tag bcf7977d3b93787c jited default-action pipe # index 667 ref 1 bind 0 Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22	selftests/tc-testing: match any qdisc type	Li Zhijian
	We should not always presume all kernels use pfifo_fast as the default qdisc. For example, a fq_codel qdisk could have below output: qdisc fq_codel 0: parent 1:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-20	selftests: mptcp: add tproxy test case	Florian Westphal
	No hard dependencies here, just skip if test environ lacks nft binary or the needed kernel config options. The test case spawns listener in ns2 but ns1 will connect to the ip address of ns4. policy routing + tproxy rule will redirect packets to ns2 instead of forward. v3: - update mptcp/config (Mat Martineau) - more verbose SKIP messages in mptcp_connect.sh Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19	libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags	Florent Revest
	bpf_program__set_extra_flags has just been introduced so we can still change it without breaking users. This new interface is a bit more flexible (for example if someone wants to clear a flag). Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org
2021-11-19	Merge tag 'gpio-fixes-for-v5.16-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix a coccicheck warning in gpio-virtio - fix gpio selftests build issues - fix a Kconfig issue in gpio-rockchip * tag 'gpio-fixes-for-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: rockchip: needs GENERIC_IRQ_CHIP to fix build errors selftests: gpio: restore CFLAGS options selftests: gpio: fix uninitialised variable warning selftests: gpio: fix gpio compiling error gpio: virtio: remove unneeded semicolon
2021-11-19	selftests/bpf: Add btf_dedup case with duplicated structs within CU	Jiri Olsa
	Add an artificial minimal example simulating compilers producing two different types within a single CU that correspond to identical struct definitions. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211117194114.347675-2-andrii@kernel.org
2021-11-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller
	Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Add selftest for vrf+conntrack, from Florian Westphal. 2) Extend nfqueue selftest to cover nfqueue, also from Florian. 3) Remove duplicated include in nft_payload, from Wan Jiabing. 4) Several improvements to the nat port shadowing selftest, from Phil Sutter. 5) Fix filtering of reply tuple in ctnetlink, from Florent Fourcot. 6) Do not override error with -EINVAL in filter setup path, also from Florent. 7) Honor sysctl_expire_nodest_conn regardless conn_reuse_mode for reused connections, from yangxingwu. 8) Replace snprintf() by sysfs_emit() in xt_IDLETIMER as reported by Coccinelle, from Jing Yao. 9) Incorrect IPv6 tunnel match in flowtable offload, from Will Mortensen. 10) Switch port shadow selftest to use socat, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-18	Merge tag 'net-5.16-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, mac80211. Current release - regressions: - devlink: don't throw an error if flash notification sent before devlink visible - page_pool: Revert "page_pool: disable dma mapping support...", turns out there are active arches who need it Current release - new code bugs: - amt: cancel delayed_work synchronously in amt_fini() Previous releases - regressions: - xsk: fix crash on double free in buffer pool - bpf: fix inner map state pruning regression causing program rejections - mac80211: drop check for DONT_REORDER in __ieee80211_select_queue, preventing mis-selecting the best effort queue - mac80211: do not access the IV when it was stripped - mac80211: fix radiotap header generation, off-by-one - nl80211: fix getting radio statistics in survey dump - e100: fix device suspend/resume Previous releases - always broken: - tcp: fix uninitialized access in skb frags array for Rx 0cp - bpf: fix toctou on read-only map's constant scalar tracking - bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs - tipc: only accept encrypted MSG_CRYPTO msgs - smc: transfer remaining wait queue entries during fallback, fix missing wake ups - udp: validate checksum in udp_read_sock() (when sockmap is used) - sched: act_mirred: drop dst for the direction from egress to ingress - virtio_net_hdr_to_skb: count transport header in UFO, prevent allowing bad skbs into the stack - nfc: reorder the logic in nfc_{un,}register_device, fix unregister - ipsec: check return value of ipv6_skip_exthdr - usb: r8152: add MAC passthrough support for more Lenovo Docks" * tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits) ptp: ocp: Fix a couple NULL vs IS_ERR() checks net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock() net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound ipv6: check return value of ipv6_skip_exthdr e100: fix device suspend/resume devlink: Don't throw an error if flash notification sent before devlink visible page_pool: Revert "page_pool: disable dma mapping support..." ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port() octeontx2-af: debugfs: don't corrupt user memory NFC: add NCI_UNREG flag to eliminate the race NFC: reorder the logic in nfc_{un,}register_device NFC: reorganize the functions in nci_request tipc: check for null after calling kmemdup i40e: Fix display error code in dmesg i40e: Fix creation of first queue by omitting it if is not power of two i40e: Fix warning message and call stack during rmmod i40e driver i40e: Fix ping is lost after configuring ADq on VF i40e: Fix changing previously set num_queue_pairs for PFs i40e: Fix NULL ptr dereference on VSI filter sync i40e: Fix correct max_pkt_size on VF RX queue ...
2021-11-18	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull KVM fixes from Paolo Bonzini: "Selftest changes: - Cleanups for the perf test infrastructure and mapping hugepages - Avoid contention on mmap_sem when the guests start to run - Add event channel upcall support to xen_shinfo_test x86 changes: - Fixes for Xen emulation - Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache - Fixes for migration of 32-bit nested guests on 64-bit hypervisor - Compilation fixes - More SEV cleanups Generic: - Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS and num_online_cpus(). Most architectures were only using one of the two" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus() KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus() KVM: x86: Assume a 64-bit hypercall for guests with protected state selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore riscv: kvm: fix non-kernel-doc comment block KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror() KVM: SEV: Drop a redundant setting of sev->asid during initialization KVM: SEV: WARN if SEV-ES is marked active but SEV is not KVM: SEV: Set sev_info.active after initial checks in sev_guest_init() KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache KVM: nVMX: Use a gfn_to_hva_cache for vmptrld KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check KVM: x86/xen: Use sizeof_field() instead of open-coding it KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12 KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO ...
2021-11-18	selfetests/bpf: Adapt vmtest.sh to s390 libbpf CI changes	Ilya Leoshkevich
	[1] added s390 support to libbpf CI and added an ${ARCH} prefix to a number of paths and identifiers in libbpf GitHub repo, which vmtest.sh relies upon. Update these and make use of the new s390 support. [1] https://github.com/libbpf/libbpf/pull/204 Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211118115225.1349726-1-iii@linux.ibm.com
2021-11-18	selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore	Arnaldo Carvalho de Melo
	$ git status nothing to commit, working tree clean $ $ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1 $ git status Untracked files: (use "git add <file>..." to include in what will be committed) tools/testing/selftests/kvm/x86_64/sev_migrate_tests nothing added to commit but untracked files present (use "git add" to track) $ Fixes: 6a58150859fdec76 ("selftest: KVM: Add intra host migration tests") Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Rientjes <rientjes@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Message-Id: <YZPIPfvYgRDCZi/w@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-17	ipv4/raw: support binding to nonlocal addresses	Riccardo Paolo Bestetti
	Add support to inet v4 raw sockets for binding to nonlocal addresses through the IP_FREEBIND and IP_TRANSPARENT socket options, as well as the ipv4.ip_nonlocal_bind kernel parameter. Add helper function to inet_sock.h to check for bind address validity on the base of the address type and whether nonlocal address are enabled for the socket via any of the sockopts/sysctl, deduplicating checks in ipv4/ping.c, ipv4/af_inet.c, ipv6/af_inet6.c (for mapped v4->v6 addresses), and ipv4/raw.c. Add test cases with IP[V6]_FREEBIND verifying that both v4 and v6 raw sockets support binding to nonlocal addresses after the change. Add necessary support for the test cases to nettest. Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211117090010.125393-1-pbl@bestov.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-17	selftests/bpf: Fix xdpxceiver failures for no hugepages	Tirthendu Sarkar
	xsk_configure_umem() needs hugepages to work in unaligned mode. So when hugepages are not configured, 'unaligned' tests should be skipped which is determined by the helper function hugepages_present(). This function erroneously returns true with MAP_NORESERVE flag even when no hugepages are configured. The removal of this flag fixes the issue. The test TEST_TYPE_UNALIGNED_INV_DESC also needs to be skipped when there are no hugepages. However, this was not skipped as there was no check for presence of hugepages and hence was failing. The check to skip the test has now been added. Fixes: a4ba98dd0c69 (selftests: xsk: Add test for unaligned mode) Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211117123613.22288-1-tirthendu.sarkar@intel.com
2021-11-16	selftests/bpf: Mark variable as static	Yucong Sun
	Fix warnings from checkstyle.pl Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112192535.898352-4-fallentree@fb.com
2021-11-16	selftests/bpf: Variable naming fix	Yucong Sun
	Change log_fd to log_fp to reflect its type correctly. Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112192535.898352-3-fallentree@fb.com
2021-11-16	selftests/bpf: Move summary line after the error logs	Yucong Sun
	Makes it easier to find the summary line when there is a lot of logs to scroll back. Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112192535.898352-2-fallentree@fb.com
2021-11-16	selftests: add a test case for mirred egress to ingress	Davide Caratti
	add a selftest that verifies the correct behavior of TC act_mirred egress to ingress: in particular, it checks if the dst_entry is removed from skb before redirect egress -> ingress. The correct behavior is: an ICMP 'echo request' generated by ping will be received and generate a reply the same way as the one generated by mausezahn. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Jakub Kicinski
	Daniel Borkmann says: ==================== pull-request: bpf 2021-11-16 We've added 12 non-merge commits during the last 5 day(s) which contain a total of 23 files changed, 573 insertions(+), 73 deletions(-). The main changes are: 1) Fix pruning regression where verifier went overly conservative rejecting previsouly accepted programs, from Alexei Starovoitov and Lorenz Bauer. 2) Fix verifier TOCTOU bug when using read-only map's values as constant scalars during verification, from Daniel Borkmann. 3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson. 4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker. 5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_() helpers in tracing programs due to deadlock possibilities, from Dmitrii Banshchikov. 6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang. 7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin. 8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: udp: Validate checksum in udp_read_sock() bpf: Fix toctou on read-only map's constant scalar tracking samples/bpf: Fix build error due to -isystem removal selftests/bpf: Add tests for restricted helpers bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs libbpf: Perform map fd cleanup for gen_loader in case of error samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu tools/runqslower: Fix cross-build samples/bpf: Fix summary per-sec stats in xdp_sample_user selftests/bpf: Check map in map pruning bpf: Fix inner map state pruning regression. xsk: Fix crash on double free in buffer pool ==================== Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge branch 'kvm-selftest' into kvm-master	Paolo Bonzini
	- Cleanups for the perf test infrastructure and mapping hugepages - Avoid contention on mmap_sem when the guests start to run - Add event channel upcall support to xen_shinfo_test
2021-11-16	selftests/bpf: Add uprobe triggering overhead benchmarks	Andrii Nakryiko
	Add benchmark to measure overhead of uprobes and uretprobes. Also have a baseline (no uprobe attached) benchmark. On my dev machine, baseline benchmark can trigger 130M user_target() invocations. When uprobe is attached, this falls to just 700K. With uretprobe, we get down to 520K: $ sudo ./bench trig-uprobe-base -a Summary: hits 131.289 ± 2.872M/s # UPROBE $ sudo ./bench -a trig-uprobe-without-nop Summary: hits 0.729 ± 0.007M/s $ sudo ./bench -a trig-uprobe-with-nop Summary: hits 1.798 ± 0.017M/s # URETPROBE $ sudo ./bench -a trig-uretprobe-without-nop Summary: hits 0.508 ± 0.012M/s $ sudo ./bench -a trig-uretprobe-with-nop Summary: hits 0.883 ± 0.008M/s So there is almost 2.5x performance difference between probing nop vs non-nop instruction for entry uprobe. And 1.7x difference for uretprobe. This means that non-nop uprobe overhead is around 1.4 microseconds for uprobe and 2 microseconds for non-nop uretprobe. For nop variants, uprobe and uretprobe overhead is down to 0.556 and 1.13 microseconds, respectively. For comparison, just doing a very low-overhead syscall (with no BPF programs attached anywhere) gives: $ sudo ./bench trig-base -a Summary: hits 4.830 ± 0.036M/s So uprobes are about 2.67x slower than pure context switch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
2021-11-16	selftests/bpf: Configure dir paths via env in test_bpftool_synctypes.py	Quentin Monnet
	Script test_bpftool_synctypes.py parses a number of files in the bpftool directory (or even elsewhere in the repo) to make sure that the list of types or options in those different files are consistent. Instead of having fixed paths, let's make the directories configurable through environment variable. This should make easier in the future to run the script in a different setup, for example on an out-of-tree bpftool mirror with a different layout. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211115225844.33943-4-quentin@isovalent.com
2021-11-16	bpftool: Update doc (use susbtitutions) and test_bpftool_synctypes.py	Quentin Monnet
	test_bpftool_synctypes.py helps detecting inconsistencies in bpftool between the different list of types and options scattered in the sources, the documentation, and the bash completion. For options that apply to all bpftool commands, the script had a hardcoded list of values, and would use them to check whether the man pages are up-to-date. When writing the script, it felt acceptable to have this list in order to avoid to open and parse bpftool's main.h every time, and because the list of global options in bpftool doesn't change so often. However, this is prone to omissions, and we recently added a new -l\|--legacy option which was described in common_options.rst, but not listed in the options summary of each manual page. The script did not complain, because it keeps comparing the hardcoded list to the (now) outdated list in the header file. To address the issue, this commit brings the following changes: - Options that are common to all bpftool commands (--json, --pretty, and --debug) are moved to a dedicated file, and used in the definition of a RST substitution. This substitution is used in the sources of all the man pages. - This list of common options is updated, with the addition of the new -l\|--legacy option. - The script test_bpftool_synctypes.py is updated to compare: - Options specific to a command, found in C files, for the interactive help messages, with the same specific options from the relevant man page for that command. - Common options, checked just once: the list in main.h is compared with the new list in substitutions.rst. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211115225844.33943-3-quentin@isovalent.com
2021-11-16	KVM: selftests: Use perf_test_destroy_vm in memslot_modification_stress_test	David Matlack
	Change memslot_modification_stress_test to use perf_test_destroy_vm instead of manually calling ucall_uninit and kvm_vm_free. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111001257.1446428-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Wait for all vCPU to be created before entering guest mode	David Matlack
	Thread creation requires taking the mmap_sem in write mode, which causes vCPU threads running in guest mode to block while they are populating memory. Fix this by waiting for all vCPU threads to be created and start running before entering guest mode on any one vCPU thread. This substantially improves the "Populate memory time" when using 1GiB pages since it allows all vCPUs to zero pages in parallel rather than blocking because a writer is waiting (which is waiting for another vCPU that is busy zeroing a 1GiB page). Before: $ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb ... Populate memory time: 52.811184013s After: $ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb ... Populate memory time: 10.204573342s Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111001257.1446428-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Move vCPU thread creation and joining to common helpers	David Matlack
	Move vCPU thread creation and joining to common helper functions. This is in preparation for the next commit which ensures that all vCPU threads are fully created before entering guest mode on any one vCPU. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111001257.1446428-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Start at iteration 0 instead of -1	David Matlack
	Start at iteration 0 instead of -1 to avoid having to initialize vcpu_last_completed_iteration when setting up vCPU threads. This simplifies the next commit where we move vCPU thread initialization out to a common helper. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111001257.1446428-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Sync perf_test_args to guest during VM creation	Sean Christopherson
	Copy perf_test_args to the guest during VM creation instead of relying on the caller to do so at their leisure. Ideally, tests wouldn't even be able to modify perf_test_args, i.e. they would have no motivation to do the sync, but enforcing that is arguably a net negative for readability. No functional change intended. [Set wr_fract=1 by default and add helper to override it since the new access_tracking_perf_test needs to set it dynamically.] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-13-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Fill per-vCPU struct during "perf_test" VM creation	Sean Christopherson
	Fill the per-vCPU args when creating the perf_test VM instead of having the caller do so. This helps ensure that any adjustments to the number of pages (and thus vcpu_memory_bytes) are reflected in the per-VM args. Automatically filling the per-vCPU args will also allow a future patch to do the sync to the guest during creation. Signed-off-by: Sean Christopherson <seanjc@google.com> [Updated access_tracking_perf_test as well.] Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-12-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Create VM with adjusted number of guest pages for perf tests	Sean Christopherson
	Use the already computed guest_num_pages when creating the so called extra VM pages for a perf test, and add a comment explaining why the pages are allocated as extra pages. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-11-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Remove perf_test_args.host_page_size	Sean Christopherson
	Remove perf_test_args.host_page_size and instead use getpagesize() so that it's somewhat obvious that, for tests that care about the host page size, they care about the system page size, not the hardware page size, e.g. that the logic is unchanged if hugepages are in play. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-10-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Move per-VM GPA into perf_test_args	Sean Christopherson
	Move the per-VM GPA into perf_test_args instead of storing it as a separate global variable. It's not obvious that guest_test_phys_mem holds a GPA, nor that it's connected/coupled with per_vcpu->gpa. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-9-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Use perf util's per-vCPU GPA/pages in demand paging test	Sean Christopherson
	Grab the per-vCPU GPA and number of pages from perf_util in the demand paging test instead of duplicating perf_util's calculations. Note, this may or may not result in a functional change. It's not clear that the test's calculations are guaranteed to yield the same value as perf_util, e.g. if guest_percpu_mem_size != vcpu_args->pages. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-8-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Capture per-vCPU GPA in perf_test_vcpu_args	Sean Christopherson
	Capture the per-vCPU GPA in perf_test_vcpu_args so that tests can get the GPA without having to calculate the GPA on their own. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Use shorthand local var to access struct perf_tests_args	Sean Christopherson
	Use 'pta' as a local pointer to the global perf_tests_args in order to shorten line lengths and make the code borderline readable. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-6-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Require GPA to be aligned when backed by hugepages	Sean Christopherson
	Assert that the GPA for a memslot backed by a hugepage is aligned to the hugepage size and fix perf_test_util accordingly. Lack of GPA alignment prevents KVM from backing the guest with hugepages, e.g. x86's write-protection of hugepages when dirty logging is activated is otherwise not exercised. Add a comment explaining that guest_page_size is for non-huge pages to try and avoid confusion about what it actually tracks. Cc: Ben Gardon <bgardon@google.com> Cc: Yanan Wang <wangyanan55@huawei.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> [Used get_backing_src_pagesz() to determine alignment dynamically.] Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Assert mmap HVA is aligned when using HugeTLB	Sean Christopherson
	Manually padding and aligning the mmap region is only needed when using THP. When using HugeTLB, mmap will always return an address aligned to the HugeTLB page size. Add a comment to clarify this and assert the mmap behavior for HugeTLB. [Removed requirement that HugeTLB mmaps must be padded per Yanan's feedback and added assertion that mmap returns aligned addresses when using HugeTLB.] Cc: Ben Gardon <bgardon@google.com> Cc: Yanan Wang <wangyanan55@huawei.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Expose align() helpers to tests	Sean Christopherson
	Refactor align() to work with non-pointers and split into separate helpers for aligning up vs. down. Add align_ptr_up() for use with pointers. Expose all helpers so that they can be used by tests and/or other utilities. The align_down() helper in particular will be used to ensure gpa alignment for hugepages. No functional change intended. [Added sepearate up/down helpers and replaced open-coded alignment bit math throughout the KVM selftests.] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Explicitly state indicies for vm_guest_mode_params array	Sean Christopherson
	Explicitly state the indices when populating vm_guest_mode_params to make it marginally easier to visualize what's going on. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> [Added indices for new guest modes.] Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	KVM: selftests: Add event channel upcall support to xen_shinfo_test	David Woodhouse
	When I first looked at this, there was no support for guest exception handling in the KVM selftests. In fact it was merged into 5.10 before the Xen support got merged in 5.11, and I could have used it from the start. Hook it up now, to exercise the Xen upcall delivery. I'm about to make things a bit more interesting by handling the full 2level event channel stuff in-kernel on top of the basic vector injection that we already have, and I'll want to build more tests on top. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211115165030.7422-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16	selftests/bpf: Add a dedup selftest with equivalent structure types	Yonghong Song
	Without previous libbpf patch, the following error will occur: $ ./test_progs -t btf ... do_test_dedup:FAIL:check btf_dedup failed errno:-22#13/205 btf/dedup: btf_type_tag #5, struct:FAIL And the previous libbpf patch fixed the issue. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211115163943.3922547-1-yhs@fb.com
2021-11-15	selftests/bpf: Add tests for restricted helpers	Dmitrii Banshchikov
	This patch adds tests that bpf_ktime_get_coarse_ns(), bpf_timer_* and bpf_spin_lock()/bpf_spin_unlock() helpers are forbidden in tracing progs as their use there may result in various locking issues. Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211113142227.566439-3-me@ubique.spb.ru
2021-11-15	selftests/sgx: Add test for multiple TCS entry	Reinette Chatre
	Each thread executing in an enclave is associated with a Thread Control Structure (TCS). The SGX test enclave contains two hardcoded TCS, thus supporting two threads in the enclave. Add a test to ensure it is possible to enter enclave at both entrypoints. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/7be151a57b4c7959a2364753b995e0006efa3da1.1636997631.git.reinette.chatre@intel.com
2021-11-15	selftests/sgx: Enable multiple thread support	Reinette Chatre
	Each thread executing in an enclave is associated with a Thread Control Structure (TCS). The test enclave contains two hardcoded TCS. Each TCS contains meta-data used by the hardware to save and restore thread specific information when entering/exiting the enclave. The two TCS structures within the test enclave share their SSA (State Save Area) resulting in the threads clobbering each other's data. Fix this by providing each TCS their own SSA area. Additionally, there is an 8K stack space and its address is computed from the enclave entry point which is correctly done for TCS #1 that starts on the first address inside the enclave but results in out of bounds memory when entering as TCS #2. Split 8K stack space into two separate pages with offset symbol between to ensure the current enclave entry calculation can continue to be used for both threads. While using the enclave with multiple threads requires these fixes the impact is not apparent because every test up to this point enters the enclave from the first TCS. More detail about the stack fix: ------------------------------- Before this change the test enclave (test_encl) looks as follows: .tcs (2 pages): (page 1) TCS #1 (page 2) TCS #2 .text (1 page) One page of code .data (5 pages) (page 1) encl_buffer (page 2) encl_buffer (page 3) SSA (page 4 and 5) STACK encl_stack: As shown above there is a symbol, encl_stack, that points to the end of the .data segment (pointing to the end of page 5 in .data) which is also the end of the enclave. The enclave entry code computes the stack address by adding encl_stack to the pointer to the TCS that entered the enclave. When entering at TCS #1 the stack is computed correctly but when entering at TCS #2 the stack pointer would point to one page beyond the end of the enclave and a #PF would result when TCS #2 attempts to enter the enclave. The fix involves moving the encl_stack symbol between the two stack pages. Doing so enables the stack address computation in the entry code to compute the correct stack address for each TCS. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/a49dc0d85401db788a0a3f0d795e848abf3b1f44.1636997631.git.reinette.chatre@intel.com
2021-11-15	selftests/sgx: Add page permission and exception test	Reinette Chatre
	The Enclave Page Cache Map (EPCM) is a secure structure used by the processor to track the contents of the enclave page cache. The EPCM contains permissions with which enclave pages can be accessed. SGX support allows EPCM and PTE page permissions to differ - as long as the PTE permissions do not exceed the EPCM permissions. Add a test that: (1) Creates an SGX enclave page with writable EPCM permission. (2) Changes the PTE permission on the page to read-only. This should be permitted because the permission does not exceed the EPCM permission. (3) Attempts a write to the page. This should generate a page fault (#PF) because of the read-only PTE even though the EPCM permissions allow the page to be written to. This introduces the first test of SGX exception handling. In this test the issue that caused the exception (PTE page permissions) can be fixed from outside the enclave and after doing so it is possible to re-enter enclave at original entrypoint with ERESUME. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/3bcc73a4b9fe8780bdb40571805e7ced59e01df7.1636997631.git.reinette.chatre@intel.com