linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-11-08	perf stat: Fix crash with --per-node --metric-only in CSV mode	Namhyung Kim
	The following command will get segfault due to missing aggr_header_csv for AGGR_NODE: $ sudo perf stat -a --per-node -x, --metric-only true Committer testing: Before this patch: # perf stat -a --per-node -x, --metric-only true Segmentation fault (core dumped) # After: # gdb perf -bash: gdb: command not found # perf stat -a --per-node -x, --metric-only true node,Ghz,frontend cycles idle,backend cycles idle,insn per cycle,branch-misses of all branches, N0,32,0.335,2.10,0.65,0.69,0.03,1.92, # Fixes: 86895b480a2f10c7 ("perf stat: Add --per-node agregation support") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221107213314.3239159-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-08	kselftest/arm64: fix array_size.cocci warning	Kang Minchul
	Use ARRAY_SIZE to fix the following coccicheck warnings: tools/testing/selftests/arm64/mte/check_buffer_fill.c:341:20-21: WARNING: Use ARRAY_SIZE tools/testing/selftests/arm64/mte/check_buffer_fill.c:35:20-21: WARNING: Use ARRAY_SIZE tools/testing/selftests/arm64/mte/check_buffer_fill.c:168:20-21: WARNING: Use ARRAY_SIZE tools/testing/selftests/arm64/mte/check_buffer_fill.c:72:20-21: WARNING: Use ARRAY_SIZE tools/testing/selftests/arm64/mte/check_buffer_fill.c:369:25-26: WARNING: Use ARRAY_SIZE Signed-off-by: Kang Minchul <tegongkang@gmail.com> Link: https://lore.kernel.org/r/20221105073143.78521-1-tegongkang@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08	kselftest/arm64: Print ASCII version of unknown signal frame magic values	Mark Brown
	The signal magic values are supposed to be allocated as somewhat meaningful ASCII so if we encounter a bad magic value print the any alphanumeric characters we find in it as well as the hex value to aid debuggability. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221102140543.98193-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08	kselftest/arm64: Remove validation of extra_context from TODO	Mark Brown
	When fixing up support for extra_context in the signal handling tests I didn't notice that there is a TODO file in the directory which lists this as a thing to be done. Since it's been done remove it from the list. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221027110324.33802-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08	kselftest/arm64: Provide progress messages when signalling children	Mark Brown
	Especially when the test is configured to run for a longer time it can be reassuring to users to see that the supervising program is running OK so provide a message every second when the output timer expires. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221017144553.773176-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08	kselftest/arm64: Check that all children are producing output in fp-stress	Mark Brown
	Currently we don't have an explicit check that when it's been a second since we have seen output produced from the test programs starting up that means all of them are running and we should start both sending signals and timing out. This is not reliable, especially on very heavily loaded systems where the test programs might take longer than a second to run. We do skip sending signals to children that have not produced output yet so we won't cause them to exit unexpectedly by sending a signal but this can create confusion when interpreting output, for example appearing to show the tests running for less time than expected or appearing to show missed signal deliveries. Avoid issues by explicitly checking that we have seen output from all the child processes before we start sending signals or counting test run time. This is especially likely on virtual platforms with large numbers of vector lengths supported since the platforms are slow and there will be a lot of tasks per CPU. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221017144553.773176-2-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08	testing: use the copyleft-next-0.3.1 SPDX tag	Luis Chamberlain
	Two selftests drivers exist under the copyleft-next license. These drivers were added prior to SPDX practice taking full swing in the kernel. Now that we have an SPDX tag for copyleft-next-0.3.1 documented, embrace it and remove the boiler plate. Cc: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Kuno Woudt <kuno@frob.nl> Cc: Richard Fontana <fontana@sharpeleven.org> Cc: copyleft-next@lists.fedorahosted.org Cc: Ciaran Farrell <Ciaran.Farrell@suse.com> Cc: Christopher De Nicolo <Christopher.DeNicolo@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Thorsten Leemhuis <linux@leemhuis.info> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-08	ACPICA: Add utcksum.o to the acpidump Makefile	Rafael J. Wysocki
	Commit 51aad1a6723b ("ACPICA: Finish support for the CDAT table") did not add utcksum.o to the acpidump Makefile by mistake. Do that now. Fixes: 51aad1a6723b ("ACPICA: Finish support for the CDAT table") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-08	memblock tests: remove completed TODO item	Rebecca Mckeever
	Remove completed item from TODO list. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/f2263abe45613b28f1583fbf04a4bffcf735bcf6.1667802195.git.remckee0@gmail.com
2022-11-08	memblock tests: add generic NUMA tests for memblock_alloc_exact_nid_raw	Rebecca Mckeever
	Add tests for memblock_alloc_exact_nid_raw() where the simulated physical memory is set up with multiple NUMA nodes. Additionally, all but one of these tests set nid != NUMA_NO_NODE. All tests are run for both top-down and bottom-up allocation directions. The tested scenarios are: Range unrestricted: - region cannot be allocated: + there are no previously reserved regions, but requested node is too small + the requested node is fully reserved + the requested node is partially reserved and does not have enough space + none of the nodes have enough memory to allocate the region Range restricted: - region can be allocated in the specific node requested without dropping min_addr: + the range fully overlaps with the node, and there are adjacent reserved regions - region cannot be allocated: + range partially overlaps with two different nodes, where the second node is the requested node + range overlaps with multiple nodes along node boundaries, and the requested node starts after max_addr + nid is set to NUMA_NO_NODE and the total range can fit the region, but the range is split between two nodes and everything else is reserved Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/51b14da46e6591428df3aefc5acc7dca9341a541.1667802195.git.remckee0@gmail.com
2022-11-08	memblock tests: add bottom-up NUMA tests for memblock_alloc_exact_nid_raw	Rebecca Mckeever
	Add tests for memblock_alloc_exact_nid_raw() where the simulated physical memory is set up with multiple NUMA nodes. Additionally, all of these tests set nid != NUMA_NO_NODE. These tests are run with a bottom-up allocation direction. The tested scenarios are: Range unrestricted: - region can be allocated in the specific node requested: + there are no previously reserved regions + the requested node is partially reserved but has enough space Range restricted: - region can be allocated in the specific node requested after dropping min_addr: + range partially overlaps with two different nodes, where the first node is the requested node + range partially overlaps with two different nodes, where the requested node ends before min_addr + range overlaps with multiple nodes along node boundaries, and the requested node ends before min_addr Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/935f0eed5e06fd44dc67d9f49b277923d7896bd3.1667802195.git.remckee0@gmail.com
2022-11-08	memblock tests: add top-down NUMA tests for memblock_alloc_exact_nid_raw	Rebecca Mckeever
	Add tests for memblock_alloc_exact_nid_raw() where the simulated physical memory is set up with multiple NUMA nodes. Additionally, all of these tests set nid != NUMA_NO_NODE. These tests are run with a top-down allocation direction. The tested scenarios are: Range unrestricted: - region can be allocated in the specific node requested: + there are no previously reserved regions + the requested node is partially reserved but has enough space Range restricted: - region can be allocated in the specific node requested after dropping min_addr: + range partially overlaps with two different nodes, where the first node is the requested node + range partially overlaps with two different nodes, where the requested node ends before min_addr + range overlaps with multiple nodes along node boundaries, and the requested node ends before min_addr Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/2cc0883243d68ddc3faf833d2d9e86f48534c1d7.1667802195.git.remckee0@gmail.com
2022-11-08	memblock tests: introduce range tests for memblock_alloc_exact_nid_raw	Rebecca Mckeever
	Add TEST_F_EXACT flag, which specifies that tests should run memblock_alloc_exact_nid_raw(). Introduce range tests for memblock_alloc_exact_nid_raw() by using the TEST_F_EXACT flag to run the range tests in alloc_nid_api.c, since memblock_alloc_exact_nid_raw() and memblock_alloc_try_nid_raw() behave the same way when nid = NUMA_NO_NODE. Rename tests and other functions in alloc_nid_api.c by removing "_try". Since the test names will be displayed in verbose output, they need to be general enough to refer to any of the memblock functions that the tests may run. Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Rebecca Mckeever <remckee0@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/5a4b6d1b6130ab7375314e1c45a6d5813dfdabbd.1667802195.git.remckee0@gmail.com
2022-11-07	selftests/bpf: Fix u32 variable compared with less than zero	Kang Minchul
	Variable ret is compared with less than zero even though it was set as u32. So u32 to int conversion is needed. Signed-off-by: Kang Minchul <tegongkang@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20221105183656.86077-1-tegongkang@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-11-07	perf trace: Add BPF augmenter to perf_event_open()'s 'struct ↵	Arnaldo Carvalho de Melo
	perf_event_attr' arg Using BPF for that, doing a cleverish reuse of perf_event_attr__fprintf(), that really needs to be turned into __snprintf(), etc. But since the plan is to go the BTF way probably use libbpf's btf_dump__dump_type_data(). Example: [root@quaco ~]# perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,perf_event_open --max-events 10 perf stat --quiet sleep 0.001 fg 0.000 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x1, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 0.067 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x3, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 0.120 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x4, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5 0.172 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x2, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 7 0.190 perf_event_open(attr_uptr: { size: 128, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 8 0.199 perf_event_open(attr_uptr: { size: 128, config: 0x1, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 9 0.204 perf_event_open(attr_uptr: { size: 128, config: 0x4, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 10 0.210 perf_event_open(attr_uptr: { size: 128, config: 0x5, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 11 [root@quaco ~]# Suggested-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Y2V2Tpu+2vzJyon2@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-06	Merge tag 'cxl-fixes-for-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "Several fixes for CXL region creation crashes, leaks and failures. This is mainly fallout from the original implementation of dynamic CXL region creation (instantiate new physical memory pools) that arrived in v6.0-rc1. Given the theme of "failures in the presence of pass-through decoders" this also includes new regression test infrastructure for that case. Summary: - Fix region creation crash with pass-through decoders - Fix region creation crash when no decoder allocation fails - Fix region creation crash when scanning regions to enforce the increasing physical address order constraint that CXL mandates - Fix a memory leak for cxl_pmem_region objects, track 1:N instead of 1:1 memory-device-to-region associations. - Fix a memory leak for cxl_region objects when regions with active targets are deleted - Fix assignment of NUMA nodes to CXL regions by CFMWS (CXL Window) emulated proximity domains. - Fix region creation failure for switch attached devices downstream of a single-port host-bridge - Fix false positive memory leak of cxl_region objects by recycling recently used region ids rather than freeing them - Add regression test infrastructure for a pass-through decoder configuration - Fix some mailbox payload handling corner cases" * tag 'cxl-fixes-for-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Recycle region ids cxl/region: Fix 'distance' calculation with passthrough ports tools/testing/cxl: Add a single-port host-bridge regression config tools/testing/cxl: Fix some error exits cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak cxl/region: Fix cxl_region leak, cleanup targets at region delete cxl/region: Fix region HPA ordering validation cxl/pmem: Use size_add() against integer overflow cxl/region: Fix decoder allocation crash ACPI: NUMA: Add CXL CFMWS 'nodes' to the possible nodes set cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA. cxl/region: Fix null pointer dereference due to pass through decoder commit cxl/mbox: Add a check on input payload size
2022-11-05	Merge tag 'iio-fixes-for-6.1b' of ↵	Greg Kroah-Hartman
	https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: "2nd set of IIO fixes for 6.1 Another mixed bag of driver fixes. * atmel,at91-sama5d2 - Drop a 5 degree offset as not needed for production devices. - Missing iio_trigger_free() in error path. * bosch,bma400 - Turn power on before trying to read chip ID. * bosch,bno055 - Avoid uninitialized variable warning (no actual impact) * meas,ms5611 - Fix multiple instances of driver sharing single prom array. - Stop forcing SPI speed to max devices supports * mps,mp2629 - Wrong structure field used to match channel. - Missing NULL terminator. * sysfs-trigger - Fix memory leak in error path. * tools - Fix wrong read size when calling with noevents." * tag 'iio-fixes-for-6.1b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: tools: iio: iio_generic_buffer: Fix read size iio: imu: bno055: uninitialized variable bug in bno055_trigger_handler() iio: adc: at91_adc: fix possible memory leak in at91_adc_allocate_trigger() iio: adc: mp2629: fix potential array out of bound access iio: adc: mp2629: fix wrong comparison of channel iio: pressure: ms5611: changed hardcoded SPI speed to value limited iio: pressure: ms5611: fixed value compensation bug iio: accel: bma400: Ensure VDDIO is enable defore reading the chip ID. iio: adc: at91-sama5d2_adc: get rid of 5 degrees Celsius adjustment iio: trigger: sysfs: fix possible memory leak in iio_sysfs_trig_init()
2022-11-05	objtool: Fix weak hole vs prefix symbol	Peter Zijlstra
	Boris (and the robot) reported that objtool grew a new complaint about unreachable instructions. Upon inspection it was immediately clear the __weak zombie instructions struck again. For the unweary, the linker will simply remove the symbol for overriden __weak symbols but leave the instructions in place, creating unreachable instructions -- and objtool likes to report these. Commit 4adb23686795 ("objtool: Ignore extra-symbol code") was supposed to have dealt with that, but the new commit 9f2899fe36a6 ("objtool: Add option to generate prefix symbols") subtly broke that logic by created unvisited symbols. Fixes: 9f2899fe36a6 ("objtool: Add option to generate prefix symbols") Reported-by: Borislav Petkov <bp@alien8.de> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2022-11-05	objtool: Optimize elf_dirty_reloc_sym()	Peter Zijlstra
	When moving a symbol in the symtab its index changes and any reloc referring that symtol-table-index will need to be rewritten too. In order to facilitate this, objtool simply marks the whole reloc section 'changed' which will cause the whole section to be re-generated. However, finding the relocs that use any given symbol is implemented rather crudely -- a fully iteration of all sections and their relocs. Given that some builds have over 20k sections (kallsyms etc..) iterating all that for each symbol moved takes a bit of time. Instead have each symbol keep a list of relocs that reference it. This vastly improves build times for certain configs. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net
2022-11-04	tools/testing/cxl: Add a single-port host-bridge regression config	Dan Williams
	Jonathan reports that region creation fails when a single-port host-bridge connects to a multi-port switch. Mock up that configuration so a fix can be tested and regression tested going forward. Reported-by: Bobo WL <lmw.bobo@gmail.com> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752184838.947915.2167957540894293891.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04	tools/testing/cxl: Fix some error exits	Dan Williams
	Fix a few typos where 'goto err_port' was used rather than the object specific cleanup. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752184255.947915.16163477849330181425.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04	selftests/bpf: Use consistent build-id type for liburandom_read.so	Artem Savkov
	lld produces "fast" style build-ids by default, which is inconsistent with ld's "sha1" style. Explicitly specify build-id style to be "sha1" when linking liburandom_read.so the same way it is already done for urandom_read. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20221104094016.102049-1-asavkov@redhat.com
2022-11-04	selftests/bpf: cgroup_helpers.c: Fix strncpy() fortify warning	Rong Tao
	Copy libbpf_strlcpy() from libbpf_internal.h to bpf_util.h, and rename it to bpf_strlcpy(), then replace selftests strncpy()/libbpf_strlcpy() with bpf_strlcpy(), fix compile warning. The libbpf_internal.h header cannot be used directly here, because references to cgroup_helpers.c in samples/bpf will generate compilation errors. We also can't add libbpf_strlcpy() directly to bpf_util.h, because the definition of libbpf_strlcpy() in libbpf_internal.h is duplicated. In order not to modify the libbpf code, add a new function bpf_strlcpy() to selftests bpf_util.h. How to reproduce this compilation warning: $ make -C samples/bpf cgroup_helpers.c: In function ‘__enable_controllers’: cgroup_helpers.c:80:17: warning: ‘strncpy’ specified bound 4097 equals destination size [-Wstringop-truncation] 80 \| strncpy(enable, controllers, sizeof(enable)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/tencent_469D8AF32BD56816A29981BED06E96D22506@qq.com
2022-11-04	Merge tag 'landlock-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fix from Mickaël Salaün: "Fix the test build for some distros" * tag 'landlock-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Build without static libraries
2022-11-04	selftests/bpf: Tests for enum fwd resolved as full enum64	Eduard Zingerman
	A set of test cases to verify enum fwd resolution logic: - verify that enum fwd can be resolved as full enum64; - verify that enum64 fwd can be resolved as full enum; - verify that enum size is considered when enums are compared for equivalence. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221101235413.1824260-2-eddyz87@gmail.com
2022-11-04	libbpf: Resolve enum fwd as full enum64 and vice versa	Eduard Zingerman
	Changes de-duplication logic for enums in the following way: - update btf_hash_enum to ignore size and kind fields to get ENUM and ENUM64 types in a same hash bucket; - update btf_compat_enum to consider enum fwd to be compatible with full enum64 (and vice versa); This allows BTF de-duplication in the following case: // CU #1 enum foo; struct s { enum foo a; } x; // CU #2 enum foo { x = 0xfffffffff // big enough to force enum64 }; struct s { enum foo a; } y; De-duplicated BTF prior to this commit: [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1 'x' val=68719476735ULL [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) [3] STRUCT 's' size=8 vlen=1 'a' type_id=4 bits_offset=0 [4] PTR '(anon)' type_id=1 [5] PTR '(anon)' type_id=3 [6] STRUCT 's' size=8 vlen=1 'a' type_id=8 bits_offset=0 [7] ENUM 'foo' encoding=UNSIGNED size=4 vlen=0 [8] PTR '(anon)' type_id=7 [9] PTR '(anon)' type_id=6 De-duplicated BTF after this commit: [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1 'x' val=68719476735ULL [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) [3] STRUCT 's' size=8 vlen=1 'a' type_id=4 bits_offset=0 [4] PTR '(anon)' type_id=1 [5] PTR '(anon)' type_id=3 Enum forward declarations in C do not provide information about enumeration values range. Thus the `btf_type->size` field is meaningless for forward enum declarations. In fact, GCC does not encode size in DWARF for forward enum declarations (but dwarves sets enumeration size to a default value of `sizeof(int) * 8` when size is not specified see dwarf_loader.c:die__create_new_enumeration). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221101235413.1824260-1-eddyz87@gmail.com
2022-11-04	selftests/bpf: make test_align selftest more robust	Andrii Nakryiko
	test_align selftest relies on BPF verifier log emitting register states for specific instructions in expected format. Unfortunately, BPF verifier precision backtracking log interferes with such expectations. And instruction on which precision propagation happens sometimes don't output full expected register states. This does indeed look like something to be improved in BPF verifier, but is beyond the scope of this patch set. So to make test_align a bit more robust, inject few dummy R4 = R5 instructions which capture desired state of R5 and won't have precision tracking logs on them. This fixes tests until we can improve BPF verifier output in the presence of precision tracking. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221104163649.121784-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-04	perf bpf: Rename perf_include_dir to libbpf_include_dir	Arnaldo Carvalho de Melo
	As this is where we expect to find bpf/bpf_helpers.h, etc. This needs more work to make it follow LIBBPF_DYNAMIC=1 usage, i.e. when not using the system libbpf it should use the headers in the in-kernel sources libbpf in tools/lib/bpf. We need to do that anyway to avoid this mixup system libbpf and in-kernel files, so we'll get this sorted out that way. And this also may become moot as we move to using BPF skels for this feature. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf examples bpf: Remove augmented_syscalls.c, the raw_syscalls one should ↵	Arnaldo Carvalho de Melo
	be used instead The attempt at using BPF to copy syscall pointer arguments to show them like strace does started with sys_{enter,exit}_SYSCALL_NAME tracepoints, in tools/perf/examples/bpf/augmented_syscalls.c, but then achieving this result using raw_syscalls:{enter,exit} and BPF tail calls was deemed more flexible. The 'perf trace' codebase was adapted to using it while trying to continue supporting the old style per-syscall tracepoints, which at some point became too unwieldly and now isn't working properly. So lets scale back and concentrate on the augmented_raw_syscalls.c model on the way to using BPF skeletons. For the same reason remove the etcsnoop.c example, that used the old style per-tp syscalls just for the 'open' and 'openat' syscalls, looking at the pathnames starting with "/etc/", we should be able to do this later using filters, after we move to BPF skels. The augmented_raw_syscalls.c one continues to work, now with libbpf 1.0, after Ian work on using the libbpf map style: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,open* --max-events 4 0.000 ping/194815 openat(dfd: CWD, filename: "/etc/hosts", flags: RDONLY\|CLOEXEC) = 5 20.225 systemd-oomd/972 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY\|CLOEXEC) = 12 20.285 abrt-dump-jour/1371 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY\|CLOEXEC\|NONBLOCK) = 21 20.301 abrt-dump-jour/1370 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY\|CLOEXEC\|NONBLOCK) = 21 # This is using this: # cat ~/.perfconfig [trace] show_zeros = yes show_duration = no no_inherit = yes args_alignment = 40 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf bpf: Remove now unused BPF headers	Ian Rogers
	Example code has migrated to use standard BPF header files, remove unnecessary perf equivalents. Update install step to not try to copy these. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221103045437.163510-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf trace: 5sec fix libbpf 1.0+ compatibility	Ian Rogers
	Avoid use of tools/perf/include/bpf/bpf.h and use the more regular BPF headers. Committer testing: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/5sec.c sleep 5 0.000 perf_bpf_probe:hrtimer_nanosleep(__probe_ip: -1474734416, rqtp: 5000000000) # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/5sec.c/max-stack=7/ sleep 5 0.000 perf_bpf_probe:hrtimer_nanosleep(__probe_ip: -1474734416, rqtp: 5000000000) hrtimer_nanosleep ([kernel.kallsyms]) common_nsleep ([kernel.kallsyms]) __x64_sys_clock_nanosleep ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __GI___clock_nanosleep (/usr/lib64/libc.so.6) [0] ([unknown]) # Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221103045437.163510-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf trace: empty fix libbpf 1.0+ compatibility	Ian Rogers
	Avoid use of tools/perf/include/bpf/bpf.h and use the more regular BPF headers. Add raw_syscalls:sys_enter to avoid the evlist being empty. Committer testing: # time perf trace -e ~acme/git/perf/tools/perf/examples/bpf/empty.c sleep 5 real 0m5.697s user 0m0.217s sys 0m0.453s # I.e. it sets up everything successfully (use -v to see the details) and filters out all syscalls, then exits when the workload (sleep 5) finishes. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221103045437.163510-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf trace: hello fix libbpf 1.0+ compatibility	Ian Rogers
	Don't use deprecated and now broken map style. Avoid use of tools/perf/include/bpf/bpf.h and use the more regular BPF headers. Switch to raw_syscalls:sys_enter to avoid the evlist being empty and fixing generating output. Committer testing: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/hello.c --call-graph=dwarf --max-events 5 0.000 perf/206852 __bpf_stdout__(Hello, world) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __GI___sched_setaffinity_new (/usr/lib64/libc.so.6) 8.561 pipewire/2290 __bpf_stdout__(Hello, world) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __libc_read (/usr/lib64/libc.so.6) 8.571 pipewire/2290 __bpf_stdout__(Hello, world) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __GI___ioctl (/usr/lib64/libc.so.6) 8.586 pipewire/2290 __bpf_stdout__(Hello, world) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __GI___write (/usr/lib64/libc.so.6) 8.592 pipewire/2290 __bpf_stdout__(Hello, world) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) syscall_trace_enter.constprop.0 ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __timerfd_settime (/usr/lib64/libc.so.6) # Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221103045437.163510-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility	Ian Rogers
	Don't use deprecated and now broken map style. Avoid use of tools/perf/include/bpf/bpf.h and use the more regular BPF headers. Committer notes: Add /usr/include to the include path so that bpf/bpf_helpers.h can be found, remove sys/socket.h, adding the sockaddr_storage definition, also remove stdbool.h, both were preventing building the augmented_raw_syscalls.c file with clang, revisit later. Testing it: Asking for syscalls that have string arguments: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,string --max-events 10 0.000 thermald/1144 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/energy_uj", flags: RDONLY) = 13 0.158 thermald/1144 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj", flags: RDONLY) = 13 0.215 thermald/1144 openat(dfd: CWD, filename: "/sys/class/thermal/thermal_zone3/temp", flags: RDONLY) = 13 16.448 cgroupify/36478 openat(dfd: 4, filename: ".", flags: RDONLY\|CLOEXEC\|DIRECTORY\|NONBLOCK) = 5 16.468 cgroupify/36478 newfstatat(dfd: 5, filename: "", statbuf: 0x7fffca5b4130, flag: 4096) = 0 16.473 systemd-oomd/972 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY\|CLOEXEC) = 12 16.499 systemd-oomd/972 newfstatat(dfd: 12, filename: "", statbuf: 0x7ffd2bc73cc0, flag: 4096) = 0 16.516 abrt-dump-jour/1370 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY\|CLOEXEC\|NONBLOCK) = 21 16.538 abrt-dump-jour/1370 newfstatat(dfd: 21, filename: "", statbuf: 0x7ffc651b8980, flag: 4096) = 0 16.540 abrt-dump-jour/1371 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY\|CLOEXEC\|NONBLOCK) = 21 # Networking syscalls: # perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,sendto,connect --max-events 10 0.000 isc-net-0005/1206 connect(fd: 512, uservaddr: { .family: INET, port: 53, addr: 23.211.132.65 }, addrlen: 16) = 0 0.070 isc-net-0002/1203 connect(fd: 515, uservaddr: { .family: INET6, port: 53, addr: 2600:1401:2::43 }, addrlen: 28) = -1 ENETUNREACH (Network is unreachable) 0.031 isc-net-0006/1207 connect(fd: 513, uservaddr: { .family: INET6, port: 53, addr: 2600:1401:2::43 }, addrlen: 28) = -1 ENETUNREACH (Network is unreachable) 0.079 isc-net-0006/1207 sendto(fd: 3, buff: 0x7f73a40611b0, len: 106, flags: NOSIGNAL, addr: { .family: UNSPEC }, addr_len: NULL) = 106 0.180 isc-net-0006/1207 connect(fd: 519, uservaddr: { .family: INET6, port: 53, addr: 2600:1401:1::43 }, addrlen: 28) = -1 ENETUNREACH (Network is unreachable) 0.211 isc-net-0006/1207 sendto(fd: 3, buff: 0x7f73a4061230, len: 106, flags: NOSIGNAL, addr: { .family: UNSPEC }, addr_len: NULL) = 106 0.298 isc-net-0006/1207 connect(fd: 515, uservaddr: { .family: INET, port: 53, addr: 96.7.49.67 }, addrlen: 16) = 0 0.109 isc-net-0004/1205 connect(fd: 518, uservaddr: { .family: INET6, port: 53, addr: 2600:1401:2::43 }, addrlen: 28) = -1 ENETUNREACH (Network is unreachable) 0.164 isc-net-0002/1203 sendto(fd: 3, buff: 0x7f73ac064300, len: 107, flags: NOSIGNAL, addr: { .family: UNSPEC }, addr_len: NULL) = 107 0.247 isc-net-0002/1203 connect(fd: 522, uservaddr: { .family: INET6, port: 53, addr: 2600:1401:1::43 }, addrlen: 28) = -1 ENETUNREACH (Network is unreachable) # Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221103045437.163510-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-04	selftests/net: give more time to udpgro bg processes to complete startup	Adrien Thierry
	In some conditions, background processes in udpgro don't have enough time to set up the sockets. When foreground processes start, this results in the test failing with "./udpgso_bench_tx: sendmsg: Connection refused". For instance, this happens from time to time on a Qualcomm SA8540P SoC running CentOS Stream 9. To fix this, increase the time given to background processes to complete the startup before foreground processes start. Signed-off-by: Adrien Thierry <athierry@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-03	selftests/bpf: support stat filtering in comparison mode in veristat	Andrii Nakryiko
	Finally add support for filtering stats values, similar to non-comparison mode filtering. For comparison mode 4 variants of stats are important for filtering, as they allow to filter either A or B side, but even more importantly they allow to filter based on value difference, and for verdict stat value difference is MATCH/MISMATCH classification. So with these changes it's finally possible to easily check if there were any mismatches between failure/success outcomes on two separate data sets. Like in an example below: $ ./veristat -e file,prog,verdict,insns -C ~/baseline-results.csv ~/shortest-results.csv -f verdict_diff=mismatch File Program Verdict (A) Verdict (B) Verdict (DIFF) Insns (A) Insns (B) Insns (DIFF) ------------------------------------- --------------------- ----------- ----------- -------------- --------- --------- ------------------- dynptr_success.bpf.linked1.o test_data_slice success failure MISMATCH 85 0 -85 (-100.00%) dynptr_success.bpf.linked1.o test_read_write success failure MISMATCH 1992 0 -1992 (-100.00%) dynptr_success.bpf.linked1.o test_ringbuf success failure MISMATCH 74 0 -74 (-100.00%) kprobe_multi.bpf.linked1.o test_kprobe failure success MISMATCH 0 246 +246 (+100.00%) kprobe_multi.bpf.linked1.o test_kprobe_manual failure success MISMATCH 0 246 +246 (+100.00%) kprobe_multi.bpf.linked1.o test_kretprobe failure success MISMATCH 0 248 +248 (+100.00%) kprobe_multi.bpf.linked1.o test_kretprobe_manual failure success MISMATCH 0 248 +248 (+100.00%) kprobe_multi.bpf.linked1.o trigger failure success MISMATCH 0 2 +2 (+100.00%) netcnt_prog.bpf.linked1.o bpf_nextcnt failure success MISMATCH 0 56 +56 (+100.00%) pyperf600_nounroll.bpf.linked1.o on_event success failure MISMATCH 568128 1000001 +431873 (+76.02%) ringbuf_bench.bpf.linked1.o bench_ringbuf success failure MISMATCH 8 0 -8 (-100.00%) strobemeta.bpf.linked1.o on_event success failure MISMATCH 557149 1000001 +442852 (+79.49%) strobemeta_nounroll1.bpf.linked1.o on_event success failure MISMATCH 57240 1000001 +942761 (+1647.03%) strobemeta_nounroll2.bpf.linked1.o on_event success failure MISMATCH 501725 1000001 +498276 (+99.31%) strobemeta_subprogs.bpf.linked1.o on_event success failure MISMATCH 65420 1000001 +934581 (+1428.59%) test_map_in_map_invalid.bpf.linked1.o xdp_noop0 success failure MISMATCH 2 0 -2 (-100.00%) test_mmap.bpf.linked1.o test_mmap success failure MISMATCH 46 0 -46 (-100.00%) test_verif_scale3.bpf.linked1.o balancer_ingress success failure MISMATCH 845499 1000001 +154502 (+18.27%) ------------------------------------- --------------------- ----------- ----------- -------------- --------- --------- ------------------- Note that by filtering on verdict_diff=mismatch, it's now extremely easy and fast to see any changes in verdict. Example above showcases both failure -> success transitions (which are generally surprising) and success -> failure transitions (which are expected if bugs are present). Given veristat allows to query relative percent difference values, internal logic for comparison mode is based on floating point numbers, so requires a bit of epsilon precision logic, deviating from typical integer simple handling rules. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: support stats ordering in comparison mode in veristat	Andrii Nakryiko
	Introduce the concept of "stat variant", by which it's possible to specify whether to use the value from A (baseline) side, B (comparison or control) side, the absolute difference value or relative (percentage) difference value. To support specifying this, veristat recognizes `_a`, `_b`, `_diff`, `_pct` suffixes, which can be appended to stat name(s). In non-comparison mode variants are ignored (there is only `_a` variant effectively), if no variant suffix is provided, `_b` is assumed, as control group is of primary interest in comparison mode. These stat variants can be flexibly combined with asc/desc orders. Here's an example of ordering results first by verdict match/mismatch (or n/a if one of the sides is missing; n/a is always considered to be the lowest value), and within each match/mismatch/n/a group further sort by number of instructions in B side. In this case we don't have MISMATCH cases, but N/A are split from MATCH, demonstrating this custom ordering. $ ./veristat -e file,prog,verdict,insns -s verdict_diff,insns_b_ -C ~/base.csv ~/comp.csv File Program Verdict (A) Verdict (B) Verdict (DIFF) Insns (A) Insns (B) Insns (DIFF) ------------------ ------------------------------ ----------- ----------- -------------- --------- --------- -------------- bpf_xdp.o tail_lb_ipv6 N/A success N/A N/A 151895 N/A bpf_xdp.o tail_nodeport_nat_egress_ipv4 N/A success N/A N/A 15619 N/A bpf_xdp.o tail_nodeport_ipv6_dsr N/A success N/A N/A 1206 N/A bpf_xdp.o tail_nodeport_ipv4_dsr N/A success N/A N/A 1162 N/A bpf_alignchecker.o tail_icmp6_send_echo_reply N/A failure N/A N/A 74 N/A bpf_alignchecker.o __send_drop_notify success N/A N/A 53 N/A N/A bpf_host.o __send_drop_notify success N/A N/A 53 N/A N/A bpf_host.o cil_from_host success N/A N/A 762 N/A N/A bpf_xdp.o tail_lb_ipv4 success success MATCH 71736 73430 +1694 (+2.36%) bpf_xdp.o tail_handle_nat_fwd_ipv4 success success MATCH 21547 20920 -627 (-2.91%) bpf_xdp.o tail_rev_nodeport_lb6 success success MATCH 17954 17905 -49 (-0.27%) bpf_xdp.o tail_handle_nat_fwd_ipv6 success success MATCH 16974 17039 +65 (+0.38%) bpf_xdp.o tail_nodeport_nat_ingress_ipv4 success success MATCH 7658 7713 +55 (+0.72%) bpf_xdp.o tail_rev_nodeport_lb4 success success MATCH 7126 6934 -192 (-2.69%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 success success MATCH 6405 6397 -8 (-0.12%) bpf_xdp.o tail_nodeport_nat_ipv6_egress failure failure MATCH 752 752 +0 (+0.00%) bpf_xdp.o cil_xdp_entry success success MATCH 423 423 +0 (+0.00%) bpf_xdp.o __send_drop_notify success success MATCH 151 151 +0 (+0.00%) bpf_alignchecker.o tail_icmp6_handle_ns failure failure MATCH 33 33 +0 (+0.00%) ------------------ ------------------------------ ----------- ----------- -------------- --------- --------- -------------- Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: handle missing records in comparison mode better in veristat	Andrii Nakryiko
	When comparing two datasets, if either side is missing corresponding record with the same file and prog name, currently veristat emits misleading zeros/failures, and even tried to calculate a difference, even though there is no data to compare against. This patch improves internal logic of handling such situations. Now we'll emit "N/A" in places where data is missing and comparison is non-sensical. As an example, in an artificially truncated and mismatched Cilium results, the output looks like below: $ ./veristat -e file,prog,verdict,insns -C ~/base.csv ~/comp.csv File Program Verdict (A) Verdict (B) Verdict (DIFF) Insns (A) Insns (B) Insns (DIFF) ------------------ ------------------------------ ----------- ----------- -------------- --------- --------- -------------- bpf_alignchecker.o __send_drop_notify success N/A N/A 53 N/A N/A bpf_alignchecker.o tail_icmp6_handle_ns failure failure MATCH 33 33 +0 (+0.00%) bpf_alignchecker.o tail_icmp6_send_echo_reply N/A failure N/A N/A 74 N/A bpf_host.o __send_drop_notify success N/A N/A 53 N/A N/A bpf_host.o cil_from_host success N/A N/A 762 N/A N/A bpf_xdp.o __send_drop_notify success success MATCH 151 151 +0 (+0.00%) bpf_xdp.o cil_xdp_entry success success MATCH 423 423 +0 (+0.00%) bpf_xdp.o tail_handle_nat_fwd_ipv4 success success MATCH 21547 20920 -627 (-2.91%) bpf_xdp.o tail_handle_nat_fwd_ipv6 success success MATCH 16974 17039 +65 (+0.38%) bpf_xdp.o tail_lb_ipv4 success success MATCH 71736 73430 +1694 (+2.36%) bpf_xdp.o tail_lb_ipv6 N/A success N/A N/A 151895 N/A bpf_xdp.o tail_nodeport_ipv4_dsr N/A success N/A N/A 1162 N/A bpf_xdp.o tail_nodeport_ipv6_dsr N/A success N/A N/A 1206 N/A bpf_xdp.o tail_nodeport_nat_egress_ipv4 N/A success N/A N/A 15619 N/A bpf_xdp.o tail_nodeport_nat_ingress_ipv4 success success MATCH 7658 7713 +55 (+0.72%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 success success MATCH 6405 6397 -8 (-0.12%) bpf_xdp.o tail_nodeport_nat_ipv6_egress failure failure MATCH 752 752 +0 (+0.00%) bpf_xdp.o tail_rev_nodeport_lb4 success success MATCH 7126 6934 -192 (-2.69%) bpf_xdp.o tail_rev_nodeport_lb6 success success MATCH 17954 17905 -49 (-0.27%) ------------------ ------------------------------ ----------- ----------- -------------- --------- --------- -------------- Internally veristat now separates joining two datasets and remembering the join, and actually emitting a comparison view. This will come handy when we add support for filtering and custom ordering in comparison mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: make veristat emit all stats in CSV mode by default	Andrii Nakryiko
	Make veristat distinguish between table and CSV output formats and use different default set of stats (columns) that are emitted. While for human-readable table output it doesn't make sense to output all known stats, it is very useful for CSV mode to record all possible data, so that it can later be queried and filtered in replay or comparison mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: support simple filtering of stats in veristat	Andrii Nakryiko
	Define simple expressions to filter not just by file and program name, but also by resulting values of collected stats. Support usual equality and inequality operators. Verdict, which is a boolean-like field can be also filtered either as 0/1, failure/success (with f/s, fail/succ, and failure/success aliases) symbols, or as false/true (f/t). Aliases are case insensitive. Currently this filtering is honored only in verification and replay modes. Comparison mode support will be added in next patch. Here's an example of verifying a bunch of BPF object files and emitting only results for successfully validated programs that have more than 100 total instructions processed by BPF verifier, sorted by number of instructions in ascending order: $ sudo ./veristat *.bpf.o -s insns^ -f 'insns>100' There can be many filters (both allow and deny flavors), all of them are combined. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: allow to define asc/desc ordering for sort specs in veristat	Andrii Nakryiko
	Allow to specify '^' at the end of stat name to designate that it should be sorted in ascending order. Similarly, allow any of 'v', 'V', '.', '!', or '_' suffix "symbols" to designate descending order. It's such a zoo for descending order because there is no single intuitive symbol that could be used (using 'v' looks pretty weird in practice), so few symbols that are "downwards leaning or pointing" were chosen. Either way, it shouldn't cause any troubles in practice. This new feature allows to customize sortering order to match user's needs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: ensure we always have non-ambiguous sorting in veristat	Andrii Nakryiko
	Always fall back to unique file/prog comparison if user's custom order specs are ambiguous. This ensures stable output no matter what. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: consolidate and improve file/prog filtering in veristat	Andrii Nakryiko
	Slightly change rules of specifying file/prog glob filters. In practice it's quite often inconvenient to do `/<prog-glob>` if that program glob is unique enough and won't accidentally match any file names. This patch changes the rules so that `-f <glob>` will apply specified glob to both file and program names. User still has all the control by doing '/<prog-only-glob>' or '<file-only-glob/*'. We also now allow '/<prog-glob>' and '<file-glob/' (all matching wildcard is assumed if missing). Also, internally unify file-only and file+prog checks (should_process_file and should_process_prog are now should_process_file_prog that can handle prog name as optional). This makes maintaining and extending this code easier. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: shorten "Total insns/states" column names in veristat	Andrii Nakryiko
	In comparison mode the "Total " part is pretty useless, but takes a considerable amount of horizontal space. Drop the "Total " parts. Also make sure that table headers for numerical columns are aligned in the same fashion as integer values in those columns. This looks better and is now more obvious with shorter "Insns" and "States" column headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests/bpf: add veristat replay mode	Andrii Nakryiko
	Replay mode allow to parse previously stored CSV file with verification results and present it in desired output (presumable human-readable table, but CSV to CSV convertion is supported as well). While doing that, it's possible to use veristat's sorting rules, specify subset of columns, and filter by file and program name. In subsequent patches veristat's filtering capabilities will just grow making replay mode even more useful in practice for post-processing results. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221103055304.2904589-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03	selftests: forwarding: Add MAC Authentication Bypass (MAB) test cases	Hans J. Schultz
	Add four test cases to verify MAB functionality: * Verify that a locked FDB entry can be generated by the bridge, preventing a host from communicating via the bridge. Test that user space can clear the "locked" flag by replacing the entry, thereby authenticating the host and allowing it to communicate via the bridge. * Test that an entry cannot roam to a locked port, but that it can roam to an unlocked port. * Test that MAB can only be enabled on a port that is both locked and has learning enabled. * Test that locked FDB entries are flushed from a port when MAB is disabled. Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03	Merge tag 'for-netdev' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== bpf 2022-11-04 We've added 8 non-merge commits during the last 3 day(s) which contain a total of 10 files changed, 113 insertions(+), 16 deletions(-). The main changes are: 1) Fix memory leak upon allocation failure in BPF verifier's stack state tracking, from Kees Cook. 2) Fix address leakage when BPF progs release reference to an object, from Youlin Li. 3) Fix BPF CI breakage from buggy in.h uapi header dependency, from Andrii Nakryiko. 4) Fix bpftool pin sub-command's argument parsing, from Pu Lehui. 5) Fix BPF sockmap lockdep warning by cancelling psock work outside of socket lock, from Cong Wang. 6) Follow-up for BPF sockmap to fix sk_forward_alloc accounting, from Wang Yufen. bpf-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add verifier test for release_reference() bpf: Fix wrong reg type conversion in release_reference() bpf, sock_map: Move cancel_work_sync() out of sock lock tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CI net/ipv4: Fix linux/in.h header dependencies bpftool: Fix NULL pointer dereference when pin {PROG, MAP, LINK} without FILE bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues bpf, verifier: Fix memory leak in array reallocation for stack state ==================== Link: https://lore.kernel.org/r/20221104000445.30761-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04	selftests/bpf: Add verifier test for release_reference()	Youlin Li
	Add a test case to ensure that released pointer registers will not be leaked into the map. Before fix: ./test_verifier 984 984/u reference tracking: try to leak released ptr reg FAIL Unexpected success to load! verification time 67 usec stack depth 4 processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 984/p reference tracking: try to leak released ptr reg OK Summary: 1 PASSED, 0 SKIPPED, 1 FAILED After fix: ./test_verifier 984 984/u reference tracking: try to leak released ptr reg OK 984/p reference tracking: try to leak released ptr reg OK Summary: 2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Youlin Li <liulin063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221103093440.3161-2-liulin063@gmail.com
2022-11-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03	Merge tag 'linux-kselftest-fixes-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "Fixes to the pidfd test" * tag 'linux-kselftest-fixes-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/pidfd_test: Remove the erroneous ',' selftests: pidfd: Fix compling warnings ksefltests: pidfd: Fix wait_states: Test terminated by timeout