git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message	Author
2023-12-07	KVM: selftests: Fix vmxon_pa == vmcs12_pa == -1ull nVMX testcase for !eVMCS	Vitaly Kuznetsov
	The "vmxon_pa == vmcs12_pa == -1ull" test happens to work by accident: as Enlightened VMCS is always supported, set_default_vmx_state() adds 'KVM_STATE_NESTED_EVMCS' to 'flags' and the following branch of vmx_set_nested_state() is executed: if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) && (!guest_can_use(vcpu, X86_FEATURE_VMX) || !vmx->nested.enlightened_vmcs_enabled)) return -EINVAL; as 'enlightened_vmcs_enabled' is false. In fact, "vmxon_pa == vmcs12_pa == -1ull" is a valid state when not tainted by wrong flags so the test should aim for this branch: if (kvm_state->hdr.vmx.vmxon_pa == INVALID_GPA) return 0; Test all this properly: - Without KVM_STATE_NESTED_EVMCS in the flags, the expected return value is '0'. - With KVM_STATE_NESTED_EVMCS flag (when supported) set, the expected return value is '-EINVAL' prior to enabling eVMCS and '0' after. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20231205103630.1391318-11-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>
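The three expected return values listed in this commit message can be checked against a small model. The following is only a Python sketch of the two branches quoted above, not the kernel's C code: the function name `set_nested_state`, its parameters, and the numeric flag value are assumptions made for illustration.

```python
import errno

# Assumed flag value for illustration; consult the KVM UAPI header for the real one.
KVM_STATE_NESTED_EVMCS = 0x4
INVALID_GPA = (1 << 64) - 1  # the "-1ull" from the commit message


def set_nested_state(flags, vmxon_pa, guest_has_vmx, evmcs_enabled):
    """Toy model of the two vmx_set_nested_state() branches quoted above."""
    # Branch 1: eVMCS flag set while VMX or eVMCS is not usable.
    if (flags & KVM_STATE_NESTED_EVMCS) and (not guest_has_vmx or not evmcs_enabled):
        return -errno.EINVAL
    # Branch 2: vmxon_pa == -1ull is a valid "no nested state" request.
    if vmxon_pa == INVALID_GPA:
        return 0
    raise NotImplementedError("other nested states are outside this sketch")
```

Without the flag the call returns 0; with the flag it returns -EINVAL until `evcms_enabled`-style state is turned on (here the `evmcs_enabled` argument), and 0 afterwards.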
2023-12-07	KVM: selftests: Make Hyper-V tests explicitly require KVM Hyper-V support	Vitaly Kuznetsov
	In preparation for conditional Hyper-V emulation enablement in KVM, make Hyper-V specific tests skip gracefully instead of failing when KVM support for emulating Hyper-V is not there. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20231205103630.1391318-10-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-12-07	selftests/hid: fix failing tablet button tests	Benjamin Tissoires
	An oversight from commit 74452d6329be ("selftests/hid: tablets: add variants of states with buttons"), where I don't use the Enum... Fixes: 74452d6329be ("selftests/hid: tablets: add variants of states with buttons") Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231207-b4-wip-selftests-v1-1-c4e13fe04a70@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: fix ruff linter complains	Benjamin Tissoires
	rename ambiguous variables l, r, and m, and ignore the return values of uhdev.get_evdev() and uhdev.get_slot() Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-15-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: fix mypy complains	Benjamin Tissoires
	No code change, only typing information added/ignored Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-14-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: be stricter for some transitions	Benjamin Tissoires
	To accommodate legacy devices, we rely on the last state of a transition to be valid: for example when we test PEN_IS_OUT_OF_RANGE to PEN_IS_IN_CONTACT, any "normal" device that reports an InRange bit would insert a PEN_IS_IN_RANGE state between the two. This is of course valid, but this solution prevents us from detecting false releases emitted by some firmware: when pressing an "eraser mode" button, they might send an extra PEN_IS_OUT_OF_RANGE that we may want to filter. So define 2 sets of transitions: one that is the ideal behavior, and one that is OK in that it won't break user space, but where we have serious doubts about whether we are doing the right thing. And depending on the test, either ask only for valid transitions, or tolerate weird ones. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-13-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
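The idea of two transition sets can be sketched as follows. This is a minimal illustration, assuming a three-state `PenState` Enum built from the state names in the commit message; which pairs belong to each set, and the names `IDEAL_TRANSITIONS`, `TOLERATED_TRANSITIONS`, and `path_is_allowed`, are assumptions and not the selftest's actual tables.

```python
from enum import Enum, auto


class PenState(Enum):
    PEN_IS_OUT_OF_RANGE = auto()
    PEN_IS_IN_RANGE = auto()
    PEN_IS_IN_CONTACT = auto()


OUT = PenState.PEN_IS_OUT_OF_RANGE
IN_RANGE = PenState.PEN_IS_IN_RANGE
CONTACT = PenState.PEN_IS_IN_CONTACT

# Assumed "ideal" behavior: every change passes through the in-range state.
IDEAL_TRANSITIONS = {
    (OUT, IN_RANGE),
    (IN_RANGE, CONTACT),
    (CONTACT, IN_RANGE),
    (IN_RANGE, OUT),
}

# Assumed "tolerated" behavior: legacy devices may skip the in-range state.
TOLERATED_TRANSITIONS = IDEAL_TRANSITIONS | {(OUT, CONTACT), (CONTACT, OUT)}


def path_is_allowed(path, strict):
    """Check every consecutive pair of states against the chosen set."""
    allowed = IDEAL_TRANSITIONS if strict else TOLERATED_TRANSITIONS
    return all(step in allowed for step in zip(path, path[1:]))
```

A strict test would reject a direct jump from out-of-range to contact, while a tolerant test would accept it.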
2023-12-07	selftests/hid: tablets: add a secondary barrel switch test	Benjamin Tissoires
	Some tablets report 2 barrel switches. We better test those too. Use the same transitions description from the primary button tests. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-12-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: convert the primary button tests	Benjamin Tissoires
	We get more descriptive about what we are doing, and also get more information about what is actually being tested. Instead of having a non-exhaustive set of button changes that are done semi-randomly, we can describe all the states we want to test. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-11-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: add variants of states with buttons	Benjamin Tissoires
	Turns out that there are transitions that are unlikely to happen: for example, having both the tip switch and a button being changed at the same time (in the same report) would require either a very talented and precise user or very bad hardware with a very low sampling rate. So instead of building the button test by hand and forgetting about some cases, let's reuse the state machine and transitions we have. This patch only adds the states and the valid transitions. The actual tests will be replaced later. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-10-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: define the elements of PenState	Benjamin Tissoires
	This introduces a little bit more readability by not using the raw values but a dedicated Enum Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-9-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: set initial data for tilt/twist	Benjamin Tissoires
	Avoids getting a null event when these usages are set Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-8-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: do not set invert when the eraser is used	Benjamin Tissoires
	Turns out that the chart from Microsoft is not exactly what I got here: when the rubber is used, and is touching the surface, invert can (should) be set to 0... [0] https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/windows-pen-states Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-7-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: move move_to function to PenDigitizer	Benjamin Tissoires
	We can easily subclass PenDigitizer to introduce firmware bugs, whereas subclassing Pen is harder. Move move_to from Pen to PenDigitizer so we get that ability. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-6-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: move the transitions to PenState	Benjamin Tissoires
	Those transitions have nothing to do with `Pen`, so migrate them to `PenState`. The hidden agenda is to remove `Pen` and integrate it into `PenDigitizer` so that we can tweak the events in each state to emulate firmware bugs. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-5-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: tablets: remove unused class	Benjamin Tissoires
	Looks like this is a leftover Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-4-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: base: allow for multiple skip_if_uhdev	Benjamin Tissoires
	We can actually have multiple occurrences of `skip_if_uhdev` if we follow the information from the pytest doc[0]. This is not immediately used, but can be if we need multiple conditions on a given test. [0] https://docs.pytest.org/en/latest/historical-notes.html#update-marker-code Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-3-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: vmtest.sh: allow finer control on the build steps	Benjamin Tissoires
	vmtest.sh works great for a one-shot test, but not so much for CI, where I want to build the bzImage (with different configs) in a separate job from the one that runs it. Add a "build_only" option to specify whether we need to boot the currently built kernel in the vm. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-2-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-07	selftests/hid: vmtest.sh: update vm2c and container	Benjamin Tissoires
	boot2container is now on an official project, so let's use that. The container image is now the same one I use for the CI, so let's stick to it. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-1-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2023-12-06	Merge branch 'master' into mm-hotfixes-stable	Andrew Morton

2023-12-06	selftests/mm: prevent duplicate runs caused by TEST_GEN_PROGS	Nico Pache
	Commit 05f1edac8009 ("selftests/mm: run all tests from run_vmtests.sh") fixed the inconsistency caused by tests being defined as TEST_GEN_PROGS. This issue was leading to tests not being executed via run_vmtests.sh and furthermore some tests running twice due to the kselftests wrapper also executing them. Fix the definition of two tests (soft-dirty and pagemap_ioctl) that are still incorrectly defined. Link: https://lkml.kernel.org/r/20231120222908.28559-1-npache@redhat.com Signed-off-by: Nico Pache <npache@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06	mm/selftests: fix pagemap_ioctl memory map test	Peter Xu
	__FILE__ is not guaranteed to exist in current dir. Replace that with argv[0] for memory map test. Link: https://lkml.kernel.org/r/20231116201547.536857-4-peterx@redhat.com Fixes: 46fd75d4a3c9 ("selftests: mm: add pagemap ioctl tests") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06	bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistency	Andrii Nakryiko
	To stay consistent with the naming pattern used for similar cases in BPF UAPI (__MAX_BPF_ATTACH_TYPE, etc), rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE. Also similar to MAX_BPF_ATTACH_TYPE and MAX_BPF_REG, add: #define MAX_BPF_LINK_TYPE __MAX_BPF_LINK_TYPE Not all __MAX_xxx enums have such #define, so I'm not sure if we should add it or not, but I figured I'll start with a completely backwards compatible way, and we can drop that, if necessary. Also adjust a selftest that used MAX_BPF_LINK_TYPE enum. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206190920.1651226-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	selftests/bpf: Add test for early update in prog_array_map_poke_run	Jiri Olsa
	Adding test that tries to trigger the BUG_IN during early map update in prog_array_map_poke_run function. The idea is to share prog array map between thread that constantly updates it and another one loading a program that uses that prog array. Eventually we will hit a place where the program is ok to be updated (poke->tailcall_target_stable check) but the address is still not registered in kallsyms, so the bpf_arch_text_poke returns -EINVAL and causes an imbalance for the next tail call update check, which will fail with -EBUSY in bpf_arch_text_poke as described in previous fix. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-3-jolsa@kernel.org
2023-12-06	perf stat: Exit perf stat if parse groups fails	Ian Rogers
	Metrics were added by a callback but commit a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") postponed this to allow optimizations based on the CPU configuration. In doing so it stopped errors in metric parsing from causing 'perf stat' termination. This change adds the termination for bad metric names back in. Fixes: a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Closes: https://lore.kernel.org/lkml/ZXByT1K6enTh2EHT@kernel.org/ Link: https://lore.kernel.org/r/20231206183533.972028-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06	selftests/bpf: add BPF token-enabled tests	Andrii Nakryiko
	Add a selftest that attempts to conceptually replicate intended BPF token use cases inside user namespaced container. Child process is forked. It is then put into its own userns and mountns. Child creates BPF FS context object. This ensures child userns is captured as the owning userns for this instance of BPF FS. Given setting delegation mount options is privileged operation, we ensure that child cannot set them. This context is passed back to privileged parent process through Unix socket, where parent sets up delegation options, creates, and mounts it as a detached mount. This mount FD is passed back to the child to be used for BPF token creation, which allows otherwise privileged BPF operations to succeed inside userns. We validate that all of token-enabled privileged commands (BPF_BTF_LOAD, BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only succeed inside the userns if a) BPF token is provided with proper allowed sets of commands and types; and b) namespaces CAP_BPF and other privileges are set. Lacking a) or b) should lead to -EPERM failures. Based on suggested workflow by Christian Brauner ([0]). [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-17-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	libbpf: add BPF token support to bpf_prog_load() API	Andrii Nakryiko
	Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-16-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	libbpf: add BPF token support to bpf_btf_load() API	Andrii Nakryiko
	Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	libbpf: add BPF token support to bpf_map_create() API	Andrii Nakryiko
	Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-14-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	libbpf: add bpf_token_create() API	Andrii Nakryiko
	Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	bpf: add BPF token support to BPF_PROG_LOAD command	Andrii Nakryiko
	Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	bpf: add BPF token support to BPF_BTF_LOAD command	Andrii Nakryiko
	Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	bpf: add BPF token support to BPF_MAP_CREATE command	Andrii Nakryiko
	Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds, allows creating a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06	bpf: introduce BPF token object	Andrii Nakryiko
	Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a trusted unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to also have necessary combination of capabilities inside that user namespace. 
So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
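The two-part requirement described in this commit message (a token that allows the command, plus the capability inside the token's user namespace) can be modeled in a few lines. This is a Python sketch under stated assumptions, not kernel code: `Token`, `token_capable`, `may_run`, the string-named namespaces, and the capability table are all hypothetical stand-ins for the kernel's objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    userns: str              # owning user namespace of the BPF FS instance
    allowed_cmds: frozenset  # commands delegated through mount options


def token_capable(token, caps_by_userns, cap):
    """With a token, check the capability in the token's userns;
    without one, check it in the initial userns (the capable() case)."""
    ns = token.userns if token is not None else "init"
    return cap in caps_by_userns.get(ns, set())


def may_run(cmd, token, caps_by_userns, cap="CAP_BPF"):
    """A token alone is not sufficient: the command must be delegated
    AND the caller must hold the capability in the relevant userns."""
    if token is not None and cmd not in token.allowed_cmds:
        return False
    return token_capable(token, caps_by_userns, cap)
```

In this model a containerized process holding CAP_BPF only in its own namespace succeeds with a suitable token and fails without one, which mirrors the "-EPERM when a) or b) is lacking" behavior the selftest in this series validates.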
2023-12-06	perf thread: Add missing RC_CHK_EQUAL	Ian Rogers
	Comparing pointers without RC_CHK_ACCESS means the indirect object will be compared rather than the underlying maps when REFCNT_CHECKING is enabled. Fix by adding missing RC_CHK_EQUAL. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06	perf maps: Move symbol maps functions to maps.c	Ian Rogers
	Move the find and certain other symbol maps__* functions to maps.c for better abstraction. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06	perf map: Simplify map_ip/unmap_ip and make 'struct map' smaller	Ian Rogers
	When mapping an IP it is either an identity mapping or a DSO relative mapping, so a single bit is required in the struct to identify this. The current code uses function pointers, adding 2 pointers per map and also pushing the size of a map beyond 1 cache line. Switch to using a byte to identify the mapping type (as well as priv and erange_warned), to avoid any masking. Change struct map's layout to avoid holes. Before:
```
struct map {
    u64                 start;                /*     0     8 */
    u64                 end;                  /*     8     8 */
    _Bool               erange_warned:1;      /*    16: 0  1 */
    _Bool               priv:1;               /*    16: 1  1 */

    /* XXX 6 bits hole, try to pack */
    /* XXX 3 bytes hole, try to pack */

    u32                 prot;                 /*    20     4 */
    u64                 pgoff;                /*    24     8 */
    u64                 reloc;                /*    32     8 */
    u64                 (*map_ip)(const struct map *, u64);   /*    40     8 */
    u64                 (*unmap_ip)(const struct map *, u64); /*    48     8 */
    struct dso *        dso;                  /*    56     8 */
    /* --- cacheline 1 boundary (64 bytes) --- */
    refcount_t          refcnt;               /*    64     4 */
    u32                 flags;                /*    68     4 */

    /* size: 72, cachelines: 2, members: 12 */
    /* sum members: 68, holes: 1, sum holes: 3 */
    /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
    /* last cacheline: 8 bytes */
};
```
	After:
```
struct map {
    u64                 start;                /*     0     8 */
    u64                 end;                  /*     8     8 */
    u64                 pgoff;                /*    16     8 */
    u64                 reloc;                /*    24     8 */
    struct dso *        dso;                  /*    32     8 */
    refcount_t          refcnt;               /*    40     4 */
    u32                 prot;                 /*    44     4 */
    u32                 flags;                /*    48     4 */
    enum mapping_type   mapping_type:8;       /*    52: 0  4 */

    /* Bitfield combined with next fields */

    _Bool               erange_warned;        /*    53     1 */
    _Bool               priv;                 /*    54     1 */

    /* size: 56, cachelines: 1, members: 11 */
    /* padding: 1 */
    /* last cacheline: 56 bytes */
};
```
	Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
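The switch from per-map function pointers to a small mapping-type field, as this commit message describes, can be illustrated with a toy model. This is a Python sketch rather than perf's C code: the `MappingType` member names and the DSO-relative formula (subtract `start`, add `pgoff`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class MappingType(Enum):
    DSO = 0       # addresses are translated relative to the DSO
    IDENTITY = 1  # addresses are used unchanged


@dataclass
class Map:
    start: int
    pgoff: int
    mapping_type: MappingType  # replaces the two function pointers


def map_ip(m, ip):
    """Translate a sampled address into a map-relative address."""
    if m.mapping_type is MappingType.IDENTITY:
        return ip
    return ip - m.start + m.pgoff


def unmap_ip(m, rip):
    """Inverse of map_ip()."""
    if m.mapping_type is MappingType.IDENTITY:
        return rip
    return rip + m.start - m.pgoff
```

The dispatch costs one comparison per call, and each map stores a single small field where it previously stored two 8-byte pointers.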
2023-12-06	perf test shell diff: Skip test if test_loop symbol is missing in the perf binary	Ian Rogers
	The diff test depends on finding the symbol test_loop in perf and will fail if perf has been stripped and no debug object is available. In that case, skip the test instead. Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231205164924.835682-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06	perf symbols: Parse NOTE segments until the build id is found	Chengen Du
	In the ELF file, multiple NOTE segments may exist. To locate the build id, the process shall persist in parsing NOTE segments until the build id is found. Signed-off-by: Chengen Du <chengen.du@canonical.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231130135723.17562-1-chengen.du@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
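The "keep parsing NOTE segments until the build id is found" behavior can be sketched as below. This is an illustrative Python model, not perf's implementation: it assumes little-endian notes with 4-byte padding, takes the raw bytes of each NOTE segment as input, and the function names are hypothetical.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for GNU build ids


def _align4(n):
    return (n + 3) & ~3


def build_id_in_segment(data):
    """Walk the notes in one NOTE segment; return the build id or None."""
    off = 0
    while off + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, off)
        off += 12
        name = data[off:off + namesz]
        off += _align4(namesz)
        desc = data[off:off + descsz]
        off += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc
    return None


def find_build_id(note_segments):
    """Do not stop at the first NOTE segment: continue until one has a build id."""
    for segment in note_segments:
        build_id = build_id_in_segment(segment)
        if build_id is not None:
            return build_id
    return None
```

A file whose first NOTE segment holds only an unrelated note still yields its build id, because the loop moves on to the next segment instead of giving up.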
2023-12-06	perf record: Be lazier in allocating lost samples buffer	Ian Rogers
	Wait until a lost sample occurs to allocate the lost samples buffer, since the buffer often isn't necessary. This saves a 64kb allocation and 5.3kb of peak memory consumption. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
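Lazy allocation of this kind follows a common pattern, shown here as a minimal Python sketch. The class `LostSamples`, its method, and the way the count is stored in the buffer are hypothetical; only the 64kb size comes from the commit message.

```python
class LostSamples:
    BUF_SIZE = 64 * 1024  # buffer size quoted in the commit message

    def __init__(self):
        # Nothing is allocated up front; most sessions never lose a sample.
        self.buf = None

    def record_lost(self, count):
        # Allocate on first use only.
        if self.buf is None:
            self.buf = bytearray(self.BUF_SIZE)
        # Hypothetical layout: keep the latest count in the first 8 bytes.
        self.buf[:8] = count.to_bytes(8, "little")
```

A session that never calls `record_lost()` never pays for the buffer.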
2023-12-06	perf evsel: Fallback to "task-clock" when not system wide	Ian Rogers
	When the "cycles" event isn't available, evsel will fall back to the "cpu-clock" software event. "task-clock" is similar to "cpu-clock" but only runs when the process is running. Falling back to "cpu-clock" when not system wide leads to confusion; falling back to "task-clock" instead should reduce it. Pass the target to determine if "task-clock" is more appropriate. Update a nearby comment and debug string for the change. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ajay Kaher <akaher@vmware.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Makhalov <amakhalov@vmware.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231121000420.368075-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
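A rough sketch of the fallback choice, assuming a simplified target description. The `Target` class and `fallback_event` function are hypothetical names for this illustration and do not mirror perf's actual structures.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical, simplified description of what is being profiled."""
    system_wide: bool = False


def fallback_event(target):
    """Pick the software event to use when "cycles" is unavailable."""
    # System-wide profiling keeps the CPU-oriented clock; per-task
    # profiling gets the clock that only runs while the task runs.
    return "cpu-clock" if target.system_wide else "task-clock"
```

For example, `fallback_event(Target(system_wide=False))` selects "task-clock", so the reported event matches what a per-process measurement actually covers.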
2023-12-05	tools: ynl: move private definitions to a separate header	Jakub Kicinski
	ynl.h has a growing amount of "internal" stuff, which may confuse users who try to take a look at the external API. Currently the internals are at the bottom of the file with a banner in between, but this arrangement makes it hard to add external APIs / inline helpers which need internal definitions. Move internals to a separate header. Link: https://lore.kernel.org/r/20231202211225.342466-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-05	tools: ynl: use strerror() if no extack of note provided	Jakub Kicinski
	If the kernel didn't give us any meaningful error, print a strerror() to the ynl error message. Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20231202211310.342716-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
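The behaviour can be sketched in a few lines. This is not the ynl C code: `format_error` and the message wording are assumptions made for illustration, and only the idea of falling back to strerror() comes from the commit.

```python
import errno
import os


def format_error(err, extack_msg=None):
    """Build an error string, preferring the kernel's extack message."""
    if extack_msg:
        return f"Kernel error: '{extack_msg}'"
    # No extack of note: fall back to the generic errno description.
    return f"Kernel error: {os.strerror(err)}"
```

With no extack message, `format_error(errno.ENOENT)` reports the platform's strerror() text for ENOENT instead of an empty or opaque error.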
2023-12-05	tools: pynl: make flags argument optional for do()	Jakub Kicinski
	Commit 1768d8a767f8 ("tools/net/ynl: Add support for create flags") added support for setting legacy netlink CRUD flags on netlink messages (NLM_F_REPLACE, _EXCL, _CREATE etc.). Most of genetlink won't need these, so don't force callers to pass in an empty argument to each do() call. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231202211005.341613-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
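A sketch of what an optional flags argument looks like from the caller's side. The standalone `do()` below is a stand-in for illustration and only computes the netlink header flags; it is not the ynl library code. The NLM_F_* values are the standard netlink constants.

```python
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400


def do(method, vals, flags=None):
    """Stand-in for a do() call: return the request with its header flags."""
    nl_flags = NLM_F_REQUEST | NLM_F_ACK
    # flags defaults to None, so most callers can simply omit it.
    for flag in flags or []:
        nl_flags |= flag
    return method, vals, nl_flags
```

Callers that need the legacy CRUD flags pass them explicitly, for example `do("rule-add", {}, [NLM_F_CREATE, NLM_F_EXCL])` (the method name here is made up), while everyone else writes `do("get", {})`.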
2023-12-05	selftests/bpf: validate precision logic in partial_stack_load_preserves_zeros	Andrii Nakryiko
	Enhance partial_stack_load_preserves_zeros subtest with detailed precision propagation log checks. We now expect fp-16 to be spilled, initially imprecise, zero const register, which is later marked as precise even when partial stack slot load is performed, even if it's not a register fill (!). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05	selftests/bpf: validate zero preservation for sub-slot loads	Andrii Nakryiko
	Validate that 1-, 2-, and 4-byte loads from stack slots not aligned on 8-byte boundary still preserve zero, when loading from all-STACK_ZERO sub-slots, or when stack sub-slots are covered by spilled register with known constant zero value. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05	selftests/bpf: validate STACK_ZERO is preserved on subreg spill	Andrii Nakryiko
	Add tests validating that STACK_ZERO slots are preserved when slot is partially overwritten with subregister spill. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05	selftests/bpf: add stack access precision test	Andrii Nakryiko
	Add a new selftest that validates precision tracking for stack access instruction, using both r10-based and non-r10-based accesses. For non-r10 ones we also make sure to have non-zero var_off to validate that final stack offset is tracked properly in instruction history information inside verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-05	bpf: support non-r10 register spill/fill to/from stack in precision tracking	Andrii Nakryiko
	Use instruction (jump) history to record instructions that performed register spill/fill to/from stack, regardless if this was done through read-only r10 register, or any other register after copying r10 into it and potentially adjusting offset. To make this work reliably, we push extra per-instruction flags into instruction history, encoding stack slot index (spi) and stack frame number in extra 10 bit flags we take away from prev_idx in instruction history. We don't touch idx field for maximum performance, as it's checked most frequently during backtracking. This change removes basically the last remaining practical limitation of precision backtracking logic in BPF verifier. It fixes known deficiencies, but also opens up new opportunities to reduce number of verified states, explored in the subsequent patches. There are only three differences in selftests' BPF object files according to veristat, all in the positive direction (less states).
	File                                    Program        Insns (A)  Insns (B)  Insns (DIFF)   States (A)  States (B)  States (DIFF)
	--------------------------------------  -------------  ---------  ---------  -------------  ----------  ----------  -------------
	test_cls_redirect_dynptr.bpf.linked3.o  cls_redirect        2987       2864  -123 (-4.12%)         240         231    -9 (-3.75%)
	xdp_synproxy_kern.bpf.linked3.o         syncookie_tc       82848      82661  -187 (-0.23%)        5107        5073   -34 (-0.67%)
	xdp_synproxy_kern.bpf.linked3.o         syncookie_xdp      85116      84964  -152 (-0.18%)        5162        5130   -32 (-0.62%)
	Note, I avoided renaming jmp_history to more generic insn_hist to minimize number of lines changed and potential merge conflicts between bpf and bpf-next trees. Notice also cur_hist_entry pointer reset to NULL at the beginning of instruction verification loop. This pointer avoids the problem of relying on last jump history entry's insn_idx to determine whether we already have entry for current instruction or not.
It can happen that we added jump history entry because current instruction is_jmp_point(), but also we need to add instruction flags for stack access. In this case, we don't want two entries, so we need to reuse last added entry, if it is present. Relying on insn_idx comparison has the same ambiguity problem as the one that was fixed recently in [0], so we avoid that. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231110002638.4168352-3-andrii@kernel.org/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reported-by: Tao Lyu <tao.lyu@epfl.ch> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
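The 10-bit flag encoding mentioned in this commit can be illustrated with a small bit-packing sketch. The exact split used here (3 bits of frame number, 6 bits of stack slot index, 1 marker bit) is an assumption chosen to total 10 bits, and all names are hypothetical; the verifier source is the authority on the real layout.

```python
FRAMENO_BITS = 3   # assumed width of the stack frame number
SPI_BITS = 6       # assumed width of the stack slot index (spi)

FRAMENO_MASK = (1 << FRAMENO_BITS) - 1
SPI_SHIFT = FRAMENO_BITS
SPI_MASK = (1 << SPI_BITS) - 1
STACK_ACCESS = 1 << (FRAMENO_BITS + SPI_BITS)  # marks "this insn touched stack"


def encode_stack_access(spi, frameno):
    """Pack a stack slot index and frame number into 10 flag bits."""
    if not 0 <= spi <= SPI_MASK:
        raise ValueError("spi out of range")
    if not 0 <= frameno <= FRAMENO_MASK:
        raise ValueError("frameno out of range")
    return STACK_ACCESS | (spi << SPI_SHIFT) | frameno


def decode_stack_access(flags):
    """Return (spi, frameno), or None if the insn did not access the stack."""
    if not flags & STACK_ACCESS:
        return None
    return (flags >> SPI_SHIFT) & SPI_MASK, flags & FRAMENO_MASK
```

Because the slot and frame are recorded per instruction, backtracking can identify the accessed stack slot without caring whether the access went through r10 or through a copy of it.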
2023-12-05	perf list: Fix JSON segfault by setting the used skip_duplicate_pmus callback	Ian Rogers
	JSON output didn't set the skip_duplicate_pmus callback, yielding a segfault. Fixes: cd4e1efbbc40 ("perf pmus: Skip duplicate PMUs and don't print list suffix by default") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231129213428.2227448-2-irogers@google.com [namhyung: updated subject line according to Arnaldo] Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-12-05	perf bench sched-seccomp-notify: Fix spelling mistake "synchronious" -> "synchronous"	Colin Ian King
	There is a spelling mistake in an option description. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20230630080029.15614-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>