Age | Commit message (Collapse) | Author |
|
Add a new test for stage 2 faults when using different combinations of
guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
and types of faults (e.g., read on hugetlbfs with a hole). The next
commits will add different handling methods and more faults (e.g., uffd
and dirty logging). This first commit starts by adding two sanity checks
for all types of accesses: AF setting by the hw, and accessing memslots
with holes.
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-11-ricarkol@google.com
|
|
allocations
Now that kvm_vm allows specifying different memslots for code, page tables,
and data, use the appropriate memslot when making allocations in
common/libraty code. Change them accordingly:
- code (allocated by lib/elf) use the CODE memslot
- stacks, exception tables, and other core data pages (like the TSS in x86)
use the DATA memslot
- page tables and the PGD use the PT memslot
- test data (anything allocated with vm_vaddr_alloc()) uses the TEST_DATA
memslot
No functional change intended. All allocators keep using memslot #0.
Cc: Sean Christopherson <seanjc@google.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-10-ricarkol@google.com
|
|
Refactor virt_arch_pgd_alloc() and vm_vaddr_alloc() in both RISC-V and
aarch64 to fix the alignment of parameters in a couple of calls. This will
make it easier to fix the alignment in a future commit that adds an extra
parameter (that happens to be very long).
No functional change intended.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-9-ricarkol@google.com
|
|
The vm_create() helpers are hardcoded to place most page types (code,
page-tables, stacks, etc) in the same memslot #0, and always backed with
anonymous 4K. There are a couple of issues with that. First, tests
willing to differ a bit, like placing page-tables in a different backing
source type must replicate much of what's already done by the vm_create()
functions. Second, the hardcoded assumption of memslot #0 holding most
things is spread everywhere; this makes it very hard to change.
Fix the above issues by having selftests specify how they want memory to be
laid out. Start by changing ____vm_create() to not create memslot #0; a
test (to come) will specify all memslots used by the VM. Then, add the
vm->memslots[] array to specify the right memslot for different memory
allocators, e.g.,: lib/elf should use the vm->[MEM_REGION_CODE] memslot.
This will be used as a way to specify the page-tables memslots (to be
backed by huge pages for example).
There is no functional change intended. The current commit lays out memory
exactly as before. A future commit will change the allocators to get the
region they should be using, e.g.,: like the page table allocators using
the pt memslot.
Cc: Sean Christopherson <seanjc@google.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-8-ricarkol@google.com
|
|
Add the backing_src_type into struct userspace_mem_region. This struct
already stores a lot of info about memory regions, except the backing
source type. This info will be used by a future commit in order to
determine the method for punching a hole.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-7-ricarkol@google.com
|
|
Copy bitfield.h from include/linux/bitfield.h. A subsequent change will
make use of some FIELD_{GET,PREP} macros defined in this header.
The header was copied as-is, no changes needed.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-6-ricarkol@google.com
|
|
Define macros for memory type indexes and construct DEFAULT_MAIR_EL1
with macros from asm/sysreg.h. The index macros can then be used when
constructing PTEs (instead of using raw numbers).
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-5-ricarkol@google.com
|
|
Deleting a memslot (when freeing a VM) is not closing the backing fd,
nor it's unmapping the alias mapping. Fix by adding the missing close
and munmap.
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-4-ricarkol@google.com
|
|
Add a library function to get the PTE (a host virtual address) of a
given GVA. This will be used in a future commit by a test to clear and
check the access flag of a particular page.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-3-ricarkol@google.com
|
|
Move the generic userfaultfd code out of demand_paging_test.c into a
common library, userfaultfd_util. This library consists of a setup and a
stop function. The setup function starts a thread for handling page
faults using the handler callback function. This setup returns a
uffd_desc object which is then used in the stop function (to wait and
destroy the threads).
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221017195834.2295901-2-ricarkol@google.com
|
|
Currently, the debug-exceptions test always uses only
{break,watch}point#0 and the highest numbered context-aware
breakpoint. Modify the test to use all {break,watch}points and
context-aware breakpoints supported on the system.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-10-reijiw@google.com
|
|
Currently, the debug-exceptions test doesn't have a test case for
a linked watchpoint. Add a test case for the linked watchpoint to
the test. The new test case uses the highest numbered context-aware
breakpoint (for Context ID match), and the watchpoint#0, which is
linked to the context-aware breakpoint.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-9-reijiw@google.com
|
|
Currently, the debug-exceptions test doesn't have a test case for
a linked breakpoint. Add a test case for the linked breakpoint to
the test. The new test case uses a pair of breakpoints. One is the
higiest numbered context-aware breakpoint (for Context ID match),
and the other one is the breakpoint#0 (for Address Match), which
is linked to the context-aware breakpoint.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-8-reijiw@google.com
|
|
Change debug_version() to take the ID_AA64DFR0_EL1 value instead of
vcpu as an argument, and change its callsite to read ID_AA64DFR0_EL1
(and pass it to debug_version()).
Subsequent patches will reuse the register value in the callsite.
No functional change intended.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-7-reijiw@google.com
|
|
Currently, debug-exceptions test unnecessarily tracks some test stages
using GUEST_SYNC(). The code for it needs to be updated as test cases
are added or removed. Stop doing the unnecessary stage tracking,
as they are not so useful and are a bit pain to maintain.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-6-reijiw@google.com
|
|
Add helpers to enable breakpoint and watchpoint exceptions.
No functional change intended.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-5-reijiw@google.com
|
|
Remove the hard-coded {break,watch}point #0 from the guest_code() in
debug-exceptions to allow {break,watch}point number to be specified.
Change reset_debug_state() to zeroing all dbg{b,w}{c,v}r_el0 registers
so that guest_code() can use the function to reset those registers
even when non-zero {break,watch}points are specified for guest_code().
Subsequent patches will add test cases for non-zero {break,watch}points.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-4-reijiw@google.com
|
|
Introduce helpers in the debug-exceptions test to write to
dbg{b,w}{c,v}r registers. Those helpers will be useful for
test cases that will be added to the test in subsequent patches.
No functional change intended.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-3-reijiw@google.com
|
|
Use FIELD_GET() macro to extract ID register fields for existing
aarch64 selftests code. No functional change intended.
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020054202.2119018-2-reijiw@google.com
|
|
The memory area in each slot should be aligned to host page size.
Otherwise, the test will fail. For example, the following command
fails with the following messages with 64KB-page-size-host and
4KB-pae-size-guest. It's not user friendly to abort the test.
Lets do something to report the optimal memory slots, instead of
failing the test.
# ./memslot_perf_test -v -s 1000
Number of memory slots: 999
Testing map performance with 1 runs, 5 seconds each
Adding slots 1..999, each slot with 8 pages + 216 extra pages last
==== Test Assertion Failure ====
lib/kvm_util.c:824: vm_adjust_num_guest_pages(vm->mode, npages) == npages
pid=19872 tid=19872 errno=0 - Success
1 0x00000000004065b3: vm_userspace_mem_region_add at kvm_util.c:822
2 0x0000000000401d6b: prepare_vm at memslot_perf_test.c:273
3 (inlined by) test_execute at memslot_perf_test.c:756
4 (inlined by) test_loop at memslot_perf_test.c:994
5 (inlined by) main at memslot_perf_test.c:1073
6 0x0000ffff7ebb4383: ?? ??:0
7 0x00000000004021ff: _start at :?
Number of guest pages is not compatible with the host. Try npages=16
Report the optimal memory slots instead of failing the test when
the memory area in each slot isn't aligned to host page size. With
this applied, the optimal memory slots is reported.
# ./memslot_perf_test -v -s 1000
Number of memory slots: 999
Testing map performance with 1 runs, 5 seconds each
Memslot count too high for this test, decrease the cap (max is 514)
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-7-gshan@redhat.com
|
|
The addresses and sizes passed to vm_userspace_mem_region_add() and
madvise() should be aligned to host page size, which can be 64KB on
aarch64. So it's wrong by passing additional fixed 4KB memory area
to various tests.
Fix it by passing additional fixed 64KB memory area to various tests.
We also add checks to ensure that none of host/guest page size exceeds
64KB. MEM_TEST_MOVE_SIZE is fixed up to 192KB either.
With this, the following command works fine on 64KB-page-size-host and
4KB-page-size-guest.
# ./memslot_perf_test -v -s 512
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-6-gshan@redhat.com
|
|
The test case is obviously broken on aarch64 because non-4KB guest
page size is supported. The guest page size on aarch64 could be 4KB,
16KB or 64KB.
This supports variable guest page size, mostly for aarch64.
- The host determines the guest page size when virtual machine is
created. The value is also passed to guest through the synchronization
area.
- The number of guest pages are unknown until the virtual machine
is to be created. So all the related macros are dropped. Instead,
their values are dynamically calculated based on the guest page
size.
- The static checks on memory sizes and pages becomes dependent
on guest page size, which is unknown until the virtual machine
is about to be created. So all the static checks are converted
to dynamic checks, done in check_memory_sizes().
- As the address passed to madvise() should be aligned to host page,
the size of page chunk is automatically selected, other than one
page.
- MEM_TEST_MOVE_SIZE has fixed and non-working 64KB. It will be
consolidated in next patch. However, the comments about how
it's calculated has been correct.
- All other changes included in this patch are almost mechanical
replacing '4096' with 'guest_page_size'.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-5-gshan@redhat.com
|
|
prepare_vm() is called in every iteration and run. The allowed memory
slots (KVM_CAP_NR_MEMSLOTS) are probed for multiple times. It's not
free and unnecessary.
Move the probing logic for the allowed memory slots to parse_args()
for once, which is upper layer of prepare_vm().
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-4-gshan@redhat.com
|
|
There are two loops in prepare_vm(), which have different conditions.
'slot' is treated as meory slot index in the first loop, but index of
the host virtual address array in the second loop. It makes it a bit
hard to understand the code.
Change the usage of 'slot' in the second loop, to treat it as the
memory slot index either.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-3-gshan@redhat.com
|
|
In prepare_vm(), 'data->nslots' is assigned with 'max_mem_slots - 1'
at the beginning, meaning they are interchangeable.
Use 'data->nslots' isntead of 'max_mem_slots - 1'. With this, it
becomes easier to move the logic of probing number of slots into
upper layer in subsequent patches.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221020071209.559062-2-gshan@redhat.com
|
|
Rename the neoverse-n2 folder to make it clear that it includes V2, and
add V2 to mapfile.csv. V2 has the same events as N2, visible by running
the following command in the ARM-software/data github repo [1]:
diff pmu/neoverse-v2.json pmu/neoverse-n2.json | grep code
Testing:
$ perf test pmu
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
[1]: https://github.com/ARM-software/data
Reviewed-by: Nick Forrington <nick.forrington@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221020134512.1345013-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since variable npmus is unsigned int, comparing with 0 is unnecessary.
Signed-off-by: Kang Minchul <tegongkang@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221105135932.81612-1-tegongkang@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When converting recorded data into JSON format, perf data omits probe
variables. Add them to the output in the format "field name": "field value"
using tep_print_field:
$ perf data convert --to-json output.json
// output.json
{
"linux-perf-json-version": 1,
"headers": { ... },
"samples": [
{
"timestamp": 29182079082999,
"pid": 309194,
[...]
"__probe_ip": "0x93ee35",
"query_string_string": "select 2;",
"nxids": "0"
}
]
}
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221109103932.65675-1-9erthalion6@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To support live monitoring of kernel lock contention without BPF,
it should support something like below:
# perf lock record -a -o- sleep 1 | perf lock contention -i-
contended total wait max wait avg wait type caller
2 10.27 us 6.17 us 5.13 us spinlock load_balance+0xc03
1 5.29 us 5.29 us 5.29 us rwlock:W ep_scan_ready_list+0x54
1 4.12 us 4.12 us 4.12 us spinlock smpboot_thread_fn+0x116
1 3.28 us 3.28 us 3.28 us mutex pipe_read+0x50
To do that, it needs to handle HEAD_ATTR, HEADER_EVENT_UPDATE and
HEADER_TRACING_DATA which are generated only for the pipe mode.
And setting event handler also should be delayed until it gets the
event information.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221104051440.220989-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One more before going the BTF way:
# perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o,*nanosleep
? pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0
? gpm/1042 ... [continued]: clock_nanosleep()) = 0
1.232 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
1.232 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0
327.329 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ...
1002.482 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) = 0
327.329 gpm/1042 ... [continued]: clock_nanosleep()) = 0
2003.947 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
2003.947 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0
2327.858 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ...
? crond/1384 ... [continued]: clock_nanosleep()) = 0
3005.382 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
3005.382 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0
3675.633 crond/1384 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffcc02b66b0) ...
^C#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In the dirty ring case, we rely on vcpu exit due to full dirty ring
state. On ARM64 system, there are 4096 host pages when the host
page size is 64KB. In this case, the vcpu never exits due to the
full dirty ring state. The similar case is 4KB page size on host
and 64KB page size on guest. The vcpu corrupts same set of host
pages, but the dirty page information isn't collected in the main
thread. This leads to infinite loop as the following log shows.
# ./dirty_log_test -M dirty-ring -c 65536 -m 5
Setting log mode to: 'dirty-ring'
Test iterations: 32, interval: 10 (ms)
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
guest physical test memory offset: 0xffbffe0000
vcpu stops because vcpu is kicked out...
Notifying vcpu to continue
vcpu continues now.
Iteration 1 collected 576 pages
<No more output afterwards>
Fix the issue by automatically choosing the best dirty ring size,
to ensure vcpu exit due to full dirty ring state. The option '-c'
becomes a hint to the dirty ring count, instead of the value of it.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-8-gshan@redhat.com
|
|
There are two states, which need to be cleared before next mode
is executed. Otherwise, we will hit failure as the following messages
indicate.
- The variable 'dirty_ring_vcpu_ring_full' shared by main and vcpu
thread. It's indicating if the vcpu exit due to full ring buffer.
The value can be carried from previous mode (VM_MODE_P40V48_4K) to
current one (VM_MODE_P40V48_64K) when VM_MODE_P40V48_16K isn't
supported.
- The current ring buffer index needs to be reset before next mode
(VM_MODE_P40V48_64K) is executed. Otherwise, the stale value is
carried from previous mode (VM_MODE_P40V48_4K).
# ./dirty_log_test -M dirty-ring
Setting log mode to: 'dirty-ring'
Test iterations: 32, interval: 10 (ms)
Testing guest mode: PA-bits:40, VA-bits:48, 4K pages
guest physical test memory offset: 0xffbfffc000
:
Dirtied 995328 pages
Total bits checked: dirty (1012434), clear (7114123), track_next (966700)
Testing guest mode: PA-bits:40, VA-bits:48, 64K pages
guest physical test memory offset: 0xffbffc0000
vcpu stops because vcpu is kicked out...
vcpu continues now.
Notifying vcpu to continue
Iteration 1 collected 0 pages
vcpu stops because dirty ring is full...
vcpu continues now.
vcpu stops because dirty ring is full...
vcpu continues now.
vcpu stops because dirty ring is full...
==== Test Assertion Failure ====
dirty_log_test.c:369: cleared == count
pid=10541 tid=10541 errno=22 - Invalid argument
1 0x0000000000403087: dirty_ring_collect_dirty_pages at dirty_log_test.c:369
2 0x0000000000402a0b: log_mode_collect_dirty_pages at dirty_log_test.c:492
3 (inlined by) run_test at dirty_log_test.c:795
4 (inlined by) run_test at dirty_log_test.c:705
5 0x0000000000403a37: for_each_guest_mode at guest_modes.c:100
6 0x0000000000401ccf: main at dirty_log_test.c:938
7 0x0000ffff9ecd279b: ?? ??:0
8 0x0000ffff9ecd286b: ?? ??:0
9 0x0000000000401def: _start at ??:?
Reset dirty pages (0) mismatch with collected (35566)
Fix the issues by clearing 'dirty_ring_vcpu_ring_full' and the ring
buffer index before next new mode is to be executed.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-7-gshan@redhat.com
|
|
In vcpu_map_dirty_ring(), the guest's page size is used to figure out
the offset in the virtual area. It works fine when we have same page
sizes on host and guest. However, it fails when the page sizes on host
and guest are different on arm64, like below error messages indicates.
# ./dirty_log_test -M dirty-ring -m 7
Setting log mode to: 'dirty-ring'
Test iterations: 32, interval: 10 (ms)
Testing guest mode: PA-bits:40, VA-bits:48, 64K pages
guest physical test memory offset: 0xffbffc0000
vcpu stops because vcpu is kicked out...
Notifying vcpu to continue
vcpu continues now.
==== Test Assertion Failure ====
lib/kvm_util.c:1477: addr == MAP_FAILED
pid=9000 tid=9000 errno=0 - Success
1 0x0000000000405f5b: vcpu_map_dirty_ring at kvm_util.c:1477
2 0x0000000000402ebb: dirty_ring_collect_dirty_pages at dirty_log_test.c:349
3 0x00000000004029b3: log_mode_collect_dirty_pages at dirty_log_test.c:478
4 (inlined by) run_test at dirty_log_test.c:778
5 (inlined by) run_test at dirty_log_test.c:691
6 0x0000000000403a57: for_each_guest_mode at guest_modes.c:105
7 0x0000000000401ccf: main at dirty_log_test.c:921
8 0x0000ffffb06ec79b: ?? ??:0
9 0x0000ffffb06ec86b: ?? ??:0
10 0x0000000000401def: _start at ??:?
Dirty ring mapped private
Fix the issue by using host's page size to map the ring buffer.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-6-gshan@redhat.com
|
|
If you don't run slabinfo with a superuser, return 0 when read_slab_dir()
reads get_obj_and_str("slabs", &t), because fopen() fails (sometimes
EACCES), causing slabcache() to return directly, without any error during
this time, we should tell the user about the EACCES problem instead of
running successfully($?=0) without any error printing.
For example:
$ ./slabinfo
Permission denied, Try using superuser <== What this submission did
$ sudo ./slabinfo
Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
Acpi-Namespace 5950 48 286.7K 65/0/5 85 0 0 99
Acpi-Operand 13664 72 999.4K 231/0/13 56 0 0 98
...
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
When showing the result of a test group, if one
of the subtests was skipped, while still having
passing subtests, the group result was marked as
SKIP. E.g.:
223/1 usdt/basic:SKIP
223/2 usdt/multispec:OK
223/3 usdt/urand_auto_attach:OK
223/4 usdt/urand_pid_attach:OK
223 usdt:SKIP
The test result of usdt in the example above
should be OK instead of SKIP, because the test
group did have passing tests and it would be
considered in "normal" state.
With this change, only if all of the subtests
were skipped, the group test is marked as SKIP.
When only some of the subtests are skipped, a
more detailed result is given, stating how
many of the subtests were skipped. E.g:
223/1 usdt/basic:SKIP
223/2 usdt/multispec:OK
223/3 usdt/urand_auto_attach:OK
223/4 usdt/urand_pid_attach:OK
223 usdt:OK (SKIP: 1/4)
Signed-off-by: Domenico Cerasuolo <dceras@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221109184039.3514033-1-cerasuolodomenico@gmail.com
|
|
Tests to verify the following behavior of `btf_dedup_resolve_fwds`:
- remapping for struct forward declarations;
- remapping for union forward declarations;
- no remapping if forward declaration kind does not match similarly
named struct or union declaration;
- no remapping if forward declaration name is ambiguous;
- base ids are considered for fwd resolution in split btf scenario.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20221109142611.879983-4-eddyz87@gmail.com
|
|
Resolve forward declarations that don't take part in type graphs
comparisons if declaration name is unambiguous. Example:
CU #1:
struct foo; // standalone forward declaration
struct foo *some_global;
CU #2:
struct foo { int x; };
struct foo *another_global;
The `struct foo` from CU #1 is not a part of any definition that is
compared against another definition while `btf_dedup_struct_types`
processes structural types. The the BTF after `btf_dedup_struct_types`
the BTF looks as follows:
[1] STRUCT 'foo' size=4 vlen=1 ...
[2] INT 'int' size=4 ...
[3] PTR '(anon)' type_id=1
[4] FWD 'foo' fwd_kind=struct
[5] PTR '(anon)' type_id=4
This commit adds a new pass `btf_dedup_resolve_fwds`, that maps such
forward declarations to structs or unions with identical name in case
if the name is not ambiguous.
The pass is positioned before `btf_dedup_ref_types` so that types
[3] and [5] could be merged as a same type after [1] and [4] are merged.
The final result for the example above looks as follows:
[1] STRUCT 'foo' size=4 vlen=1
'x' type_id=2 bits_offset=0
[2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] PTR '(anon)' type_id=1
For defconfig kernel with BTF enabled this removes 63 forward
declarations. Examples of removed declarations: `pt_regs`, `in6_addr`.
The running time of `btf__dedup` function is increased by about 3%.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221109142611.879983-3-eddyz87@gmail.com
|
|
An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing both long and void* keys and values.
This simplifies many use cases in libbpf as hashmaps there are mostly
integer to integer.
Perf copies hashmap implementation from libbpf and has to be
updated as well.
Changes to libbpf, selftests/bpf and perf are packed as a single
commit to avoid compilation issues with any future bisect.
Polymorphic interface is acheived by hiding hashmap interface
functions behind auxiliary macros that take care of necessary
type casts, for example:
#define hashmap_cast_ptr(p) \
({ \
_Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\
#p " pointee should be a long-sized integer or a pointer"); \
(long *)(p); \
})
bool hashmap_find(const struct hashmap *map, long key, long *value);
#define hashmap__find(map, key, value) \
hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
- hashmap__find macro casts key and value parameters to long
and long* respectively
- hashmap_cast_ptr ensures that value pointer points to a memory
of appropriate size.
This hack was suggested by Andrii Nakryiko in [1].
This is a follow up for [2].
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/
[2] https://lore.kernel.org/bpf/af1facf9-7bc8-8a3d-0db4-7b3f333589a2@meta.com/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221109142611.879983-2-eddyz87@gmail.com
|
|
Test that locked bridge port configurations that are not supported by
mlxsw are rejected.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Test that packets received via a locked bridge port whose {SMAC, VID}
does not appear in the bridge's FDB or appears with a different port,
trigger the "locked_port" packet trap.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Test that packets with a destination MAC of 01:80:C2:00:00:03 trigger
the "eapol" packet trap.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merely checking whether a trap counter incremented or not without
logging a test result is useful on its own. Split this functionality to
a helper which will be used by subsequent patches.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
test_progs fails to be compiled in the 32-bit arch, log is as follows:
test_progs.c:1013:52: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
1013 | sprintf(buf, "MSG_TEST_LOG (cnt: %ld, last: %d)",
| ~~^
| |
| long int
| %d
1014 | strlen(msg->test_log.log_buf),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| size_t {aka unsigned int}
Fix it.
Fixes: 91b2c0afd00c ("selftests/bpf: Add parallelism to test_progs")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221108015857.132457-1-yangjihong1@huawei.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
32-bit platforms
When cross-compiling test_verifier for 32-bit platforms, the casting error is shown below:
test_verifier.c:1263:27: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
1263 | info.xlated_prog_insns = (__u64)*buf;
| ^
cc1: all warnings being treated as errors
Fix it by adding zero-extension for it.
Fixes: 933ff53191eb ("selftests/bpf: specify expected instructions in test_verifier tests")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221108121945.4104644-1-pulehui@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The kernel driver assumes hybrid CPUs will have Intel PT capabilities
that are compatible with the boot CPU. Add a test to check that is the
case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In preparation for adding more Intel PT testing, redefine the test_suite
to allow for adding more subtests.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
intel-pt subtests
In preparation for adding more Intel PT testing, rename
intel-pt-pkt-decoder-test.c to intel-pt-test.c.
Subtests will later be added to intel-pt-test.c.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add coverage for FEAT_SVE2p1.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since the newly added instruction is in the HINT space we can't reasonably
test for it actually being present.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add FEAT_CSSC to the set of features checked by the hwcap selftest.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|