Age | Commit message (Collapse) | Author |
|
Add a test to compare the output of perf-stat with and without option
--bpf-counters. If the difference is more than 10%, the test is considered
as failed.
Committer testing:
# perf test bpf-counters
86: perf stat --bpf-counters test : Ok
# perf test -v bpf-counters
86: perf stat --bpf-counters test :
--- start ---
test child forked, pid 2433339
test child finished with 0
---- end ----
perf stat --bpf-counters test: Ok
#
Signed-off-by: Song Liu <songliubraving@fb.com>
Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/EC00E37D-8587-4662-8E30-7AD5F874FA84@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Take measurements of 't0' and 'ref_time' after enable_counters(), so
that they only measure the time consumed when the counters are enabled.
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Andi Kleen <andi@firstfloor.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: kernel-team@fb.com
Link: http://lore.kernel.org/lkml/20210316211837.910506-3-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The perf tool uses performance monitoring counters (PMCs) to monitor
system performance. The PMCs are limited hardware resources. For
example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
Modern data center systems use these PMCs in many different ways: system
level monitoring, (maybe nested) container level monitoring, per process
monitoring, profiling (in sample mode), etc. In some cases, there are
more active perf_events than available hardware PMCs. To allow all
perf_events to have a chance to run, it is necessary to do expensive
time multiplexing of events.
On the other hand, many monitoring tools count the common metrics
(cycles, instructions). It is a waste to have multiple tools create
multiple perf_events of "cycles" and occupy multiple PMCs.
bperf tries to reduce such wastes by allowing multiple perf_events of
"cycles" or "instructions" (at different scopes) to share PMUs. Instead
of having each perf-stat session to read its own perf_events, bperf uses
BPF programs to read the perf_events and aggregate readings to BPF maps.
Then, the perf-stat session(s) reads the values from these BPF maps.
Please refer to the comment before the definition of bperf_ops for the
description of bperf architecture.
bperf is off by default. To enable it, pass --bpf-counters option to
perf-stat. bperf uses a BPF hashmap to share information about BPF
programs and maps used by bperf. This map is pinned to bpffs. The
default path is /sys/fs/bpf/perf_attr_map. The user could change the
path with option --bpf-attr-map.
Committer testing:
# dmesg|grep "Performance Events" -A5
[ 0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 0.225280] ... version: 0
[ 0.225280] ... bit width: 48
[ 0.225281] ... generic registers: 6
[ 0.225281] ... value mask: 0000ffffffffffff
[ 0.225281] ... max period: 00007fffffffffff
#
# for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
[1] 2436231
[2] 2436232
[3] 2436233
[4] 2436234
[5] 2436235
[6] 2436236
# perf stat -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
310,326,987 cycles (41.87%)
236,143,290 instructions # 0.76 insn per cycle (41.87%)
0.100800885 seconds time elapsed
#
We can see that the counters were enabled for this workload 41.87% of
the time.
Now with --bpf-counters:
# for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
[1] 2436514
[2] 2436515
[3] 2436516
[4] 2436517
[5] 2436518
[6] 2436519
[7] 2436520
[8] 2436521
[9] 2436522
[10] 2436523
[11] 2436524
[12] 2436525
[13] 2436526
[14] 2436527
[15] 2436528
[16] 2436529
[17] 2436530
[18] 2436531
[19] 2436532
[20] 2436533
[21] 2436534
[22] 2436535
[23] 2436536
[24] 2436537
[25] 2436538
[26] 2436539
[27] 2436540
[28] 2436541
[29] 2436542
[30] 2436543
[31] 2436544
[32] 2436545
#
# ls -la /sys/fs/bpf/perf_attr_map
-rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
# bpftool map | grep bperf | wc -l
64
#
# bpftool map | tail
1265: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1266: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1267: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 996
pids perf(2436545)
1268: percpu_array name accum_readings flags 0x0
key 4B value 24B max_entries 1 memlock 4096B
1269: hash name filter flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
1270: array name bperf_fo.bss flags 0x400
key 4B value 8B max_entries 1 memlock 4096B
btf_id 997
pids perf(2436541)
1285: array name pid_iter.rodata flags 0x480
key 4B value 4B max_entries 1 memlock 4096B
btf_id 1017 frozen
pids bpftool(2437504)
1286: array flags 0x0
key 4B value 32B max_entries 1 memlock 4096B
#
# bpftool map dump id 1268 | tail
value (CPU 21):
8f f3 bc ca 00 00 00 00 80 fd 2a d1 4d 00 00 00
80 fd 2a d1 4d 00 00 00
value (CPU 22):
7e d5 64 4d 00 00 00 00 a4 8a 2e ee 4d 00 00 00
a4 8a 2e ee 4d 00 00 00
value (CPU 23):
a7 78 3e 06 01 00 00 00 b2 34 94 f6 4d 00 00 00
b2 34 94 f6 4d 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
c6 8b d9 ca 00 00 00 00 20 c6 fc 83 4e 00 00 00
20 c6 fc 83 4e 00 00 00
value (CPU 22):
9c b4 d2 4d 00 00 00 00 3e 0c df 89 4e 00 00 00
3e 0c df 89 4e 00 00 00
value (CPU 23):
18 43 66 06 01 00 00 00 5b 69 ed 83 4e 00 00 00
5b 69 ed 83 4e 00 00 00
Found 1 element
# bpftool map dump id 1268 | tail
value (CPU 21):
f2 6e db ca 00 00 00 00 92 67 4c ba 4e 00 00 00
92 67 4c ba 4e 00 00 00
value (CPU 22):
dc 8e e1 4d 00 00 00 00 d9 32 7a c5 4e 00 00 00
d9 32 7a c5 4e 00 00 00
value (CPU 23):
bd 2b 73 06 01 00 00 00 7c 73 87 bf 4e 00 00 00
7c 73 87 bf 4e 00 00 00
Found 1 element
#
# perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
Performance counter stats for 'system wide':
119,410,122 cycles
152,105,479 instructions # 1.27 insn per cycle
0.101395093 seconds time elapsed
#
See? We had the counters enabled all the time.
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kernel-team@fb.com
Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
accumulated over the years.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com
Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit fixes from Shuah Khan:
"Two fixes to the kunit tool from David Gow"
* tag 'linux-kselftest-kunit-fixes-5.12-rc5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: Disable PAGE_POISONING under --alltests
kunit: tool: Fix a python tuple typing error
|
|
Out of the box Ubuntu's 20.04 compiler warns about missing return value
checks for fscanf() calls.
Make GCC happy by checking whether we actually parsed one integer.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broone@kernel.org>
Link: https://lore.kernel.org/r/20210319165334.29213-4-andre.przywara@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The GCC manual suggests to use -pthread, when linking with the PThread
library, also to add this switch to both the compilation and linking
stages.
Do as the manual says, to fix compilation with Ubuntu's 20.04 toolchain,
which was getting -lpthread too early on the command line:
------------
/usr/bin/ld: /tmp/cc5zbo2A.o: in function `execute_test':
tools/testing/selftests/arm64/mte/check_gcr_el1_cswitch.c:86:
undefined reference to `pthread_create'
/usr/bin/ld: tools/testing/selftests/arm64/mte/check_gcr_el1_cswitch.c:90:
undefined reference to `pthread_join'
------------
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broone@kernel.org>
Link: https://lore.kernel.org/r/20210319165334.29213-3-andre.przywara@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The mte selftest Makefile contains a check for GCC, to add the memtag
-march flag to the compiler options. This check fails if the compiler
is not explicitly specified, so reverts to the standard "cc", in which
case --version doesn't mention the "gcc" string we match against:
$ cc --version | head -n 1
cc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
This will not add the -march switch to the command line, so compilation
fails:
mte_helper.S: Assembler messages:
mte_helper.S:25: Error: selected processor does not support `irg x0,x0,xzr'
mte_helper.S:38: Error: selected processor does not support `gmi x1,x0,xzr'
...
Actually clang accepts the same -march option as well, so we can just
drop this check and add this unconditionally to the command line, to avoid
any future issues with this check altogether (gcc actually prints
basename(argv[0]) when called with --version).
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mark Brown <broone@kernel.org>
Link: https://lore.kernel.org/r/20210319165334.29213-2-andre.przywara@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Fix the following coccicheck warnings:
./tools/testing/selftests/firmware/fw_namespace.c:98:54-59: WARNING:
conversion to bool not needed here.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/1613639529-41139-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Skip BTF fixup step when input object file is missing BTF altogether.
Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org
|
|
Fix ~56 single-word typos in timekeeping & clocksource code comments.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-kernel@vger.kernel.org
|
|
Some versions of grep are happy to interpret a nonsensically placed "-"
within a "[]" pattern as a dash, while others give an error message.
This commit therefore places the "-" at the end of the expression where
it was supposed to be in the first place.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Currently, kvm-again.sh updates the duration in the "seconds=" comment
in the qemu-cmd file, but kvm-transform.sh updates the duration in the
actual qemu command arguments. This is an accident waiting to happen.
This commit therefore consolidates these updates into kvm-transform.sh.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The kvm-again.sh script does not copy over the vmlinux files due to
their large size. This means that a gdb run must use the vmlinux file
from the original "res" directory. This commit therefore finds that
directory and prints it out so that the user can copy and pasted the
gdb command just as for the initial run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Because the TORTURE_TRUST_MAKE environment variable is not recorded,
kvm-again.sh runs can result in the parse-build.sh script emitting
false-positive "BUG: TREE03 no build" messages. These messages are
intended to complain about any lack of compiler invocations when the
--trust-make flag is not given to kvm.sh. However, when this flag is
given to kvm.sh (and thus when TORTURE_TRUST_MAKE=y), lack of compiler
invocations is expected behavior when rebuilding from identical source
code.
This commit therefore makes kvm-test-1-run.sh record the value of the
TORTURE_TRUST_MAKE environment variable as an additional comment in the
qemu-cmd file, and also makes kvm-again.sh reconstitute that variable
from that comment.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
When rerunning an old run using kvm-again.sh, the jitter commands
will re-use the original "res" directory. This works, but is clearly
an accident waiting to happen. And this accident will happen with
remote runs, where the original directory lives on some other system.
This commit therefore updates the qemu-cmd commands to use the new res
directory created for this specific run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit adds a --duration argument to kvm-again.sh to allow the user
to override the --duration specified for the original kvm.sh run.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit adds a kvm-again.sh script that, given the results directory
of a torture-test run, re-runs that test. This means that the kernels
need not be rebuilt, but it also is a step towards running torture tests
on remote systems.
This commit also adds a kvm-test-1-run-batch.sh script that runs one
batch out of the torture test. The idea is to copy a results directory
tree to remote systems, then use kvm-test-1-run-batch.sh to run batches
on these systems.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit creates a "batches" file in the res/$ds directory, where $ds
is the datestamp. This file contains the batches and the number of CPUs,
for example:
1 TREE03 16
1 SRCU-P 8
2 TREE07 16
2 TREE01 8
3 TREE02 8
3 TREE04 8
3 TREE05 8
4 SRCU-N 4
4 TRACE01 4
4 TRACE02 4
4 RUDE01 2
4 RUDE01.2 2
4 TASKS01 2
4 TASKS03 2
4 SRCU-t 1
4 SRCU-u 1
4 TASKS02 1
4 TINY01 1
5 TINY02 1
5 TREE09 1
The first column is the batch number, the second the scenario number
(possibly suffixed by a repetition number, as in "RUDE01.2"), and the
third is the number of CPUs required by that scenario. The last line
shows the number of CPUs expected by this batch file, which allows
the run to be re-batched if a different number of CPUs is available.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Although it might be unlikely that someone would name a scenario
"TORTURE_SUITE", they are within their rights to do so. This script
therefore renames the "TORTURE_SUITE" file in the top-level date-stamped
directory within "res" to "torture_suite" to avoid this name collision.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit enforces the defacto restriction on scenario names, which is
that they contain neither "/", ".", nor lowercase alphabetic characters.
This restriction avoids collisions between scenario names and the torture
scripting's files and directories.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The convention that scenario names are all uppercase has two exceptions,
SRCU-t and SRCU-u. This commit therefore renames them to SRCU-T and
SRCU-U, respectively, to bring them in line with this convention. This in
turn permits tighter argument checking in the torture-test scripting.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
The cpus2use.sh script complains if the mpstat command is not available,
and instead uses all available CPUs. Unfortunately, this complaint
goes to stdout, where it confuses invokers who expect a single number.
This commit removes this error message in order to avoid this confusion.
The tendency of late has been to give rcutorture a full system, so this
should not cause issues.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit records the process IDs of the kvm-test-1-run.sh and
kvm-test-1-run-qemu.sh scripts to ease monitoring of remotely running
instances of these scripts.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Distributed runs of rcutorture will need to start and stop jittering on
the remote hosts, which means that the commands must be communicated to
those hosts. The commit therefore causes kvm.sh to place these commands
in new TORTURE_JITTER_START and TORTURE_JITTER_STOP environment variables
to communicate them to the scripts that will set this up. In addition,
this commit causes kvm-test-1-run.sh to append these commands to each
generated qemu-cmd file, which allows any remotely executing script to
extract the needed commands from this file.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Currently, kvm-test-1-run.sh both builds and runs an rcutorture kernel,
which is inconvenient when it is necessary to re-run an old run or to
carry out a run on a remote system. This commit therefore extracts the
portion of kvm-test-1-run.sh that invoke qemu to actually run rcutorture
and places it in kvm-test-1-run-qemu.sh.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
When re-running old rcutorture builds, if the original run involved
gdb, the re-run also needs to do so. This commit therefore records the
TORTURE_KCONFIG_GDB_ARG environment variable into the qemu-cmd file so
that the re-run can access it.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit creates jitterstart.sh and jitterstop.sh scripts that handle
the starting and stopping of the jitter.sh scripts. These must be sourced
using the bash "." command to allow the generated script to wait on the
backgrounded jitter.sh scripts.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
Trivial fix.
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The "First Fault Register" (FFR) is an SVE register that mimics a
predicate register, but clears bits when a load or store fails to handle
an element of a vector. The supposed usage scenario is to initialise
this register (using SETFFR), then *read* it later on to learn about
elements that failed to load or store. Explicit writes to this register
using the WRFFR instruction are only supposed to *restore* values
previously read from the register (for context-switching only).
As the manual describes, this register holds only certain values, it:
"... contains a monotonic predicate value, in which starting from bit 0
there are zero or more 1 bits, followed only by 0 bits in any remaining
bit positions."
Any other value is UNPREDICTABLE and is not supposed to be "restored"
into the register.
The SVE test currently tries to write a signature pattern into the
register, which is *not* a canonical FFR value. Apparently the existing
setups treat UNPREDICTABLE as "read-as-written", but a new
implementation actually only stores canonical values. As a consequence,
the sve-test fails immediately when comparing the FFR value:
-----------
# ./sve-test
Vector length: 128 bits
PID: 207
Mismatch: PID=207, iteration=0, reg=48
Expected [cf00]
Got [0f00]
Aborted
-----------
Fix this by only populating the FFR with proper canonical values.
Effectively the requirement described above limits us to 17 unique
values over 16 bits worth of FFR, so we condense our signature down to 4
bits (2 bits from the PID, 2 bits from the generation) and generate the
canonical pattern from it. Any bits describing elements above the
minimum 128 bit are set to 0.
This aligns the FFR usage to the architecture and fixes the test on
microarchitectures implementing FFR in a more restricted way.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210319120128.29452-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Conflicts:
arch/x86/kernel/kprobes/ftrace.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2021-03-20
The following pull-request contains BPF updates for your *net* tree.
We've added 5 non-merge commits during the last 3 day(s) which contain
a total of 8 files changed, 155 insertions(+), 12 deletions(-).
The main changes are:
1) Use correct nops in fexit trampoline, from Stanislav.
2) Fix BTF dump, from Jean-Philippe.
3) Fix umd memory leak, from Zqiang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bpftool used to issue forward declarations for a struct used as part of
a pointer to array, which is invalid. Add a test to check that the
struct is fully defined in this case:
@@ -134,9 +134,9 @@
};
};
-struct struct_in_array {};
+struct struct_in_array;
-struct struct_in_array_typed {};
+struct struct_in_array_typed;
typedef struct struct_in_array_typed struct_in_array_t[2];
@@ -189,3 +189,7 @@
struct struct_with_embedded_stuff _14;
};
+struct struct_in_array {};
+
+struct struct_in_array_typed {};
+
...
#13/1 btf_dump: syntax:FAIL
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319112554.794552-3-jean-philippe@linaro.org
|
|
The vmlinux.h generated from BTF is invalid when building
drivers/phy/ti/phy-gmii-sel.c with clang:
vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’
61702 | const struct reg_field (*regfields)[3];
| ^~~~~~~~~
bpftool generates a forward declaration for this struct regfield, which
compilers aren't happy about. Here's a simplified reproducer:
struct inner {
int val;
};
struct outer {
struct inner (*ptr_to_array)[2];
} A;
After build with clang -> bpftool btf dump c -> clang/gcc:
./def-clang.h:11:23: error: array has incomplete element type 'struct inner'
struct inner (*ptr_to_array)[2];
Member ptr_to_array of struct outer is a pointer to an array of struct
inner. In the DWARF generated by clang, struct outer appears before
struct inner, so when converting BTF of struct outer into C, bpftool
issues a forward declaration to struct inner. With GCC the DWARF info is
reversed so struct inner gets fully defined.
That forward declaration is not sufficient when compilers handle an
array of the struct, even when it's only used through a pointer. Note
that we can trigger the same issue with an intermediate typedef:
struct inner {
int val;
};
typedef struct inner inner2_t[2];
struct outer {
inner2_t *ptr_to_array;
} A;
Becomes:
struct inner;
typedef struct inner inner2_t[2];
And causes:
./def-clang.h:10:30: error: array has incomplete element type 'struct inner'
typedef struct inner inner2_t[2];
To fix this, clear through_ptr whenever we encounter an intermediate
array, to make the inner struct part of a strong link and force full
declaration.
Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org
|
|
Similar to
https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/
When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g:
DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
.field_name = var_ident,
.indent_level = 2,
.strip_mods = strip_mods,
);
and compiled in debug mode, the compiler generates code which
leaves the padding uninitialized and triggers errors within libbpf APIs
which require strict zero initialization of OPTS structs.
Adding anonymous padding field fixes the issue.
Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org
|
|
The ECN bit defines ECT(1) = 1, ECT(0) = 2. So inner 0x02 + outer 0x01
should be inner ECT(0) + outer ECT(1). Based on the description of
__INET_ECN_decapsulate, the final decapsulate value should be
ECT(1). So fix the test expect value to 0x01.
Before the fix:
TEST: VXLAN: ECN decap: 01/02->0x02 [FAIL]
Expected to capture 10 packets, got 0.
After the fix:
TEST: VXLAN: ECN decap: 01/02->0x01 [ OK ]
Fixes: a0b61f3d8ebf ("selftests: forwarding: vxlan_bridge_1d: Add an ECN decap test")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SGX device file (/dev/sgx_enclave) is unusual in that it requires
execute permissions. It has to be both "chmod +x" *and* be on a
filesystem without 'noexec'.
In the future, udev and systemd should get updates to set up systems
automatically. But, for now, nobody's systems do this automatically,
and everybody gets error messages like this when running ./test_sgx:
0x0000000000000000 0x0000000000002000 0x03
0x0000000000002000 0x0000000000001000 0x05
0x0000000000003000 0x0000000000003000 0x03
mmap() failed, errno=1.
That isn't very user friendly, even for forgetful kernel developers.
Further, the test case is rather haphazard about its use of fprintf()
versus perror().
Improve the error messages. Use perror() where possible. Lastly,
do some sanity checks on opening and mmap()ing the device file so
that we can get a decent error message out to the user.
Now, if your user doesn't have permission, you'll get the following:
$ ls -l /dev/sgx_enclave
crw------- 1 root root 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
Unable to open /dev/sgx_enclave: Permission denied
If you then 'chown dave:dave /dev/sgx_enclave' (or whatever), but
you leave execute permissions off, you'll get:
$ ls -l /dev/sgx_enclave
crw------- 1 dave dave 10, 126 Mar 18 11:29 /dev/sgx_enclave
$ ./test_sgx
no execute permissions on device file
If you fix that with "chmod ug+x /dev/sgx" but you leave /dev as
noexec, you'll get this:
$ mount | grep "/dev .*noexec"
udev on /dev type devtmpfs (rw,nosuid,noexec,...)
$ ./test_sgx
ERROR: mmap for exec: Operation not permitted
mmap() succeeded for PROT_READ, but failed for PROT_EXEC
check that user has execute permissions on /dev/sgx_enclave and
that /dev does not have noexec set: 'mount | grep "/dev .*noexec"'
That can be fixed with:
mount -o remount,noexec /devESC
Hopefully, the combination of better error messages and the search
engines indexing this message will help people fix their systems
until we do this properly.
[ bp: Improve error messages more. ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/r/20210318194301.11D9A984@viggo.jf.intel.com
|
|
s/verfied/verified/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add Makefile infra to specify multi-file BPF object files (and derivative
skeletons). Add first selftest validating BPF static linker can merge together
successfully two independent BPF object files and resulting object and
skeleton are correct and usable.
Use the same F(F(F(X))) = F(F(X)) identity test on linked object files as for
the case of single BPF object files.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-13-andrii@kernel.org
|
|
Pass all individual BPF object files (generated from progs/*.c) through
`bpftool gen object` command to validate that BPF static linker doesn't
corrupt them.
As an additional sanity checks, validate that passing resulting object files
through linker again results in identical ELF files. Exact same ELF contents
can be guaranteed only after two passes, as after the first pass ELF sections
order changes, and thus .BTF.ext data sections order changes. That, in turn,
means that strings are added into the final BTF string sections in different
order, so .BTF strings data might not be exactly the same. But doing another
round of linking afterwards should result in the identical ELF file, which is
checked with additional `diff` command.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-12-andrii@kernel.org
|
|
Trigger vmlinux.h and BPF skeletons re-generation if detected that bpftool was
re-compiled. Otherwise full `make clean` is required to get updated skeletons,
if bpftool is modified.
Fixes: acbd06206bbb ("selftests/bpf: Add vmlinux.h selftest exercising tracing of syscalls")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-11-andrii@kernel.org
|
|
Add `bpftool gen object <output-file> <input_file>...` command to statically
link multiple BPF ELF object files into a single output BPF ELF object file.
This patch also updates bash completions and man page. Man page gets a short
section on `gen object` command, but also updates the skeleton example to show
off workflow for BPF application with two .bpf.c files, compiled individually
with Clang, then resulting object files are linked together with `gen object`,
and then final object file is used to generate usable BPF skeleton. This
should help new users understand realistic workflow w.r.t. compiling
mutli-file BPF application.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-10-andrii@kernel.org
|
|
Add optional name OBJECT_NAME parameter to `gen skeleton` command to override
default object name, normally derived from input file name. This allows much
more flexibility during build time.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-9-andrii@kernel.org
|
|
Add .BTF and .BTF.ext static linking logic.
When multiple BPF object files are linked together, their respective .BTF and
.BTF.ext sections are merged together. BTF types are not just concatenated,
but also deduplicated. .BTF.ext data is grouped by type (func info, line info,
core_relos) and target section names, and then all the records are
concatenated together, preserving their relative order. All the BTF type ID
references and string offsets are updated as necessary, to take into account
possibly deduplicated strings and types.
BTF DATASEC types are handled specially. Their respective var_secinfos are
accumulated separately in special per-section data and then final DATASEC
types are emitted at the very end during bpf_linker__finalize() operation,
just before emitting final ELF output file.
BTF data can also provide "section annotations" for some extern variables.
Such concept is missing in ELF, but BTF will have DATASEC types for such
special extern datasections (e.g., .kconfig, .ksyms). Such sections are called
"ephemeral" internally. Internally linker will keep metadata for each such
section, collecting variables information, but those sections won't be emitted
into the final ELF file.
Also, given LLVM/Clang during compilation emits BTF DATASECS that are
incomplete, missing section size and variable offsets for static variables,
BPF static linker will initially fix up such DATASECs, using ELF symbols data.
The final DATASECs will preserve section sizes and all variable offsets. This
is handled correctly by libbpf already, so won't cause any new issues. On the
other hand, it's actually a nice property to have a complete BTF data without
runtime adjustments done during bpf_object__open() by libbpf. In that sense,
BPF static linker is also a BTF normalizer.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org
|
|
Introduce BPF static linker APIs to libbpf. BPF static linker allows to
perform static linking of multiple BPF object files into a single combined
resulting object file, preserving all the BPF programs, maps, global
variables, etc.
Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are
concatenated together. Similarly, code sections are also concatenated. All the
symbols and ELF relocations are also concatenated in their respective ELF
sections and are adjusted accordingly to the new object file layout.
Static variables and functions are handled correctly as well, adjusting BPF
instructions offsets to reflect new variable/function offset within the
combined ELF section. Such relocations are referencing STT_SECTION symbols and
that stays intact.
Data sections in different files can have different alignment requirements, so
that is taken care of as well, adjusting sizes and offsets as necessary to
satisfy both old and new alignment requirements.
DWARF data sections are stripped out, currently. As well as LLLVM_ADDRSIG
section, which is ignored by libbpf in bpf_object__open() anyways. So, in
a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty
nice property, especially if resulting .o file is then used to generate BPF
skeleton.
Original string sections are ignored and instead we construct our own set of
unique strings using libbpf-internal `struct strset` API.
To reduce the size of the patch, all the .BTF and .BTF.ext processing was
moved into a separate patch.
The high-level API consists of just 4 functions:
- bpf_linker__new() creates an instance of BPF static linker. It accepts
output filename and (currently empty) options struct;
- bpf_linker__add_file() takes input filename and appends it to the already
processed ELF data; it can be called multiple times, one for each BPF
ELF object file that needs to be linked in;
- bpf_linker__finalize() needs to be called to dump final ELF contents into
the output file, specified when bpf_linker was created; after
bpf_linker__finalize() is called, no more bpf_linker__add_file() and
bpf_linker__finalize() calls are allowed, they will return error;
- regardless of whether bpf_linker__finalize() was called or not,
bpf_linker__free() will free up all the used resources.
Currently, BPF static linker doesn't resolve cross-object file references
(extern variables and/or functions). This will be added in the follow up patch
set.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org
|
|
Add btf__add_type() API that performs shallow copy of a given BTF type from
the source BTF into the destination BTF. All the information and type IDs are
preserved, but all the strings encountered are added into the destination BTF
and corresponding offsets are rewritten. BTF type IDs are assumed to be
correct or such that will be (somehow) modified afterwards.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org
|
|
Extract BTF logic for maintaining a set of strings data structure, used for
BTF strings section construction in writable mode, into separate re-usable
API. This data structure is going to be used by bpf_linker to maintains ELF
STRTAB section, which has the same layout as BTF strings section.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org
|
|
Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details
of dynamically resizable memory to use libbpf_ prefix, as they are not
BTF-specific. No functional changes.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org
|
|
Extract and generalize the logic to iterate BTF type ID and string offset
fields within BTF types and .BTF.ext data. Expose this internally in libbpf
for re-use by bpf_linker.
Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE
access strings), which was previously missing. There previously was no
case of deduplicating .BTF.ext data, but bpf_linker is going to use it.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org
|
|
btf_type_by_id() is internal-only convenience API returning non-const pointer
to struct btf_type. Expose it outside of btf.c for re-use.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org
|