Age | Commit message (Collapse) | Author |
|
This patch adds new *loadall* command which slightly differs from the
existing *load*. *load* command loads all programs from the obj file,
but pins only the first programs. *loadall* pins all programs from the
obj file under specified directory.
The intended usecase is flow_dissector, where we want to load a bunch
of progs, pin them all and after that construct a jump table.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
pin_name is the same as section_name where '/' is replaced
by '_'. bpf_object__pin_programs is converted to use pin_name
to avoid the situation where section_name would require creating another
subdirectory for a pin (as, for example, when calling bpf_object__pin_programs
for programs in sections like "cgroup/connect6").
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When bpf_program has only one instance, don't create a subdirectory with
per-instance pin files (<prog>/0). Instead, just create a single pin file
for that single instance. This simplifies object pinning by not creating
unnecessary subdirectories.
This can potentially break existing users that depend on the case
where '/0' is always created. However, I couldn't find any serious
usage of bpf_program__pin inside the kernel tree and I suppose there
should be none outside.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpftool will use bpf_object__pin in the next commits to pin all programs
and maps from the file; in case of a partial failure, we need to get
back to the clean state (undo previous program/map pins).
As part of a cleanup, I've added and exported separate routines to
pin all maps (bpf_object__pin_maps) and progs (bpf_object__pin_programs)
of an object.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Makes it compatible with the logic that derives program type
from section name in libbpf_prog_type_by_name.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
add a script that adds a ipsec tunnel between two network
namespaces plus following policies:
.0/24 -> ipsec tunnel
.240/28 -> bypass
.253/32 -> ipsec tunnel
Then check that .254 bypasses tunnel (match /28 exception),
and .2 (match /24) and .253 (match direct policy) pass through the
tunnel.
Abuses iptables to check if ping did resolve an ipsec policy or not.
Also adds a bunch of 'block' rules that are not supposed to match.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
At commit deee2cae27d1 ("kselftests/bpf: use ping6 as the default ipv6 ping
binary if it exists"), it fixed similar issues for shell script, but it
missed a same issue in the C code.
Fixes: 371e4fcc9d96 ("selftests/bpf: cgroup local storage-based network counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
CC: Philip Li <philip.li@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
bpftool output is not user friendly when dumping a map with only a few
populated entries:
$ bpftool map
1: devmap name tx_devmap flags 0x0
key 4B value 4B max_entries 64 memlock 4096B
2: array name tx_idxmap flags 0x0
key 4B value 4B max_entries 64 memlock 4096B
$ bpftool map dump id 1
key:
00 00 00 00
value:
No such file or directory
key:
01 00 00 00
value:
No such file or directory
key:
02 00 00 00
value:
No such file or directory
key: 03 00 00 00 value: 03 00 00 00
Handle ENOENT by keeping the line format sane and dumping
"<no entry>" for the value
$ bpftool map dump id 1
key: 00 00 00 00 value: <no entry>
key: 01 00 00 00 value: <no entry>
key: 02 00 00 00 value: <no entry>
key: 03 00 00 00 value: 03 00 00 00
...
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This patch provides a tcp_bpf based eBPF sample. The test
- ncat(1) as the TCP client program to connect() to a port
with the intention of triggerring SYN retransmissions: we
first install an iptables DROP rule to make sure ncat SYNs are
resent (instead of aborting instantly after a TCP RST)
- has a bpf kernel module that sends a perf-event notification for
each TCP retransmit, and also tracks the number of such notifications
sent in the global_map
The test passes when the number of event notifications intercepted
in user-space matches the value in the global_map.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Update references to other bpftool man pages at the bottom of each
manual page. Also reference the "bpf(2)" and "bpf-helpers(7)" man pages.
References are sorted by number of man section, then by
"prog-and-map-go-first", the other pages in alphabetical order.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Function open_obj_pinned() prints error messages when it fails to open a
link in the BPF virtual file system. However, in some occasions it is
not desirable to print an error, for example when we parse all links
under the bpffs root, and the error is due to some paths actually being
symbolic links.
Example output:
# ls -l /sys/fs/bpf/
lrwxrwxrwx 1 root root 0 Oct 18 19:00 ip -> /sys/fs/bpf/tc/
drwx------ 3 root root 0 Oct 18 19:00 tc
lrwxrwxrwx 1 root root 0 Oct 18 19:00 xdp -> /sys/fs/bpf/tc/
# bpftool --bpffs prog show
Error: bpf obj get (/sys/fs/bpf): Permission denied
Error: bpf obj get (/sys/fs/bpf): Permission denied
# strace -e bpf bpftool --bpffs prog show
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/ip", bpf_fd=0}, 72) = -1 EACCES (Permission denied)
Error: bpf obj get (/sys/fs/bpf): Permission denied
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/xdp", bpf_fd=0}, 72) = -1 EACCES (Permission denied)
Error: bpf obj get (/sys/fs/bpf): Permission denied
...
To fix it, pass a bool as a second argument to the function, and prevent
it from printing an error when the argument is set to true.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Edit the documentation of the -f|--bpffs option to make it explicit that
it dumps paths of pinned programs when bpftool is used to list the
programs only, so that users do not believe they will see the name of
the newly pinned program with "bpftool prog pin" or "bpftool prog load".
Also fix the plain output: do not add a blank line after each program
block, in order to remain consistent with what bpftool does when the
option is not passed.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Function getline() returns -1 on failure to read a line, thus creating
an infinite loop in get_fdinfo() if the key is not found. Fix it by
calling the function only as long as we get a strictly positive return
value.
Found by copying the code for a key which is not always present...
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Commit f6f3bac08ff9 ("tools/bpf: bpftool: add net support")
added certain networking support to bpftool.
The implementation relies on a relatively recent uapi header file
linux/tc_act/tc_bpf.h on the host which contains the marco
definition of TCA_ACT_BPF_ID.
Unfortunately, this is not the case for all distributions.
See the email message below where rhel-7.2 does not have
an up-to-date linux/tc_act/tc_bpf.h.
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1799211.html
Further investigation found that linux/pkt_cls.h is also needed for macro
TCA_BPF_TAG.
This patch fixed the issue by copying linux/tc_act/tc_bpf.h
and linux/pkt_cls.h from kernel include/uapi directory to
tools/include/uapi directory so building the bpftool does not depend
on host system for these files.
Fixes: f6f3bac08ff9 ("tools/bpf: bpftool: add net support")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This reduces the size of the init executable from ~800 kB to ~800 bytes
on x86_64. This is only implemented for x86_64, i386, arm and arm64.
Others not tested.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This is a definition of the most common syscalls needed in minimalist
init executables, allowing to statically build them with no external
dependencies. It is sufficient in its current form to build rcutorture's
init on x86_64, i386, arm, and arm64. Others have not been ported or
tested. Updates may be found here :
http://git.formilux.org/?p=people/willy/nolibc.git
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
If the build fails, we can end up with an empty initrd directory which
prevents the build script from operating again. Better rely on the
resulting init executable instead.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Strip using -s on the compiler command line instead of calling the "strip"
utility as the latter isn't necessarily compatible with the target arch.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This adds the CROSS_COMPILE environment to the initrd.sh script's
gcc command to enable cross compilation.
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Currently, the initrd/init script and executable remain blocked almost
all the time. However, it is necessary to test nohz_full userspace
execution, which both variants of initrd/init fail to do. This commit
therefore causes initrd/init to spend about a millisecond per second
executing in userspace.
Reported-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The support for creating initrd directories using dracut is a great
improvement over having to always hand-create them, it is a bit annoying
to have to install some otherwise irrelevant package just to be able to
run rcutorture. This commit therefore adds support for creating initrd
directories on systems innocent of dracut. You do need gcc, but then
again you need that to build the kernel (or to build llvm) in any case.
The idea is to create an initrd directory containing nothing but a
statically linked binary having a for-loop over a long-term sleep().
The result is a Linux kernel with almost no userspace: even the
time-honored /dev, /lib, /tmp, and /usr directories are gone. In fact,
the only directory present is "/", but only because I don't know how to
get rid of it, at least short of not having an initrd in the first place.
Although statically linked binaries are much maligned, and rightly so,
their disadvantages seem to be irrelevant for this particular use case.
From https://www.akkadia.org/drepper/no_static_linking.html:
1. Fixes are difficult to apply to hordes of widely scattered
statically linked binaries. But in this case, there is only one
binary, but there would otherwise be no fewer than four libraries.
2. Security measures like local address randomization cannot be used.
Prudence prevents me from asserting that it is impossible to
base a remote attack on a networking-free rcutorture instance.
Nevertheless, bonus points to the first person who comes up with
such an attack!
3. More efficient use of physical memory. Not in this case, given
that libc is 1.8MB and the statically linked binary "only" 800K.
4. Features such as locales, name service switch (NSS),
internationalized domain names (IDN) tool, and so on require
dynamic linking. Bonus points to the first person coming up
with a valid rcutorture use case requiring these features in
its initrd.
5. Accidental violations of (L)GPL. Actually, this change actually
helps -avoid- such violations by reducing the temptation to
pass around tarballs of rcutorture-ready initrd directories.
After all, the rcutorture scripts automatically create an initrd
directory for you, so why bother with the tarballs?
6. Tools and hacks like ltrace, LD_PRELOAD, LD_PROFILE, and LD_AUDIT
don't work. Again, bonus points to the first person coming up
with a valid rcutorture use case requiring these features in
its initrd.
Nevertheless, the script will use dracut if available, and will create the
statically linked binary only when dracut are missing. Those preferring
the smaller initrd directory resulting from the statically linked binary
(like me) are free to hand-edit mkinitrd.sh to remove the code using
dracut. ;-)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The rcutorture scripts currently expect the user to create the
tools/testing/selftests/rcutorture/initrd directory. Should the user
fail to do this, the kernel build will fail with obscure and confusing
error messages. This commit therefore adds explicit checks for the
tools/testing/selftests/rcutorture/initrd directory, and if not present,
creates one on systems on which dracut is installed. If this directory
could not be created, a less obscure error message is emitted and the
test is aborted.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Connor Shu <Connor.Shu@ibm.com>
[ paulmck: Adapt the script to fit into the rcutorture framework and
severely abbreviate the initrd/init script. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Introduce eight tests, for FoU and GUE, with IPv4 and IPv6 payload,
on IPv4 and IPv6 transport, that check that PMTU exceptions are created
with the right value when exceeding the MTU on a link of the path.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use a router between endpoints, implemented via namespaces, set a low MTU
between router and destination endpoint, exceed it and check PMTU value in
route exceptions.
v2:
- Introduce IPv4 tests right away, if iproute2 doesn't support the 'df'
link option they will be skipped (David Ahern)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use a router between endpoints, implemented via namespaces, set a low MTU
between router and destination endpoint, exceed it and check PMTU value in
route exceptions.
v2:
- Change all occurrences of VxLAN to VXLAN (Jiri Benc)
- Introduce IPv4 tests right away, if iproute2 doesn't support the 'df'
link option they will be skipped (David Ahern)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix warnings found using static analysis with cppcheck, use %d printf
format specifier for signed ints rather than %u
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Extends the existing udp programs to allow checking for proper
GRO aggregation/GSO size, and run the tests via a shell script, using
a veth pair with XDP program attached to trigger the GRO code path.
rfc v3 -> v1:
- use ip route to attach the xdp helper to the veth
rfc v2 -> rfc v3:
- add missing test program options documentation
- fix sporatic test failures (receiver faster than sender)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Run on top of veth pair, using a dummy XDP program to enable the GRO.
rfc v3 -> v1:
- use ip route to attach the xdp helper to the veth
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This trivial XDP program does nothing, but will be used by the
next patch to test the GRO path in a net namespace, leveraging
the veth XDP implementation.
It's added here, despite its 'net' usage, to avoid the duplication
of the llc-related makefile boilerplate.
rfc v3 -> v1:
- move the helper implementation into the bpf directory, don't
touch udpgso_bench_rx
rfc v2 -> rfc v3:
- move 'x' option handling here
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
And fix a couple of buglets (port option processing,
clean termination on SIGINT). This is preparatory work
for GRO tests.
rfc v2 -> rfc v3:
- use ETH_MAX_MTU
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The limit for memory locked in the kernel by a process is usually set to
64 kbytes by default. This can be an issue when creating large BPF maps
and/or loading many programs. A workaround is to raise this limit for
the current process before trying to create a new BPF map. Changing the
hard limit requires the CAP_SYS_RESOURCE and can usually only be done by
root user (for non-root users, a call to setrlimit fails (and sets
errno) and the program simply goes on with its rlimit unchanged).
There is no API to get the current amount of memory locked for a user,
therefore we cannot raise the limit only when required. One solution,
used by bcc, is to try to create the map, and on getting a EPERM error,
raising the limit to infinity before giving another try. Another
approach, used in iproute2, is to raise the limit in all cases, before
trying to create the map.
Here we do the same as in iproute2: the rlimit is raised to infinity
before trying to load programs or to create maps with bpftool.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
libbpf is now able to load successfully test_l4lb_noinline.o and
samples/bpf/tracex3_kern.o.
For the test_l4lb_noinline, uncomment related tests from test_libbpf.c
and remove the associated "TODO".
For tracex3_kern.o, instead of loading a program from samples/bpf/ that
might not have been compiled at this stage, try loading a program from
BPF selftests. Since this test case is about loading a program compiled
without the "-target bpf" flag, change the Makefile to compile one
program accordingly (instead of passing the flag for compiling all
programs).
Regarding test_xdp_noinline.o: in its current shape the program fails to
load because it provides no version section, but the loader needs one.
The test was added to make sure that libbpf could load XDP programs even
if they do not provide a version number in a dedicated section. But
libbpf is already capable of doing that: in our case loading fails
because the loader does not know that this is an XDP program (it does
not need to, since it does not attach the program). So trying to load
test_xdp_noinline.o does not bring much here: just delete this subtest.
For the record, the error message obtained with tracex3_kern.o was
fixed by commit e3d91b0ca523 ("tools/libbpf: handle issues with bpf ELF
objects containing .eh_frames")
I have not been abled to reproduce the "libbpf: incorrect bpf_call
opcode" error for test_l4lb_noinline.o, even with the version of libbpf
present at the time when test_libbpf.sh and test_libbpf_open.c were
created.
RFC -> v1:
- Compile test_xdp without the "-target bpf" flag, and try to load it
instead of ../../samples/bpf/tracex3_kern.o.
- Delete test_xdp_noinline.o subtest.
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent improvements and fixes from Arnaldo Carvalho de Melo:
Intel PT SQL viewer: (Adrian Hunter)
- Fall back to /usr/local/lib/libxed.so
- Add Selected branches report
- Add help window
- Fix table find when table re-ordered
Intel PT debug log (Adrian Hunter)
- Add more event information
- Add MTC and CYC timestamps
perf record: (Andi Kleen)
- Support weak groups, just like with 'perf stat'
perf trace: (Arnaldo Carvalho de Melo)
- Start augmenting raw_syscalls:{sys_enter,sys_exit}: goal is to have a
generic, arch independent eBPF kernel component that is programmed with
syscall table details, what to copy, how many bytes, pid, arg filters from the
userspace via eBPF maps by the 'perf trace' tool that continues to use all its
argument beautifiers, just taking advantage of the extra pointer contents.
JVMTI: (Gustavo Romero)
- Fix undefined symbol scnprintf in libperf-jvmti.so
perf top: (Jin Yao)
- Display the LBR stats in callchain entries
perf stat: (Thomas Richter)
- Handle different PMU names with common prefix
arm64: Will (Deacon)
- Fix arm64 tools build failure wrt smp_load_{acquire,release}.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
ir-loopback can transmit IR on one rc device and check the correct
scancode and protocol is decoded on a different rc device. This can be
used to check IR transmission between two rc devices. Using rc-loopback,
we use it to check the IR encoders and decoders themselves.
No hardware is required for this test.
Signed-off-by: Sean Young <sean@mess.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
|
|
So user could specify outside CFLAGS values.
Cc: Thomas Renninger <trenn@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
|
|
Adding CFLAGS and LDFLAGS to be used during the build.
Cc: Thomas Renninger <trenn@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
|
|
Rename duplicate sysfs_read_file into cpupower_read_sysfs and fix linking.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Thomas Renninger <trenn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
|
|
Andi reported following malfunction:
# perf record -e '{ref-cycles,cycles}:S' -a sleep 1
# perf script
non matching sample_id_all
That's because we disable sample_id_all bit for non-sampling group
members. We can't do that, because it needs to be the same over the
whole event list. This patch keeps it untouched again.
Reported-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180923150420.27327-1-jolsa@kernel.org
Fixes: e9add8bac6c6 ("perf evsel: Disable write_backward for leader sampling group events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is the commit that porting the perf for nds32.
1.Raw event:
The raw events start with 'r'.
Usage:
perf stat -e rXYZ ./app
X: the index of performance counter.
YZ: the index(convert to hexdecimal) of events
Example:
'perf stat -e r101 ./app' means the counter 1 will count the instruction
event.
The index of counter and events can be found in
"Andes System Privilege Architecture Version 3 Manual".
Or you can perform the 'perf list' to find the symbolic name of raw events.
2.Perf mmap2:
Fix unexpected perf mmap2() page fault
When the mmap2() called by perf application,
you will encounter such condition:"failed to write."
With return value -EFAULT
This is due to the page fault caused by "reading" buffer
from the mapped legal address region to write to the descriptor.
The page_fault handler will get a VM_FAULT_SIGBUS return value,
which should not happens here.(Due to this is a read request.)
You can refer to kernel/events/core.c:perf_mmap_fault(...)
If "(vmf->pgoff && (vmf->flags & FAULT_FLAG_WRITE))" is evaluated
as true, you will get VM_FAULT_SIGBUS as return value.
However, this is not an write request. The flags which indicated
why the page fault happens is wrong.
Furthermore, NDS32 SPAv3 is not able to detect it is read or write.
It only know either it is instruction fetch or data access.
Therefore, by removing the wrong flag assignment(actually, the hardware
is not able to show the reason), we can fix this bug.
3.Perf multiple events map to same counter.
When there are multiple events map to the same counter, the counter
counts inaccurately. This is because each counter only counts one event
in the same time.
So when there are multiple events map to same counter, they have to take
turns in each context.
There are two solution:
1. Print the error message when multiple events map to the same counter.
But print the error message would let the program hang in loop. The ltp
(linux test program) would be failed when the program hang in loop.
2. Don't print the error message, the ltp would pass. But the user need to
have the knowledge that don't count the events which map to the same
counter, or the user will get the inaccurate results.
We choose method 2 for the solution
Signed-off-by: Nickhu <nickhu@andestech.com>
Acked-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Greentime Hu <greentime@andestech.com>
|
|
KASAN reports following global out of bounds access while
nfit_test is being loaded. The out of bound access happens
the following reference to dimm_fail_cmd_flags[dimm]. 'dimm' is
over than the index value, NUM_DCR (==5).
static int override_return_code(int dimm, unsigned int func, int rc)
{
if ((1 << func) & dimm_fail_cmd_flags[dimm]) {
dimm_fail_cmd_flags[] definition:
static unsigned long dimm_fail_cmd_flags[NUM_DCR];
'dimm' is the return value of get_dimm(), and get_dimm() returns
the index of handle[] array. The handle[] has 7 index. Let's use
ARRAY_SIZE(handle) as the array size.
KASAN report:
==================================================================
BUG: KASAN: global-out-of-bounds in nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
Read of size 8 at addr ffffffffc10cbbe8 by task kworker/u41:0/8
...
Call Trace:
dump_stack+0xea/0x1b0
? dump_stack_print_info.cold.0+0x1b/0x1b
? kmsg_dump_rewind_nolock+0xd9/0xd9
print_address_description+0x65/0x22e
? nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
kasan_report.cold.6+0x92/0x1a6
nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
...
The buggy address belongs to the variable:
dimm_fail_cmd_flags+0x28/0xffffffffffffa440 [nfit_test]
==================================================================
Fixes: 39611e83a28c ("tools/testing/nvdimm: Make DSM failure code injection...")
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Currently jvmti agent can not be used because function scnprintf is not
present in the agent libperf-jvmti.so. As a result the JVM when using
such agent to record JITed code profiling information will fail on
looking up scnprintf:
java: symbol lookup error: lib/libperf-jvmti.so: undefined symbol: scnprintf
This commit fixes that by reverting to the use of snprintf, that can be
looked up, instead of scnprintf, adding a proper check for the returned
value in order to print a better error message when the jitdump file
pathname is too long. Checking the returned value also helps to comply
with some recent gcc versions, like gcc8, which will fail due to
truncated writing checks related to the -Werror=format-truncation= flag.
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 1541117601-18937-2-git-send-email-gromero@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-mvpxxxy7wnzaj74cq75muw3f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Guenter reported that using ARCH=x86_64 to build perf has regressed:
$ make -C tools/perf O=/tmp/build/perf ARCH=x86_64
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
<SNIP>
... bpf: [ on ]
GEN /tmp/build/perf/common-cmds.h
make[2]: *** No rule to make target '/home/acme/git/perf/tools/arch/x86_64/include/uapi/asm//mman.h', needed by '/tmp/build/perf/trace/beauty/generated/mmap_flags_array.c'. Stop.
make[2]: *** Waiting for unfinished jobs....
PERF_VERSION = 4.19.gf6c23e3
make[1]: *** [Makefile.perf:207: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
This is because we must use $(SRCARCH) where we were using $(ARCH), so
that, just like the top level Makefile, we get this done:
# Additional ARCH settings for x86
ifeq ($(ARCH),i386)
SRCARCH := x86
endif
ifeq ($(ARCH),x86_64)
SRCARCH := x86
endif
Which is done in tools/scripts/Makefile.arch, so switch to use
$(SRCARCH).
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: fbd7458db757 ("perf beauty: Wire up the mmap flags table generator to the Makefile")
Link: https://lkml.kernel.org/r/20181105184612.GD7077@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One cause of decoding errors is un-synchronized side-band data.
Timestamps are needed to debug such cases. TSC packet timestamps are
logged. Log also MTC and CYC timestamps.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20181105073505.8129-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
More event information is useful for debugging, especially MMAP events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20181105073505.8129-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
re-ordered
Table rows can be re-ordered by selecting a column to sort by. After
re-ordering, the "find" operation was highlighting the wrong row, fix
it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a window to display help. It is also possible to display the help
only, by using the option "--help-only" instead of a database name.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fetching data from the database can be slow. Add a report that provides
the ability to select a subset of branches.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
/usr/local/lib/libxed.so
Fall back to /usr/local/lib/libxed.so to cater for distributions that do
not have /usr/local/lib in the library path by default.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf report' has supported the displaying of LBR stats (such as cycles,
predicted%) in callchain entry.
For example:
$ perf report --branch-history --stdio
--1.01%--intel_idle mwait.h:29
intel_idle cpufeature.h:164 (cycles:5)
intel_idle cpufeature.h:164 (predicted:76.4%)
intel_idle mwait.h:102 (cycles:41)
intel_idle current.h:15
While 'perf top' doesn't support that.
For example:
$ perf top -a -b --call-graph branch
- 13.86% 0.23% [kernel] [k] __x86_indirect_thunk_rax
- 13.65% __x86_indirect_thunk_rax
+ 1.69% do_syscall_64
+ 1.68% do_select
+ 1.41% ktime_get
+ 0.70% __schedule
+ 0.62% do_sys_poll
0.58% __x86_indirect_thunk_rax
Actually it's very easy to enable this feature in 'perf top'.
With this patch, the result is:
$ perf top -a -b --call-graph branch
$ - 13.58% 0.00% [kernel] [k] __x86_indirect_thunk_rax
$ - 13.57% __x86_indirect_thunk_rax (predicted:93.9%)
$ + 1.78% do_select (cycles:2)
$ + 1.68% perf_pmu_disable.part.99 (cycles:1)
$ + 1.45% ___sys_recvmsg (cycles:25)
$ + 0.81% unix_stream_sendmsg (cycles:18)
$ + 0.80% ktime_get (cycles:400)
$ 0.58% pick_next_task_fair (cycles:47)
$ + 0.56% i915_request_retire (cycles:2)
$ + 0.52% do_sys_poll (cycles:4)
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1540983995-20462-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On s390 the CPU Measurement Facility for counters now supports
2 PMUs named cpum_cf (CPU Measurement Facility for counters) and
cpum_cf_diag (CPU Measurement Facility for diagnostic counters)
for one and the same CPU.
Running command
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
2 tx_c_tend
0.002120091 seconds time elapsed
0.000121000 seconds user
0.002127000 seconds sys
[root@s35lp76 perf]#
displays output which is unexpected (and wrong):
2 tx_c_tend
The test program definitely triggers only one transaction, as shown
in line 'TX_C_TEND: 1 expected:1'.
This is caused by the following call sequence:
pmu_lookup() scans and installs a PMU.
+--> pmu_aliases() parses all aliases in directory
.../<pmu-name>/events/* which are file names.
+--> pmu_aliases_parse() Read each file in directory and create
an new alias entry. This is done with
+--> perf_pmu__new_alias() and
+--> __perf_pmu__new_alias() which also check for
identical alias names.
After pmu_aliases() returns, a complete list of event names
for this pmu has been created. Now function
pmu_add_cpu_aliases() is called to add the events listed in the json
| files to the alias list of the cpu.
+--> perf_pmu__find_map() Returns a pointer to the json events.
Now function pmu_add_cpu_aliases() scans through all events listed
in the JSON files for this CPU.
Each json event pmu name is compared with the current PMU being
built up and if they mismatch, the json event is added to the
current PMUs alias list.
To avoid duplicate entries the following comparison is done:
if (!is_arm_pmu_core(name)) {
pname = pe->pmu ? pe->pmu : "cpu";
if (strncmp(pname, name, strlen(pname)))
continue;
}
The culprit is the strncmp() function.
Using current s390 PMU naming, the first PMU is 'cpum_cf'
and a long list of events is added, among them 'tx_c_tend'
When the second PMU named 'cpum_cf_diag' is added, only one event
named 'CF_DIAG' is added by the pmu_aliases() function.
Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'.
Since the CPUID string is the same for both PMUs, json file events
for PMU named 'cpum_cf' are added to the PMU 'cpm_cf_diag'
This happens because the strncmp() actually compares:
strncmp("cpum_cf", "cpum_cf_diag", 6);
The first parameter is the pmu name taken from the event in
the json file. The second parameter is the pmu name of the PMU
currently being built.
They are different, but the length of the compare only tests the
common prefix and this returns 0(true) when it should return false.
Now all events for PMU cpum_cf are added to the alias list for pmu
cpum_cf_diag.
Later on in function parse_events_add_pmu() the event 'tx_c_end' is
searched in all available PMUs and found twice, adding it two
times to the evsel_list global variable which is the root
of all events. This results in a counter value of 2 instead
of 1.
Output with this patch:
[root@s35lp76 perf]# ./perf stat -e tx_c_tend \
-- ~/mytests/cf-tx-events 1
Measuring transactions
TX_C_TABORT_NO_SPECIAL: 0 expected:0
TX_C_TABORT_SPECIAL: 0 expected:0
TX_C_TEND: 1 expected:1
TX_NC_TABORT: 11 expected:11
TX_NC_TEND: 1 expected:1
Performance counter stats for '/root/mytests/cf-tx-events 1':
1 tx_c_tend
0.001815365 seconds time elapsed
0.000123000 seconds user
0.001756000 seconds sys
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Sebastien Boisvert <sboisvert@gydle.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform")
Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|