linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-10-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03	Merge tag 'linux-kselftest-4.19-rc7' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Shuah writes: "kselftest fixes for 4.19-rc7 This fixes update for 4.19-rc7 consists one fix to rseq test to prevent it from seg-faulting when compiled with -fpie." * tag 'linux-kselftest-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: rseq/selftests: fix parametrized test with -fpie
2018-10-03	selftests/bpf: Add C tests for reference tracking	Joe Stringer
	Add some tests that demonstrate and test the balanced lookup/free nature of socket lookup. Section names that start with "fail" represent programs that are expected to fail verification; all others should succeed. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	libbpf: Support loading individual progs	Joe Stringer
	Allow the individual program load to be invoked. This will help with testing, where a single ELF may contain several sections, some of which denote subprograms that are expected to fail verification, along with some which are expected to pass verification. By allowing programs to be iterated and individually loaded, each program can be independently checked against its expected verification result. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	selftests/bpf: Add tests for reference tracking	Joe Stringer
	reference tracking: leak potential reference reference tracking: leak potential reference on stack reference tracking: leak potential reference on stack 2 reference tracking: zero potential reference reference tracking: copy and zero potential references reference tracking: release reference without check reference tracking: release reference reference tracking: release reference twice reference tracking: release reference twice inside branch reference tracking: alloc, check, free in one subbranch reference tracking: alloc, check, free in both subbranches reference tracking in call: free reference in subprog reference tracking in call: free reference in subprog and outside reference tracking in call: alloc & leak reference in subprog reference tracking in call: alloc in subprog, release outside reference tracking in call: sk_ptr leak into caller stack reference tracking in call: sk_ptr spill into caller stack reference tracking: allow LD_ABS reference tracking: forbid LD_ABS while holding reference reference tracking: allow LD_IND reference tracking: forbid LD_IND while holding reference reference tracking: check reference or tail call reference tracking: release reference then tail call reference tracking: leak possible reference over tail call reference tracking: leak checked reference over tail call reference tracking: mangle and release sock_or_null reference tracking: mangle and release sock reference tracking: access member reference tracking: write to member reference tracking: invalid 64-bit access of member reference tracking: access after release reference tracking: direct access for lookup unpriv: spill/fill of different pointers stx - ctx and sock unpriv: spill/fill of different pointers stx - leak sock unpriv: spill/fill of different pointers stx - sock and ctx (read) unpriv: spill/fill of different pointers stx - sock and ctx (write) Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	selftests/bpf: Generalize dummy program types	Joe Stringer
	Don't hardcode the dummy program types to SOCKET_FILTER type, as this prevents testing bpf_tail_call in conjunction with other program types. Instead, use the program type specified in the test case. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	bpf: Add helper to retrieve socket in BPF	Joe Stringer
	This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and bpf_sk_lookup_udp() which allows BPF programs to find out if there is a socket listening on this host, and returns a socket pointer which the BPF program can then access to determine, for instance, whether to forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the socket, so when a BPF program makes use of this function, it must subsequently pass the returned pointer into the newly added sk_release() to return the reference. By way of example, the following pseudocode would filter inbound connections at XDP if there is no corresponding service listening for the traffic: struct bpf_sock_tuple tuple; struct bpf_sock_ops *sk; populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0); if (!sk) { // Couldn't find a socket listening for this traffic. Drop. return TC_ACT_SHOT; } bpf_sk_release(sk, 0); return TC_ACT_OK; Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	bpf: Reuse canonical string formatter for ctx errs	Joe Stringer
	The array "reg_type_str" provides canonical formatting of register types, however a couple of places would previously check whether a register represented the context and write the name "context" directly. An upcoming commit will add another pointer type to these statements, so to provide more accurate error messages in the verifier, update these error messages to use "reg_type_str" instead. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-03	bpf: Simplify ptr_min_max_vals adjustment	Joe Stringer
	An upcoming commit will add another two pointer types that need very similar behaviour, so generalise this function now. Signed-off-by: Joe Stringer <joe@wand.net.nz> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-02	selftests/x86: Add clock_gettime() tests to test_vdso	Andy Lutomirski
	Now that the vDSO implementation of clock_gettime() is getting reworked, add a selftest for it. This tests that its output is consistent with the syscall version. This is marked for stable to serve as a test for commit 715bd9d12f84 ("x86/vdso: Fix asm constraints on vDSO syscall fallbacks") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/082399674de2619b2befd8c0dde49b260605b126.1538422295.git.luto@kernel.org
2018-10-01	selftests/tls: Fix recv(MSG_PEEK) & splice() test cases	Vakul Garg
	TLS test cases splice_from_pipe, send_and_splice & recv_peek_multiple_records expect to receive a given nummber of bytes and then compare them against the number of bytes which were sent. Therefore, system call recv() must not return before receiving the requested number of bytes, otherwise the subsequent memcmp() fails. This patch passes MSG_WAITALL flag to recv() so that it does not return prematurely before requested number of bytes are copied to receive buffer. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01	selftests/bpf: cgroup local storage-based network counters	Roman Gushchin
	This commit adds a bpf kselftest, which demonstrates how percpu and shared cgroup local storage can be used for efficient lookup-free network accounting. Cgroup local storage provides generic memory area with a very efficient lookup free access. To avoid expensive atomic operations for each packet, per-cpu cgroup local storage is used. Each packet is initially charged to a per-cpu counter, and only if the counter reaches certain value (32 in this case), the charge is moved into the global atomic counter. This allows to amortize atomic operations, keeping reasonable accuracy. The test also implements a naive network traffic throttling, mostly to demonstrate the possibility of bpf cgroup--based network bandwidth control. Expected output: ./test_netcnt test_netcnt:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01	selftests/bpf: extend the storage test to test per-cpu cgroup storage	Roman Gushchin
	This test extends the cgroup storage test to use per-cpu flavor of the cgroup storage as well. The test initializes a per-cpu cgroup storage to some non-zero initial value (1000), and then simple bumps a per-cpu counter each time the shared counter is atomically incremented. Then it reads all per-cpu areas from the userspace side, and checks that the sum of values adds to the expected sum. Expected output: $ ./test_cgroup_storage test_cgroup_storage:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01	selftests/bpf: add verifier per-cpu cgroup storage tests	Roman Gushchin
	This commits adds verifier tests covering per-cpu cgroup storage functionality. There are 6 new tests, which are exactly the same as for shared cgroup storage, but do use per-cpu cgroup storage map. Expected output: $ ./test_verifier #0/u add+sub+mul OK #0/p add+sub+mul OK ... #286/p invalid cgroup storage access 6 OK #287/p valid per-cpu cgroup storage access OK #288/p invalid per-cpu cgroup storage access 1 OK #289/p invalid per-cpu cgroup storage access 2 OK #290/p invalid per-cpu cgroup storage access 3 OK #291/p invalid per-cpu cgroup storage access 4 OK #292/p invalid per-cpu cgroup storage access 5 OK #293/p invalid per-cpu cgroup storage access 6 OK #294/p multiple registers share map_lookup_elem result OK ... #662/p mov64 src == dst OK #663/p mov64 src != dst OK Summary: 914 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01	bpftool: add support for PERCPU_CGROUP_STORAGE maps	Roman Gushchin
	This commit adds support for BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE map type. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01	bpf: sync include/uapi/linux/bpf.h to tools/include/uapi/linux/bpf.h	Roman Gushchin
	The sync is required due to the appearance of a new map type: BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, which implements per-cpu cgroup local storage. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-01	tools/kvm_stat: cut down decimal places in update interval dialog	Stefan Raspl
	We currently display the default number of decimal places for floats in _show_set_update_interval(), which is quite pointless. Cutting down to a single decimal place. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-30	tools: hv: fcopy: set 'error' in case an unknown operation was requested	Vitaly Kuznetsov
	'error' variable is left uninitialized in case we see an unknown operation. As we don't immediately return and proceed to pwrite() we need to set it to something, HV_E_FAIL sounds good enough. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-28	Merge tag 'powerpc-4.19-3' of ↵	Greg Kroah-Hartman
	https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Michael writes: "powerpc fixes for 4.19 #3 A reasonably big batch of fixes due to me being away for a few weeks. A fix for the TM emulation support on Power9, which could result in corrupting the guest r11 when running under KVM. Two fixes to the TM code which could lead to userspace GPR corruption if we take an SLB miss at exactly the wrong time. Our dynamic patching code had a bug that meant we could patch freed __init text, which could lead to corrupting userspace memory. csum_ipv6_magic() didn't work on little endian platforms since we optimised it recently. A fix for an endian bug when reading a device tree property telling us how many storage keys the machine has available. Fix a crash seen on some configurations of PowerVM when migrating the partition from one machine to another. A fix for a regression in the setup of our CPU to NUMA node mapping in KVM guests. A fix to our selftest Makefiles to make them work since a recent change to the shared Makefile logic." * tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix Makefiles for headers_install change powerpc/numa: Use associativity if VPHN hcall is successful powerpc/tm: Avoid possible userspace r1 corruption on reclaim powerpc/tm: Fix userspace r13 corruption powerpc/pseries: Fix unitialized timer reset on migration powerpc/pkeys: Fix reading of ibm, processor-storage-keys property powerpc: fix csum_ipv6_magic() on little endian platforms powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again) powerpc: Avoid code patching freed init sections KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds
2018-09-28	selftests: forwarding: test for bridge sticky flag	Nikolay Aleksandrov
	This test adds an fdb entry with the sticky flag and sends traffic from a different port with the same mac as a source address expecting the entry to not change ports if the flag is operating correctly. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28	selftests/powerpc: Fix Makefiles for headers_install change	Michael Ellerman
	Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk") introduced a requirement that Makefiles more than one level below the selftests directory need to define top_srcdir, but it didn't update any of the powerpc Makefiles. This broke building all the powerpc selftests with eg: make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc' BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment' ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory make[2]: *** No rule to make target '../../../../scripts/subarch.include'. make[2]: Failed to remake makefile '../../../../scripts/subarch.include'. Makefile:38: recipe for target 'alignment' failed Fix it by setting top_srcdir in the affected Makefiles. Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-27	selftests/bpf: Test libbpf_{prog,attach}_type_by_name	Andrey Ignatov
	Add selftest for libbpf functions libbpf_prog_type_by_name and libbpf_attach_type_by_name. Example of output: % ./tools/testing/selftests/bpf/test_section_names Summary: 35 PASSED, 0 FAILED Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	selftests/bpf: Use libbpf_attach_type_by_name in test_socket_cookie	Andrey Ignatov
	Use newly introduced libbpf_attach_type_by_name in test_socket_cookie selftest. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Support sk_skb/stream_{parser, verdict} section names	Andrey Ignatov
	Add section names for BPF_SK_SKB_STREAM_PARSER and BPF_SK_SKB_STREAM_VERDICT attach types to be able to identify them in libbpf_attach_type_by_name. "stream_parser" and "stream_verdict" are used instead of simple "parser" and "verdict" just to avoid possible confusion in a place where attach type is used alone (e.g. in bpftool's show sub-commands) since there is another attach point that can be named as "verdict": BPF_SK_MSG_VERDICT. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Support cgroup_skb/{e,in}gress section names	Andrey Ignatov
	Add section names for BPF_CGROUP_INET_INGRESS and BPF_CGROUP_INET_EGRESS attach types to be able to identify them in libbpf_attach_type_by_name. "cgroup_skb" is used instead of "cgroup/skb" mostly to easy possible unifying of how libbpf and bpftool works with section names: * bpftool uses "cgroup_skb" to in "prog list" sub-command; * bpftool uses "ingress" and "egress" in "cgroup list" sub-command; * having two parts instead of three in a string like "cgroup_skb/ingress" can be leveraged to split it to prog_type part and attach_type part, or vise versa: use two parts to make a section name. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	libbpf: Introduce libbpf_attach_type_by_name	Andrey Ignatov
	There is a common use-case when ELF object contains multiple BPF programs and every program has its own section name. If it's cgroup-bpf then programs have to be 1) loaded and 2) attached to a cgroup. It's convenient to have information necessary to load BPF program together with program itself. This is where section name works fine in conjunction with libbpf_prog_type_by_name that identifies prog_type and expected_attach_type and these can be used with BPF_PROG_LOAD. But there is currently no way to identify attach_type by section name and it leads to messy code in user space that reinvents guessing logic every time it has to identify attach type to use with BPF_PROG_ATTACH. The patch introduces libbpf_attach_type_by_name that guesses attach type by section name if a program can be attached. The difference between expected_attach_type provided by libbpf_prog_type_by_name and attach_type provided by libbpf_attach_type_by_name is the former is used at BPF_PROG_LOAD time and can be zero if a program of prog_type X has only one corresponding attach type Y whether the latter provides specific attach type to use with BPF_PROG_ATTACH. No new section names were added to section_names array. Only existing ones were reorganized and attach_type was added where appropriate. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	bpftool: Fix bpftool net output	Andrey Ignatov
	Print `bpftool net` output to stdout instead of stderr. Only errors should be printed to stderr. Regular output should go to stdout and this is what all other subcommands of bpftool do, including --json and --pretty formats of `bpftool net` itself. Fixes: commit f6f3bac08ff9 ("tools/bpf: bpftool: add net support") Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-27	perf report: Don't try to map ip to invalid map	Milian Wolff
	Fixes a crash when the report encounters an address that could not be associated with an mmaped region: #0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 #1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 #2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 #3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 #4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 #5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 #6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 #7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085 Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-27	rseq/selftests: fix parametrized test with -fpie	Mathieu Desnoyers
	On x86-64, the parametrized selftest code for rseq crashes with a segmentation fault when compiled with -fpie. This happens when the param_test binary is loaded at an address beyond 32-bit on x86-64. The issue is caused by use of a 32-bit register to hold the address of the loop counter variable. Fix this by using a 64-bit register to calculate the address of the loop counter variables as an offset from rip. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> # v4.18 Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: linux-kselftest@vger.kernel.org Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-09-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-09-25 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow for RX stack hardening by implementing the kernel's flow dissector in BPF. Idea was originally presented at netconf 2017 [0]. Quote from merge commit: [...] Because of the rigorous checks of the BPF verifier, this provides significant security guarantees. In particular, the BPF flow dissector cannot get inside of an infinite loop, as with CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot read outside of packet bounds, because all memory accesses are checked. Also, with BPF the administrator can decide which protocols to support, reducing potential attack surface. Rarely encountered protocols can be excluded from dissection and the program can be updated without kernel recompile or reboot if a bug is discovered. [...] Also, a sample flow dissector has been implemented in BPF as part of this work, from Petar and Willem. [0] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf 2) Add support for bpftool to list currently active attachment points of BPF networking programs providing a quick overview similar to bpftool's perf subcommand, from Yonghong. 3) Fix a verifier pruning instability bug where a union member from the register state was not cleared properly leading to branches not being pruned despite them being valid candidates, from Alexei. 4) Various smaller fast-path optimizations in XDP's map redirect code, from Jesper. 5) Enable to recognize BPF_MAP_TYPE_REUSEPORT_SOCKARRAY maps in bpftool, from Roman. 6) Remove a duplicate check in libbpf that probes for function storage, from Taeung. 7) Fix an issue in test_progs by avoid checking for errno since on success its value should not be checked, from Mauricio. 8) Fix unused variable warning in bpf_getsockopt() helper when CONFIG_INET is not configured, from Anders. 9) Fix a compilation failure in the BPF sample code's use of bpf_flow_keys, from Prashant. 10) Minor cleanups in BPF code, from Yue and Zhong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Version bump conflict in batman-adv, take what's in net-next. iavf conflict, adjustment of netdev_ops in net-next conflicting with poll controller method removal in net. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	bpftool: add support for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY maps	Roman Gushchin
	Add BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map type to the list of maps types which bpftool recognizes. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-25	perf script python: Fix export-to-sqlite.py sample columns	Adrian Hunter
	With the "branches" export option, not all sample columns are exported. However the unwanted columns are not at the end of the tuple, as assumed by the code. Fix by taking the first 15 and last 3 values, instead of the first 18. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-25	perf script python: Fix export-to-postgresql.py occasional failure	Adrian Hunter
	Occasional export failures were found to be caused by truncating 64-bit pointers to 32-bits. Fix by explicitly setting types for all ctype arguments and results. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-25	Merge gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net	Greg Kroah-Hartman
	Dave writes: "Networking fixes: 1) Fix multiqueue handling of coalesce timer in stmmac, from Jose Abreu. 2) Fix memory corruption in NFC, from Suren Baghdasaryan. 3) Don't write reserved bits in ravb driver, from Kazuya Mizuguchi. 4) SMC bug fixes from Karsten Graul, YueHaibing, and Ursula Braun. 5) Fix TX done race in mvpp2, from Antoine Tenart. 6) ipv6 metrics leak, from Wei Wang. 7) Adjust firmware version requirements in mlxsw, from Petr Machata. 8) Fix autonegotiation on resume in r8169, from Heiner Kallweit. 9) Fixed missing entries when dumping /proc/net/if_inet6, from Jeff Barnhill. 10) Fix double free in devlink, from Dan Carpenter. 11) Fix ethtool regression from UFO feature removal, from Maciej Żenczykowski. 12) Fix drivers that have a ndo_poll_controller() that captures the cpu entirely on loaded hosts by trying to drain all rx and tx queues, from Eric Dumazet. 13) Fix memory corruption with jumbo frames in aquantia driver, from Friedemann Gerold." * gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (79 commits) net: mvneta: fix the remaining Rx descriptor unmapping issues ip_tunnel: be careful when accessing the inner header mpls: allow routes on ip6gre devices net: aquantia: memory corruption on jumbo frames tun: remove ndo_poll_controller nfp: remove ndo_poll_controller bnxt: remove ndo_poll_controller bnx2x: remove ndo_poll_controller mlx5: remove ndo_poll_controller mlx4: remove ndo_poll_controller i40evf: remove ndo_poll_controller ice: remove ndo_poll_controller igb: remove ndo_poll_controller ixgb: remove ndo_poll_controller fm10k: remove ndo_poll_controller ixgbevf: remove ndo_poll_controller ixgbe: remove ndo_poll_controller bonding: use netpoll_poll_dev() helper netpoll: make ndo_poll_controller() optional rds: Fix build regression. ...
2018-09-23	Merge branch 'perf-urgent-for-linus' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Thomas writes: "- Provide a strerror_r wrapper so lib/bpf can be built on systems without _GNU_SOURCE - Unbreak the man page generator when building out of tree" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf Documentation: Fix out-of-tree asciidoctor man page generation tools lib bpf: Provide wrapper for strerror_r to build in !_GNU_SOURCE systems
2018-09-21	selftests/net: add ipv6 tests to ip_defrag selftest	Peter Oskolkov
	This patch adds ipv6 defragmentation tests to ip_defrag selftest, to complement existing ipv4 tests. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	bpf: test_maps, only support ESTABLISHED socks	John Fastabend
	Ensure that sockets added to a sock{map\|hash} that is not in the ESTABLISHED state is rejected. Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-21	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Greg Kroah-Hartman
	Paolo writes: "It's mostly small bugfixes and cleanups, mostly around x86 nested virtualization. One important change, not related to nested virtualization, is that the ability for the guest kernel to trap CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now masked by default. This is because the feature is detected through an MSR; a very bad idea that Intel seems to like more and more. Some applications choke if the other fields of that MSR are not initialized as on real hardware, hence we have to disable the whole MSR by default, as was the case before Linux 4.12." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits) KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs kvm: selftests: Add platform_info_test KVM: x86: Control guest reads of MSR_PLATFORM_INFO KVM: x86: Turbo bits in MSR_PLATFORM_INFO nVMX x86: Check VPID value on vmentry of L2 guests nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv KVM: VMX: check nested state and CR4.VMXE against SMM kvm: x86: make kvm_{load\|put}_guest_fpu() static x86/hyper-v: rename ipi_arg_{ex,non_ex} structures KVM: VMX: use preemption timer to force immediate VMExit KVM: VMX: modify preemption timer bit only when arming timer KVM: VMX: immediately mark preemption timer expired only for zero value KVM: SVM: Switch to bitmap_zalloc() KVM/MMU: Fix comment in walk_shadow_page_lockless_end() kvm: selftests: use -pthread instead of -lpthread KVM: x86: don't reset root in kvm_mmu_setup() kvm: mmu: Don't read PDPTEs when paging is not enabled x86/kvm/lapic: always disable MMIO interface in x2APIC mode KVM: s390: Make huge pages unavailable in ucontrol VMs ...
2018-09-20	selftests: mlxsw: Add a test for UC behavior under MC flood	Petr Machata
	A so-called "MC-aware" mode has recently been enabled in mlxsw. In MC-aware mode, BUM traffic is handled in a special way so that when a switch is flooded with BUM, UC performance isn't unduly impacted. Without enablement of this mode, a stream of BUM traffic can cause sustained UC throughput drop in excess of 99 %. Add a test for this behavior. Compare how much UC throughput degrades as a stream of broadcast frames floods the switch. A minimal degradation is tolerated to cover for glitches in traffic injection performance. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20	selftests: forwarding: lib: Add mtu_set(), mtu_restore()	Petr Machata
	Some selftests need to tweak MTU of an interface, and naturally should at teardown restore the MTU back to the original value. Add two functions to facilitate this MTU handling: mtu_set() to change MTU value, and mtu_reset() to change it back to what it was before. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20	selftests: forwarding: lib: Add ethtool_stats_get()	Petr Machata
	Add a new service function to obtain ethtool counters. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20	kvm: selftests: Add platform_info_test	Drew Schmitt
	Test guest access to MSR_PLATFORM_INFO when the capability is enabled or disabled. Signed-off-by: Drew Schmitt <dasch@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-20	kvm: selftests: use -pthread instead of -lpthread	Lei Yang
	I run into the following error testing/selftests/kvm/dirty_log_test.c:285: undefined reference to `pthread_create' testing/selftests/kvm/dirty_log_test.c:297: undefined reference to `pthread_join' collect2: error: ld returned 1 exit status my gcc version is gcc version 4.8.4 "-pthread" would work everywhere Signed-off-by: Lei Yang <Lei.Yang@windriver.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-19	tools: bpf: fix license for a compat header file	Jakub Kicinski
	libc_compat.h is used by libbpf so make sure it's licensed under LGPL or BSD license. The license change should be OK, I'm the only author of the file. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-19	Merge tag 'perf-urgent-for-mingo-4.19-20180918' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix the build on !_GNU_SOURCE libc systems such as Alpine Linux/musl libc due to usage of strerror_r glibc variant on libbpf (Arnaldo Carvalho de Melo) - Fix out-of-tree asciidoctor man page generation (Ben Hutchings) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-09-18	selftests: pmtu: properly redirect stderr to /dev/null	Sabrina Dubroca
	The cleanup function uses "$CMD 2 > /dev/null", which doesn't actually send stderr to /dev/null, so when the netns doesn't exist, the error message is shown. Use "2> /dev/null" instead, so that those messages disappear, as was intended. Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Two new tls tests added in parallel in both net and net-next. Used Stephen Rothwell's linux-next resolution. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18	tools/bpf: bpftool: improve output format for bpftool net	Yonghong Song
	This is a followup patch for Commit f6f3bac08ff9 ("tools/bpf: bpftool: add net support"). Some improvements are made for the bpftool net output. Specially, plain output is more concise such that per attachment should nicely fit in one line. Compared to previous output, the prog tag is removed since it can be easily obtained with program id. Similar to xdp attachments, the device name is added to tc attachments. The bpf program attached through shared block mechanism is supported as well. $ ip link add dev v1 type veth peer name v2 $ tc qdisc add dev v1 ingress_block 10 egress_block 20 clsact $ tc qdisc add dev v2 ingress_block 10 egress_block 20 clsact $ tc filter add block 10 protocol ip prio 25 bpf obj bpf_shared.o sec ingress flowid 1:1 $ tc filter add block 20 protocol ip prio 30 bpf obj bpf_cyclic.o sec classifier flowid 1:1 $ bpftool net xdp: tc: v2(7) clsact/ingress bpf_shared.o:[ingress] id 23 v2(7) clsact/egress bpf_cyclic.o:[classifier] id 24 v1(8) clsact/ingress bpf_shared.o:[ingress] id 23 v1(8) clsact/egress bpf_cyclic.o:[classifier] id 24 The documentation and "bpftool net help" are updated to make it clear that current implementation only supports xdp and tc attachments. For programs attached to cgroups, "bpftool cgroup" can be used to dump attachments. For other programs e.g. sk_{filter,skb,msg,reuseport} and lwt/seg6, iproute2 tools should be used. The new output: $ bpftool net xdp: eth0(2) driver id 198 tc: eth0(2) clsact/ingress fbflow_icmp id 335 act [{icmp_action id 336}] eth0(2) clsact/egress fbflow_egress id 334 $ bpftool -jp net [{ "xdp": [{ "devname": "eth0", "ifindex": 2, "mode": "driver", "id": 198 } ], "tc": [{ "devname": "eth0", "ifindex": 2, "kind": "clsact/ingress", "name": "fbflow_icmp", "id": 335, "act": [{ "name": "icmp_action", "id": 336 } ] },{ "devname": "eth0", "ifindex": 2, "kind": "clsact/egress", "name": "fbflow_egress", "id": 334 } ] } ] Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-18	perf Documentation: Fix out-of-tree asciidoctor man page generation	Ben Hutchings
	The dependency for the man page rule using asciidoctor incorrectly specifies a source file in $(OUTPUT). When building out-of-tree, the source file is not found, resulting in a fall-back to the following rule which uses xmlto. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180916151704.GF4765@decadent.org.uk Fixes: ffef80ecf89f ("perf Documentation: Support for asciidoctor") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>