Age  Commit message  Author
2018-03-26kbuild: process mixture of clean/build targets one by oneMasahiro Yamada
Support parallel building of clean, config, and build targets in a single command. For example: make -j<N> clean all, or make -j<N> mrproper defconfig all. They should be handled one by one. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26kbuild: rename built-in.o to built-in.aNicholas Piggin
Incremental linking is gone, so rename built-in.o to built-in.a, which is the usual extension for archive files. This patch does two things, first is a simple search/replace: git grep -l 'built-in\.o' | xargs sed -i 's/built-in\.o/built-in\.a/g' The second is to invert nesting of nested text manipulations to avoid filtering built-in.a out from libs-y2: -libs-y2 := $(filter-out %.a, $(patsubst %/, %/built-in.a, $(libs-y))) +libs-y2 := $(patsubst %/, %/built-in.a, $(filter-out %.a, $(libs-y))) Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
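The effect of swapping the nesting order can be modeled outside of Make. The following is a minimal sketch, assuming simplified stand-ins for `$(patsubst %/, %/built-in.a, ...)` and `$(filter-out %.a, ...)`; the helper names and the sample `libs_y` list are hypothetical and only illustrate why the old order would discard every built-in.a entry after the rename.

```python
def patsubst_dir(words):
    # Simplified model of $(patsubst %/, %/built-in.a, ...):
    # words ending in "/" get "built-in.a" appended, others pass through.
    return [w + "built-in.a" if w.endswith("/") else w for w in words]


def filter_out_archives(words):
    # Simplified model of $(filter-out %.a, ...): drop words ending in ".a".
    return [w for w in words if not w.endswith(".a")]


# Hypothetical contents of libs-y: two directories and one real archive.
libs_y = ["lib/", "arch/x86/lib/", "lib/lib.a"]

# Old nesting: substitute first, then filter. Once the substitution produces
# names ending in ".a", the filter removes them as well.
old_order = filter_out_archives(patsubst_dir(libs_y))

# New nesting: filter the real archives first, then substitute.
new_order = patsubst_dir(filter_out_archives(libs_y))

print(old_order)
print(new_order)
```

With the old order the result is empty, while the new order keeps one built-in.a path per directory, which matches the intent described in the commit message.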
2018-03-26kbuild: remove incremental linking optionNicholas Piggin
This removes the old `ld -r` incremental link option, which has not been selected by any architecture since June 2017. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26kbuild: Improve portability of some sed invocationsMichael Forney
* Use BREs where EREs aren't necessary. * Pass -E instead of -r to use EREs. This will be standardized in the next POSIX revision[0]. GNU sed supports this since 4.2 (May 2009), and busybox since 1.22.0 (Jan 2014). * Use the [:space:] character class instead of ` \t` in bracket expressions. In bracket expressions, POSIX says that <backslash> loses its special meaning, so a conforming implementation cannot expand \t to <tab>[1]. * In BREs, use interval expressions (\{n,m\}) instead of non-standard features like \+ and \?. * Use a loop instead of -s flag. There are still plenty of other cases of non-standard sed invocations (use of ERE features in BREs, in-place editing), but this fixes some core ones. [0] http://austingroupbugs.net/view.php?id=528 [1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05 Signed-off-by: Michael Forney <forney@google.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
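The replacement of \+ and \? by interval expressions relies on the two forms matching the same strings. The sketch below checks that equivalence using Python's `re` module, which is an assumption made for illustration only: Python's syntax is closer to EREs (unescaped braces) and is not a POSIX sed implementation, so it demonstrates the quantifier semantics rather than sed itself.

```python
import re

samples = ["ac", "abc", "abbbc"]


def matches(pattern):
    # Return, for each sample, whether the whole string matches the pattern.
    return [bool(re.fullmatch(pattern, s)) for s in samples]


# "one or more" written with + and with the interval {1,}
plus_form = matches(r"ab+c")
plus_interval = matches(r"ab{1,}c")

# "zero or one" written with ? and with the interval {0,1}
optional_form = matches(r"ab?c")
optional_interval = matches(r"ab{0,1}c")

print(plus_form, plus_interval)
print(optional_form, optional_interval)
```

In a POSIX BRE the same intervals are spelled \{1,\} and \{0,1\}, which is the form the commit message recommends.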
2018-03-26kbuild: add clang-version.shSami Tolvanen
Based on gcc-version.sh, clang-version.sh prints out the correct version of clang. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-25syscalls: define and explain goal to not call syscalls in the kernelDominik Brodowski
The syscall entry points to the kernel defined by SYSCALL_DEFINEx() and COMPAT_SYSCALL_DEFINEx() should only be called from userspace through kernel entry points, but not from the kernel itself. This will allow cleanups and optimizations to the entry paths *and* to the parts of the kernel code which currently need to pretend to be userspace in order to make use of syscalls. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-03-25x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGSSven Wegener
The kernel build system already takes care of generating the dependency files. Having the additional -MD in KBUILD_CFLAGS leads to stray .<pid>.d files in the build directory when we call the cc-option macro. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/alpine.LNX.2.21.1803242219380.30139@titan.int.lan.stealer.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-25Merge tag 'perf-core-for-mingo-4.17-20180323' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/coreIngo Molnar
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Move non-TUI specific annotation routines out of the TUI browser so that they can be used in other UIs and, to demonstrate that, introduce a 'perf annotate --stdio2' option that will apply those formatting routines to provide a non-interactive annotation mode (Arnaldo Carvalho de Melo) - Add 'P' hotkey to the annotation TUI, to dump the current annotated symbol to a file, easing reporting through e-mail, by getting rid of the spaces + right hand side scrollbar chars (Arnaldo Carvalho de Melo) - Support --ignore-vmlinux to 'perf report' and 'perf annotate', that was already present in 'perf top', to use /proc/{kcore,kallsyms}, making it possible to see what is in fact running (patched stuff, alternatives, ftrace, etc), not the initial state of the kernel (vmlinux) (Arnaldo Carvalho de Melo) - Support 'jump' instructions to a different function, treating them as 'call' instructions (Arnaldo Carvalho de Melo) - Fix some jump artifacts when using vmlinux + ASM functions, where the ELF symtab for instance, for entry_SYSCALL_64 includes that and what comes after the 'syscall_return_via_sysret' label, but the objdump -dS prints the jump targets + offsets using the syscall_return_via_sysret address, which was confusing 'perf annotate'. See the cset comments for further info (Arnaldo Carvalho de Melo) - Report error from dwfl_attach_state() in the unwind code (Martin Vuille) - Reference Py_None before returning it in the python extension (Petr Machata) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-24Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds
Pull mqueuefs revert from Eric Biederman: "This fixes a regression that came in the merge window for v4.16. The problem is that the permissions for mounting and using the mqueuefs filesystem are broken. The necessary permission check is missing letting people who should not be able to mount mqueuefs mount mqueuefs. The field sb->s_user_ns is set incorrectly not allowing the mounter of mqueuefs to remount and otherwise have proper control over the filesystem. Al Viro and I see the path to the necessary fixes differently and I am not even certain at this point he actually sees all of the necessary fixes. Given a couple weeks we can probably work something out but I don't see the review being resolved in time for the final v4.16. I don't want v4.16 shipping with a nasty regression. So unfortunately I am sending a revert" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: Revert "mqueue: switch to on-demand creation of internal mount"
2018-03-24Revert "mqueue: switch to on-demand creation of internal mount"Eric W. Biederman
This reverts commit 36735a6a2b5e042db1af956ce4bcc13f3ff99e21. Aleksa Sarai <asarai@suse.de> writes: > [REGRESSION v4.16-rc6] [PATCH] mqueue: forbid unprivileged user access to internal mount > > Felix reported weird behaviour on 4.16.0-rc6 with regards to mqueue[1], > which was introduced by 36735a6a2b5e ("mqueue: switch to on-demand > creation of internal mount"). > > Basically, the reproducer boils down to being able to mount mqueue if > you create a new user namespace, even if you don't unshare the IPC > namespace. > > Previously this was not possible, and you would get an -EPERM. The mount > is the *host* mqueue mount, which is being cached and just returned from > mqueue_mount(). To be honest, I'm not sure if this is safe or not (or if > it was intentional -- since I'm not familiar with mqueue). > > To me it looks like there is a missing permission check. I've included a > patch below that I've compile-tested, and should block the above case. > Can someone please tell me if I'm missing something? Is this actually > safe? > > [1]: https://github.com/docker/docker/issues/36674 The issue is a lot deeper than a missing permission check. sb->s_user_ns was improperly set as well. So in addition to the filesystem being mounted when it should not be mounted, some things are not allowed that should be. We are practically at the release of 4.16 and there is no agreement between Al Viro and myself on what the code should look like to fix things properly. So revert the code to what it was before so that we can take our time and discuss this properly. Fixes: 36735a6a2b5e ("mqueue: switch to on-demand creation of internal mount") Reported-by: Felix Abecassis <fabecassis@nvidia.com> Reported-by: Aleksa Sarai <asarai@suse.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-03-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Don't pick fixed hash implementation for NFT_SET_EVAL sets, otherwise userspace hits EOPNOTSUPP with valid rules using the meter statement, from Florian Westphal. 2) If you send a batch that flushes the existing ruleset (that contains a NAT chain) and the new ruleset definition comes with a new NAT chain, don't bogusly hit EBUSY. Also from Florian. 3) Missing netlink policy attribute validation, from Florian. 4) Detach conntrack template from skbuff if IP_NODEFRAG is set on, from Paolo Abeni. 5) Cache device names in flowtable object, otherwise we may end up walking over devices going away given no rtnl_lock is held. 6) Fix incorrect net_device ingress with ingress hooks. 7) Fix crash when trying to read more data than available in UDP packets from the nf_socket infrastructure, from Subash. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-24netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}Subash Abhinov Kasiviswanathan
skb_header_pointer will copy data into a buffer if data is non linear, otherwise it will return a pointer in the linear section of the data. nf_sk_lookup_slow_v{4,6} always copies data of size udphdr but later accesses memory within the size of tcphdr (th->doff) in case of TCP packets. This causes a crash when running with KASAN with the following call stack - BUG: KASAN: stack-out-of-bounds in xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178 Read of size 2 at addr ffffffe3d417a87c by task syz-executor/28971 CPU: 2 PID: 28971 Comm: syz-executor Tainted: G B W O 4.9.65+ #1 Call trace: [<ffffff9467e8d390>] dump_backtrace+0x0/0x428 arch/arm64/kernel/traps.c:76 [<ffffff9467e8d7e0>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226 [<ffffff946842d9b8>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffff946842d9b8>] dump_stack+0xd4/0x124 lib/dump_stack.c:51 [<ffffff946811d4b0>] print_address_description+0x68/0x258 mm/kasan/report.c:248 [<ffffff946811d8c8>] kasan_report_error mm/kasan/report.c:347 [inline] [<ffffff946811d8c8>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371 [<ffffff946811df44>] kasan_report+0x5c/0x70 mm/kasan/report.c:372 [<ffffff946811bebc>] check_memory_region_inline mm/kasan/kasan.c:308 [inline] [<ffffff946811bebc>] __asan_load2+0x84/0x98 mm/kasan/kasan.c:739 [<ffffff94694d6f04>] __tcp_hdrlen include/linux/tcp.h:35 [inline] [<ffffff94694d6f04>] xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178 Fix this by copying data into appropriate size headers based on protocol. Fixes: a583636a83ea ("inet: refactor inet[6]_lookup functions to take skb") Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-24nfp: bpf: fix check of program max insn countJakub Kicinski
NFP program allocation length is in bytes and NFP program length is in instructions, fix the comparison of the two. Fixes: 9314c442d7dd ("nfp: bpf: move translation prepare to offload.c") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
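The unit mismatch can be illustrated with a small sketch. It assumes a fixed instruction size of 8 bytes and uses hypothetical function names; neither is taken from the driver source, so treat the numbers as illustrative only.

```python
# Assumption for illustration: each instruction occupies 8 bytes.
INSN_SIZE_BYTES = 8


def fits_buggy(prog_len_insns, alloc_len_bytes):
    # Compares an instruction count directly against a byte count,
    # which accepts programs that are far too long.
    return prog_len_insns <= alloc_len_bytes


def fits_fixed(prog_len_insns, alloc_len_bytes):
    # Convert the allocation to instructions before comparing.
    return prog_len_insns <= alloc_len_bytes // INSN_SIZE_BYTES


alloc_bytes = 1024       # room for 128 instructions under the assumption above
program_insns = 200      # hypothetical program that does not fit

print(fits_buggy(program_insns, alloc_bytes))
print(fits_fixed(program_insns, alloc_bytes))
```

The buggy comparison accepts the 200-instruction program even though only 128 instructions fit in the allocation.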
2018-03-24tools: bpftool: don't use hex numbers in JSON outputJakub Kicinski
JSON does not accept hex numbers with 0x prefix. Simply print as decimal numbers, JSON should be primarily machine-readable. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Fixes: 831a0aafe5c3 ("tools: bpftool: add JSON output for `bpftool map *` commands") Signed-off-by: Alexei Starovoitov <ast@kernel.org>
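The restriction is easy to confirm with any strict JSON parser. The sketch below uses Python's standard `json` module as a stand-in consumer of bpftool output; the key name `value` is made up for the example.

```python
import json


def parses(text):
    # Return True if the text is accepted by a strict JSON parser.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


hex_doc = '{"value": 0x1f}'                # hex literal, not valid JSON
dec_doc = json.dumps({"value": 0x1f})      # serialized as a decimal number

print(parses(hex_doc))
print(dec_doc, parses(dec_doc))
```

Emitting decimal numbers keeps the output consumable by standard parsers without any preprocessing.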
2018-03-24Merge tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrlLinus Torvalds
Pull pin control fixes from Linus Walleij: "Two fixes for pin control for v4.16: - Renesas SH-PFC: remove a duplicate clkout pin which was causing crashes - fix Samsung out of bounds exceptions" * tag 'pinctrl-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: samsung: Validate alias coming from DT pinctrl: sh-pfc: r8a7795: remove duplicate of CLKOUT pin in pinmux_pins[]
2018-03-24ipc/util: Helpers for making the sysvipc operations pid namespace awareEric W. Biederman
Capture the pid namespace when /proc/sysvipc/msg /proc/sysvipc/shm and /proc/sysvipc/sem are opened, and make it available through the new helper ipc_seq_pid_ns. This makes it possible to report the pids in these files in the pid namespace of the opener of the files. Implement ipc_update_pid. A simple inline helper that will only update a struct pid pointer if the new value does not equal the old value. This removes the need for wordy code sequences like: old = object->pid; object->pid = new; put_pid(old); and old = object->pid; if (old != new) { object->pid = new; put_pid(old); } Allowing the following to be written instead: ipc_update_pid(&object->pid, new); Which is easier to read and ensures that the pid reference count is not touched when the old and the new values are the same. Not touching the reference count in this case is important to help avoid issues like af_unix experienced, where multiple threads of the same process managed to bounce the struct pid between cpu cache lines, by updating the pid's reference count. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
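The update-only-on-change behaviour described above can be modeled with explicit reference counts. This is a minimal sketch, not the kernel code: the `Pid` class, the dictionary used as the pointer slot, and the assumption that the helper takes a reference on the new pid before dropping the old one are all illustrative.

```python
class Pid:
    """Toy stand-in for struct pid with an explicit reference count."""

    def __init__(self, nr):
        self.nr = nr
        self.refcount = 1


def get_pid(pid):
    # Take a reference; None models a NULL pointer.
    if pid is not None:
        pid.refcount += 1
    return pid


def put_pid(pid):
    # Drop a reference; None models a NULL pointer.
    if pid is not None:
        pid.refcount -= 1


def ipc_update_pid(obj, new):
    # Only touch reference counts when the stored pid actually changes.
    old = obj["pid"]
    if old is not new:
        obj["pid"] = get_pid(new)
        put_pid(old)


a, b = Pid(100), Pid(200)
obj = {"pid": get_pid(a)}        # the object holds one reference to a

ipc_update_pid(obj, a)           # same pid: refcounts stay untouched
same_pid_refcount = a.refcount

ipc_update_pid(obj, b)           # different pid: move the reference
print(same_pid_refcount, a.refcount, b.refcount)
```

Skipping the update when the pid is unchanged avoids writing to the shared reference count at all, which is the cache-line bouncing concern the commit message mentions.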
2018-03-24ipc: Move IPCMNI from include/ipc.h into ipc/util.hEric W. Biederman
The definition IPCMNI is only used in ipc/util.h and ipc/util.c. So there is no reason to keep it in a header file that the whole kernel can see. Move it into util.h to simplify future maintenance. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-03-24msg: Move struct msg_queue into ipc/msg.cEric W. Biederman
All of the users are now in ipc/msg.c so make the definition local to that file to make code maintenance easier. AKA to prevent rebuilding the entire kernel when struct msg_queue changes. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-03-24shm: Move struct shmid_kernel into ipc/shm.cEric W. Biederman
All of the users are now in ipc/shm.c so make the definition local to that file to make code maintenance easier. AKA to prevent rebuilding the entire kernel when struct shmid_kernel changes. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-03-24ARM: 8750/1: deflate_xip_data.sh: minor fixesNicolas Pitre
Send nm complaints about broken pipe (when sed exits early) to /dev/null. All errors should be printed to stderr. Don't trap on normal exit so the trap can return an error code. Signed-off-by: Nicolas Pitre <nico@linaro.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24ARM: 8748/1: mm: Define vdso_start, vdso_end as arrayJinbum Park
Define vdso_start and vdso_end as arrays to avoid a compile-time analysis error when building with CONFIG_FORTIFY_SOURCE. Also, since vdso_start and vdso_end are used only in vdso.c, move the extern declarations from vdso.h to vdso.c. If the kernel is built with CONFIG_FORTIFY_SOURCE, a compile-time error occurs at this code: - if (memcmp(&vdso_start, "\177ELF", 4)) The size of "&vdso_start" is recognized as 1 byte, but n is 4, so a compile-time error is reported. Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jinbum Park <jinb.park7@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
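The 4-byte comparison in that check is against the ELF magic number, whose first byte is the octal escape \177 (0x7f). The sketch below shows the equivalent test on a byte buffer; the function name and the sample buffers are hypothetical.

```python
# The ELF magic: octal escape \177 (0x7f) followed by the letters "ELF".
ELF_MAGIC = b"\177ELF"


def has_elf_magic(image):
    # Compare the first four bytes, mirroring memcmp(ptr, "\177ELF", 4) == 0.
    return image[:4] == ELF_MAGIC


good_image = b"\x7fELF\x01\x01\x01\x00"    # hypothetical start of an ELF image
bad_image = b"177ELF"                      # literal digits, not the magic

print(len(ELF_MAGIC), has_elf_magic(good_image), has_elf_magic(bad_image))
```

Because the comparison reads four bytes, the object being compared must be known to be at least four bytes long, which is why declaring the symbols as arrays satisfies the fortified memcmp check.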
2018-03-24ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMUArnd Bergmann
Without CONFIG_MMU, this results in a build failure: ./arch/arm/include/asm/memory.h:92:23: error: initializer element is not constant #define VECTORS_BASE vectors_base arch/arm/mm/dump.c:32:4: note: in expansion of macro 'VECTORS_BASE' { VECTORS_BASE, "Vectors" }, arch/arm/mm/dump.c:71:11: error: 'L_PTE_USER' undeclared here (not in a function); did you mean 'VTIME_USER'? .mask = L_PTE_USER, ^~~~~~~~~~ Obviously the feature only makes sense with an MMU, so let's add the dependency here. Fixes: a8e53c151fe7 ("ARM: 8737/1: mm: dump: add checking for writable and executable") Acked-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]Fabio Estevam
Commit 384b38b66947 ("ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu") fixed the cpu dying notifier by clearing vfp_current_hw_state[]. However commit e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine") incorrectly used the original vfp_force_reload() function in the cpu dying notifier. Fix it by going back to clearing vfp_current_hw_state[]. Fixes: e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine") Cc: linux-stable <stable@vger.kernel.org> Reported-by: Kohji Okuno <okuno.kohji@jp.panasonic.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-03-24i2c: i2c-stm32f7: fix no check on returned setupPierre-Yves MORDRET
Check that the returned setup structure is not NULL before assigning it. Fixes: 463a9215f3ca7600b5ff ("i2c: stm32f7: fix setup structure") Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-03-24batman-adv: fix packet loss for broadcasted DHCP packets to a serverLinus Lüssing
DHCP connectivity issues can currently occur if the following conditions are met: 1) A DHCP packet from a client to a server 2) This packet has a multicast destination 3) This destination has a matching entry in the translation table (FF:FF:FF:FF:FF:FF for IPv4, 33:33:00:01:00:02/33:33:00:01:00:03 for IPv6) 4) The orig-node determined by TT for the multicast destination does not match the orig-node determined by best-gateway-selection In this case the DHCP packet will be dropped. The "gateway-out-of-range" check is supposed to only be applied to unicasted DHCP packets to a specific DHCP server. In that case dropping the unicasted frame forces the client to retry via a broadcasted one, but now directed to the new best gateway. A DHCP packet with broadcast/multicast destination is already ensured to always be delivered to the best gateway. Dropping a multicasted DHCP packet here will only prevent completing DHCP as there is no other fallback. So far, it seems the unicast check was implicitly performed by expecting the batadv_transtable_search() to return NULL for multicast destinations. However, a multicast address could have always ended up in the translation table and in fact is now common. To fix this potential loss of a DHCP client-to-server packet to a multicast address this patch adds an explicit multicast destination check to reliably bail out of the gateway-out-of-range check for such destinations. The issue and fix were tested in the following three node setup: - Line topology, A-B-C - A: gateway client, DHCP client - B: gateway server, hop-penalty increased: 30->60, DHCP server - C: gateway server, code modifications to announce FF:FF:FF:FF:FF:FF Without this patch, A would never transmit its DHCP Discover packet due to an always "out-of-range" condition. With this patch, a full DHCP handshake between A and B was possible again.
Fixes: be7af5cf9cae ("batman-adv: refactoring gateway handling code") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
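The explicit multicast destination check comes down to testing the group bit of the destination MAC address, that is, the least significant bit of the first octet. The sketch below models that test; the function names are hypothetical and the gateway logic is reduced to a single predicate for illustration.

```python
def is_multicast_ether_addr(mac):
    # An Ethernet address is multicast (or broadcast) when the least
    # significant bit of its first octet is set.
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def applies_out_of_range_check(dst_mac):
    # Hypothetical predicate: only unicast destinations are subject
    # to the gateway-out-of-range check.
    return not is_multicast_ether_addr(dst_mac)


destinations = [
    "FF:FF:FF:FF:FF:FF",   # IPv4 DHCP broadcast
    "33:33:00:01:00:02",   # IPv6 DHCP multicast
    "02:00:00:00:00:01",   # hypothetical unicast DHCP server address
]

for dst in destinations:
    print(dst, applies_out_of_range_check(dst))
```

Both addresses listed in the commit message have the group bit set, so they bypass the check regardless of what the translation table contains.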
2018-03-24batman-adv: fix multicast-via-unicast transmission with AP isolationLinus Lüssing
For multicast frames AP isolation is only supposed to be checked on the receiving nodes and never on the originating one. Furthermore, the isolation or wifi flag bits should only be interpreted as such for unicast and never multicast TT entries. By injecting flags to the multicast TT entry claimed by a single target node it was verified in tests that this multicast address becomes unreachable, leading to packet loss. Omitting the "src" parameter to the batadv_transtable_search() call successfully skipped the AP isolation check and made the target reachable again. Fixes: 1d8ab8d3c176 ("batman-adv: Modified forwarding behaviour for multicast packets") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-03-24Merge branch 'linus' into x86/dma, to resolve a conflict with upstreamIngo Molnar
Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-24Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
With the cherry-picked perf/urgent commit merged separately we can now merge all the fixes without conflicts. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-24Merge branch 'perf/urgent' into perf/core, to resolve conflictsIngo Molnar
Pick up a cherry-picked commit. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-23net/sched: act_vlan: declare push_vid with host byte orderDavide Caratti
use u16 in place of __be16 to suppress the following sparse warnings: net/sched/act_vlan.c:150:26: warning: incorrect type in assignment (different base types) net/sched/act_vlan.c:150:26: expected restricted __be16 [usertype] push_vid net/sched/act_vlan.c:150:26: got unsigned short net/sched/act_vlan.c:151:21: warning: restricted __be16 degrades to integer net/sched/act_vlan.c:208:26: warning: incorrect type in assignment (different base types) net/sched/act_vlan.c:208:26: expected unsigned short [unsigned] [usertype] tcfv_push_vid net/sched/act_vlan.c:208:26: got restricted __be16 [usertype] push_vid Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
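The reason sparse distinguishes __be16 from a plain 16-bit integer is that the same bytes decode to different values depending on the assumed byte order. A minimal sketch with Python's `struct` module, using a made-up VLAN ID of 100:

```python
import struct

vid = 100  # hypothetical VLAN ID

# Big-endian (network order) encoding, which is what a __be16 holds.
wire_bytes = struct.pack("!H", vid)

# Decoding the same two bytes with the correct and the wrong byte order.
decoded_big_endian = struct.unpack("!H", wire_bytes)[0]
decoded_little_endian = struct.unpack("<H", wire_bytes)[0]

print(wire_bytes.hex(), decoded_big_endian, decoded_little_endian)
```

On a little-endian host, treating a big-endian value as a host-order integer would yield 25600 instead of 100, so a field that always holds a host-order value is more accurately declared as u16.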
2018-03-23net/sched: remove tcf_idr_cleanup()Davide Caratti
tcf_idr_cleanup() is no longer used, so remove it. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23mlxsw: spectrum_span: Prevent duplicate mirrorsIdo Schimmel
In net commit 8175f7c4736f ("mlxsw: spectrum: Prevent duplicate mirrors") we prevented the user from mirroring more than once from a single binding point (port-direction pair). The fix was essentially reverted in a merge conflict resolution when net was merged into net-next. Restore it. Fixes: 03fe2debbb27 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23Merge tag 'trace-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds
Pull kprobe fixes from Steven Rostedt: "The documentation for kprobe events says that symbol offsets can take both a + and - sign to get to before and after the symbol address. But in actuality, the code does not support the minus. This fixes that issue, and adds a few more selftests to kprobe events" * tag 'trace-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: selftests: ftrace: Add a testcase for probepoint selftests: ftrace: Add a testcase for string type with kprobe_event selftests: ftrace: Add probe event argument syntax testcase tracing: probeevent: Fix to support minus offset from symbol
2018-03-23ixgbe: tweak page counting for XDP_REDIRECTBjörn Töpel
The current page counting scheme assumes that the reference count cannot decrease until the received frame is sent to the upper layers of the networking stack. This assumption does not hold for the XDP_REDIRECT action, since a page (pointed out by xdp_buff) can have its reference count decreased via the xdp_do_redirect call. To work around that, we now start off by a large page count and then don't allow a refcount less than two. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbevf: Add XDP queue stats reportingTony Nguyen
XDP stats are included in TX stats, however, they are not reported in TX queue stats since they are setup on different queues. Add reporting for XDP queue stats to provide consistency between the total stats and per queue stats. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbevf: Add support for meta dataTony Nguyen
Add support for XDP meta data when using build skb. Based on commit 366a88fe2f40 ("bpf, ixgbe: add meta data support") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbevf: Delay tail write for XDP packetsTony Nguyen
Current XDP implementation hits the tail on every XDP_TX; change the driver to only hit the tail after packet processing is complete. Based on commit 7379f97a4fce ("ixgbe: delay tail write to every 'n' packets") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbevf: Add support for XDP_TX actionTony Nguyen
This implements the XDP_TX action which is modeled on the ixgbe implementation. However instead of using CPU id to determine which XDP queue to use, this uses the received RX queue index, which is similar to i40e. Doing this eliminates the restriction that number of CPUs not exceed number of XDP queues that ixgbe has. Also, based on the number of queues available, the number of TX queues may be reduced when an XDP program is loaded in order to accommodate the XDP queues. Based largely on commit 33fdc82f0883 ("ixgbe: add support for XDP_TX action") Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbevf: Add XDP support for pass and drop actionsTony Nguyen
Implement XDP_PASS and XDP_DROP based on the ixgbe implementation. Based largely on commit 924708081629 ("ixgbe: add XDP support for pass and drop actions"). Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbe: enable TSO with IPsec offloadShannon Nelson
Fix things up to support TSO offload in conjunction with IPsec hw offload. This raises throughput with IPsec offload on to nearly line rate. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbe: no need for esp trailer if GSOShannon Nelson
There is no need to calculate the trailer length if we're doing a GSO/TSO, as there is no trailer added to the packet data. Also, don't bother clearing the flags field as it was already cleared earlier. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbe: remove unneeded ipsec test in TX pathShannon Nelson
Since the ipsec data fields will be zero anyway in the non-ipsec case, we can remove the conditional jump. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbe: no need for ipsec csum feature checkShannon Nelson
With the patch commit f8aa2696b4af ("esp: check the NETIF_F_HW_ESP_TX_CSUM bit before segmenting") we no longer need to protect ourself from checksum offload requests on IPsec packets, so we can remove the check in our .ndo_features_check callback. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ixgbe: fix read-modify-write in x550 phy setupPaul Greenwalt
Replaced an assignment operation with an OR operation. The variable assignment was overwriting the value read from the PHY register. The OR operation sets only the intended register bits. The bits that were being overwritten are reserved, so the assignment had no functional impact. Reported-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
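The difference between assigning and OR-ing in a read-modify-write sequence can be shown with plain integers. The register value and bit mask below are made up for the example and do not correspond to any real x550 PHY register layout.

```python
def update_by_assignment(read_value, bits):
    # Buggy pattern: the value read from the register is discarded.
    return bits


def update_by_or(read_value, bits):
    # Read-modify-write: keep the existing bits and set only the new ones.
    return read_value | bits


read_value = 0b1010_0000   # hypothetical value read back from the register
bits_to_set = 0b0000_0011  # hypothetical bits the driver wants to set

assigned = update_by_assignment(read_value, bits_to_set)
ored = update_by_or(read_value, bits_to_set)

print(bin(assigned), bin(ored))
```

The assignment clears the upper bits that were read back, while the OR preserves them; in the driver those bits happen to be reserved, which is why the commit message reports no functional impact.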
2018-03-23sched/cpufreq: Rate limits for SCHED_DEADLINEClaudio Scordino
When the SCHED_DEADLINE scheduling class increases the CPU utilization, it should not wait for the rate limit, otherwise it may miss some deadline. Tests using rt-app on Exynos5422 with up to 10 SCHED_DEADLINE tasks have shown reductions of even 10% of deadline misses with a negligible increase of energy consumption (measured through Baylibre Cape). Signed-off-by: Claudio Scordino <claudio@evidence.eu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Todd Kjos <tkjos@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lkml.kernel.org/r/1520937340-2755-1-git-send-email-claudio@evidence.eu.com
2018-03-23ixgbe: add status reg reads to ixgbe_check_removePaul Greenwalt
Add status register reads and delay between reads to ixgbe_check_remove. Registers can read 0xFFFFFFFF during PCI reset, which causes the driver to remove the adapter. The additional status register reads can reduce the chance of this race condition. If the status register is not 0xFFFFFFFF, then ixgbe_check_remove returns the value of the register being read. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-23ALSA: usb-audio: Add native DSD support for TEAC UD-301Nobutaka Okabe
Add native DSD support quirk for TEAC UD-301 DAC, by adding the PID/VID 0644:804a. Signed-off-by: Nobutaka Okabe <nob77413@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-23x86/entry/64: Don't use IST entry for #BP stackAndy Lutomirski
There's nothing IST-worthy about #BP/int3. We don't allow kprobes in the small handful of places in the kernel that run at CPL0 with an invalid stack, and 32-bit kernels have used normal interrupt gates for #BP forever. Furthermore, we don't allow kprobes in places that have usergs while in kernel mode, so "paranoid" is also unnecessary. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2018-03-23perf annotate: Use absolute addresses to calculate jump target offsetsArnaldo Carvalho de Melo
These types of jumps were confusing the annotate browser: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ffffffff81a00020: swapgs <SNIP> │ffffffff81a00128: ↓ jae ffffffff81a00139 <syscall_return_via_sysret+0x53> <SNIP> │ffffffff81a00155: → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> I.e. the syscall_return_via_sysret function is actually "inside" the entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53) are relative to syscall_return_via_sysret, not to entry_SYSCALL_64. Or this may be some artifact in how the assembler marks the start and end of a function and how this ends up in the ELF symtab for vmlinux, i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but just right after it. From readelf -sw vmlinux: 80267: ffffffff81a00020 315 NOTYPE GLOBAL DEFAULT 1 entry_SYSCALL_64 316: ffffffff81a000e6 0 NOTYPE LOCAL DEFAULT 1 syscall_return_via_sysret 0xffffffff81a00020 + 315 > 0xffffffff81a000e6 So instead of looking for offsets after that last '+' sign, calculate offsets for jump target addresses that are inside the function being disassembled from the absolute address, 0xffffffff81a00139 in this case, subtracting from it the objdump address for the start of the function being disassembled, entry_SYSCALL_64() in this case. 
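The calculation can be illustrated with the addresses quoted above. This is a minimal sketch: the struct and function names are hypothetical and are not the ones used in the perf sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a symbol: start address and size from the symtab. */
struct example_sym {
	uint64_t start;
	uint64_t size;
};

/* True when the jump target lies inside the function being disassembled. */
static bool example_target_in_sym(const struct example_sym *sym,
				  uint64_t target)
{
	return target >= sym->start && target < sym->start + sym->size;
}

/* Offset of the target from the start of the function being disassembled. */
static uint64_t example_jump_offset(const struct example_sym *sym,
				    uint64_t target)
{
	return target - sym->start;
}
```

For entry_SYSCALL_64 (start 0xffffffff81a00020, size 315) and the target 0xffffffff81a00139, the offset is 0x119, which matches the `jae 119` shown in the "After" listing of this commit message.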
So, before this patch: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↑ jmp 6c │ mov %cr3,%rdi │ ↑ jmp 62 │ mov %rdi,%rax │ and $0x7ff,%rdi │ bt %rdi,%gs:0x2219a │ ↑ jae 53 │ btr %rdi,%gs:0x2219a │ mov %rax,%rdi │ ↑ jmp 5b After: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> With those at least navigating to the right destination, an improvement for these cases seems to be to somehow mark those inner functions, which in this case could be: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux │syscall_return_via_sysret: │ pop %r15 │ pop %r14 │ pop %r13 │ pop %r12 │ pop %rbp │ pop %rbx │ pop %rsi │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> This all gets much 
better viewed if one uses 'perf report --ignore-vmlinux' forcing the usage of /proc/kcore + /proc/kallsyms, when the above actually gets down to: # perf report --ignore-vmlinux ## do '/64', will show the function names containing '64', ## navigate to /entry_SYSCALL_64_after_hwframe.annotation, ## press 'A' to annotate, then 'P' to print that annotation ## to a file ## From another xterm (or see on screen, this 'P' thing is for ## getting rid of those right side scroll bars/spaces): # cat /entry_SYSCALL_64_after_hwframe.annotation entry_SYSCALL_64_after_hwframe() /proc/kcore Event: cycles:ppp Percent Disassembly of section load0: ffffffff9aa00044 <load0>: 11.97 push %rax 4.85 push %rdi push %rsi 2.59 push %rdx 2.27 push %rcx 0.32 pushq $0xffffffffffffffda 1.29 push %r8 xor %r8d,%r8d 1.62 push %r9 0.65 xor %r9d,%r9d 1.62 push %r10 xor %r10d,%r10d 5.50 push %r11 xor %r11d,%r11d 3.56 push %rbx xor %ebx,%ebx 4.21 push %rbp xor %ebp,%ebp 2.59 push %r12 0.97 xor %r12d,%r12d 3.24 push %r13 xor %r13d,%r13d 2.27 push %r14 xor %r14d,%r14d 4.21 push %r15 xor %r15d,%r15d 0.97 mov %rsp,%rdi 5.50 → callq do_syscall_64 14.56 mov 0x58(%rsp),%rcx 7.44 mov 0x80(%rsp),%r11 0.32 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 shl $0x10,%rcx 0.32 sar $0x10,%rcx 3.24 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 2.27 cmpq $0x33,0x88(%rsp) 1.29 → jne swapgs_restore_regs_and_return_to_usermode mov 0x30(%rsp),%r11 8.74 cmp %r11,0x90(%rsp) → jne swapgs_restore_regs_and_return_to_usermode 0.32 test $0x10100,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 cmpq $0x2b,0xa0(%rsp) 0.65 → jne swapgs_restore_regs_and_return_to_usermode I.e. 
using kallsyms, the function start/end are determined differently than with the vmlinux ELF symtab, and the hits actually go to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the start of entry_SYSCALL_64: ENTRY(entry_SYSCALL_64) UNWIND_HINT_EMPTY <SNIP> pushq $__USER_CS /* pt_regs->cs */ pushq %rcx /* pt_regs->ip */ GLOBAL(entry_SYSCALL_64_after_hwframe) pushq %rax /* pt_regs->orig_ax */ PUSH_AND_CLEAR_REGS rax=$-ENOSYS And it goes and ends at: cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ jne swapgs_restore_regs_and_return_to_usermode /* * We win! This label is here just for ease of understanding * perf profiles. Nothing jumps here. */ syscall_return_via_sysret: /* rcx and r11 are already restored (see code above) */ UNWIND_HINT_EMPTY POP_REGS pop_rdi=0 skip_r11rcx=1 So perhaps some people should really just play with '--ignore-vmlinux' to force /proc/kcore + kallsyms. One idea is to do both, i.e. have a vmlinux annotation and a kcore+kallsyms one, when possible, and even show the patched location, etc. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Defer searching for comma in raw line till it is neededArnaldo Carvalho de Melo
That strchr() in jump__scnprintf() needs to be nuked somehow, as it, IIRC, is already done in jump__parse() and, if needed at scnprintf() time, should be stashed in the struct filled in at parse() time. For now just defer it to just before where it is used. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j0t5hagnphoz9xw07bh3ha3g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
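Deferring the search could look something like the sketch below. The function name, parameters, and output format are hypothetical and much simpler than the real jump__scnprintf(); the point is only that strchr() runs after the early return instead of before it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format a jump operand. When there is no usable offset the raw line is
 * printed and no comma search is performed at all.
 */
static int example_format_jump(char *buf, size_t size, const char *raw,
			       long offset)
{
	const char *c;

	/* Early exit: nothing to resolve, so skip the search entirely. */
	if (offset < 0)
		return snprintf(buf, size, "%s", raw);

	/* Search for the comma only now, right before it is needed. */
	c = strchr(raw, ',');
	if (c != NULL)
		return snprintf(buf, size, "%.*s%lx", (int)(c - raw + 1), raw,
				(unsigned long)offset);

	return snprintf(buf, size, "%lx", (unsigned long)offset);
}
```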