linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-10-08	seccomp: Make duplicate listener detection non-racy	Jann Horn
	Currently, init_listener() tries to prevent adding a filter with SECCOMP_FILTER_FLAG_NEW_LISTENER if one of the existing filters already has a listener. However, this check happens without holding any lock that would prevent another thread from concurrently installing a new filter (potentially with a listener) on top of the ones we already have. Theoretically, this is also a data race: The plain load from current->seccomp.filter can race with concurrent writes to the same location. Fix it by moving the check into the region that holds the siglock to guard against concurrent TSYNC. (The "Fixes" tag points to the commit that introduced the theoretical data race; concurrent installation of another filter with TSYNC only became possible later, in commit 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together").) Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") Reviewed-by: Tycho Andersen <tycho@tycho.pizza> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201005014401.490175-1-jannh@google.com
2020-10-08	seccomp: Move config option SECCOMP to arch/Kconfig	YiFei Zhu
	In order to make adding configurable features into seccomp easier, it's better to have the options at one single location, considering especially that the bulk of seccomp code is arch-independent. An quick look also show that many SECCOMP descriptions are outdated; they talk about /proc rather than prctl. As a result of moving the config option and keeping it default on, architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP on by default prior to this and SECCOMP will be default in this change. Architectures microblaze, mips, powerpc, s390, sh, and sparc have an outdated depend on PROC_FS and this dependency is removed in this change. Suggested-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/ Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> [kees: added HAVE_ARCH_SECCOMP help text, tweaked wording] Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu
2020-10-08	selftests/clone3: Avoid OS-defined clone_args	Kees Cook
	As the UAPI headers start to appear in distros, we need to avoid outdated versions of struct clone_args to be able to test modern features, named "struct __clone_args". Additionally update the struct size macro names to match UAPI names. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075432.u4gis3s2o5qrsb5g@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit	Kees Cook
	Some archs (like powerpc) only support changing the return code during syscall exit when ptrace is used. Test entry vs exit phases for which portions of the syscall number and return values need to be set at which different phases. For non-powerpc, all changes are made during ptrace syscall entry, as before. For powerpc, the syscall number is changed at ptrace syscall entry and the syscall return value is changed on ptrace syscall exit. Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole") Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075300.7iylzof2w5vrutah@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: Allow syscall nr and ret value to be set separately	Kees Cook
	In preparation for setting syscall nr and ret values separately, refactor the helpers to take a pointer to a value, so that a NULL can indicate "do not change this respective value". This is done to keep the regset read/write happening once and in one code path. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075031.j4gruygeugkp2zwd@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08	selftests/seccomp: Record syscall during ptrace entry	Kees Cook
	In preparation for performing actions during ptrace syscall exit, save the syscall number during ptrace syscall entry. Some architectures do no have the syscall number available during ptrace syscall exit. Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921074354.6shkt2e5yhzhj3sn@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-19	selftests/seccomp: powerpc: Fix seccomp return value testing	Kees Cook
	On powerpc, the errno is not inverted, and depends on ccr.so being set. Add this to a powerpc definition of SYSCALL_RET_SET(). Co-developed-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-13-keescook@chromium.org Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
2020-09-19	selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET	Kees Cook
	Instead of special-casing the specific case of shared registers, create a default SYSCALL_RET_SET() macro (mirroring SYSCALL_NUM_SET()), that writes to the SYSCALL_RET register. For architectures that can't set the return value (for whatever reason), they can define SYSCALL_RET_SET() without an associated SYSCALL_RET() macro. This also paves the way for architectures that need to do special things to set the return value (e.g. powerpc). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-12-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Avoid redundant register flushes	Kees Cook
	When none of the registers have changed, don't flush them back. This can happen if the architecture uses a non-register way to change the syscall (e.g. arm64) , and a return value hasn't been written. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-11-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG	Kees Cook
	Consolidate the REGSET logic into the new ARCH_GETREG() and ARCH_SETREG() macros, avoiding more #ifdef code in function bodies. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-10-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG	Kees Cook
	Instead of special-casing the get/set-registers routines, move the HAVE_GETREG logic into the new ARCH_GETREG() and ARCH_SETREG() macros. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-9-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Remove syscall setting #ifdefs	Kees Cook
	With all architectures now using the common SYSCALL_NUM_SET() macro, the arch-specific #ifdef can be removed from change_syscall() itself. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-8-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: mips: Remove O32-specific macro	Kees Cook
	Instead of having the mips O32 macro special-cased, pull the logic into the SYSCALL_NUM() macro. Additionally include the ABI headers, since these appear to have been missing, leaving __NR_O32_Linux undefined. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-7-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro	Kees Cook
	Remove the arm64 special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-6-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: arm: Define SYSCALL_NUM_SET macro	Kees Cook
	Remove the arm special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-5-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: mips: Define SYSCALL_NUM_SET macro	Kees Cook
	Remove the mips special-case in change_syscall(). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-4-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Provide generic syscall setting macro	Kees Cook
	In order to avoid "#ifdef"s in the main function bodies, create a new macro, SYSCALL_NUM_SET(), where arch-specific logic can live. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-3-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Refactor arch register macros to avoid xtensa special case	Kees Cook
	To avoid an xtensa special-case, refactor all arch register macros to take the register variable instead of depending on the macro expanding as a struct member name. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-2-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-19	selftests/seccomp: Use __NR_mknodat instead of __NR_mknod	Kees Cook
	The __NR_mknod syscall doesn't exist on arm64 (only __NR_mknodat). Switch to the modern syscall. Fixes: ad5682184a81 ("selftests/seccomp: Check for EPOLLHUP for user_notif") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20200912110820.597135-16-keescook@chromium.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-09-08	selftests/seccomp: Use bitwise instead of arithmetic operator for flags	Zou Wei
	This silences the following coccinelle warning: "WARNING: sum of probable bitmasks, consider \|" tools/testing/selftests/seccomp/seccomp_bpf.c:3131:17-18: WARNING: sum of probable bitmasks, consider \| tools/testing/selftests/seccomp/seccomp_bpf.c:3133:18-19: WARNING: sum of probable bitmasks, consider \| tools/testing/selftests/seccomp/seccomp_bpf.c:3134:18-19: WARNING: sum of probable bitmasks, consider \| tools/testing/selftests/seccomp/seccomp_bpf.c:3135:18-19: WARNING: sum of probable bitmasks, consider \| Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1586924101-65940-1-git-send-email-zou_wei@huawei.com Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	seccomp: Use current_pt_regs() instead of task_pt_regs(current)	Denis Efremov
	As described in commit a3460a59747c ("new helper: current_pt_regs()"): - arch versions are "optimized versions". - some architectures have task_pt_regs() working only for traced tasks blocked on signal delivery. current_pt_regs() needs to work for all processes. In preparation for adding a coccinelle rule for using current_*(), instead of raw accesses to current members, modify seccomp_do_user_notification(), __seccomp_filter(), __secure_computing() to use current_pt_regs(). Signed-off-by: Denis Efremov <efremov@linux.com> Link: https://lore.kernel.org/r/20200824125921.488311-1-efremov@linux.com [kees: Reworded commit log, add comment to populate_seccomp_data()] Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	selftests/seccomp: Add test for unknown SECCOMP_RET kill behavior	Kees Cook
	While we were testing for the behavior of unknown seccomp filter return values, there was no test for how it acted in a thread group. Add a test in the thread group tests for this. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	seccomp: kill process instead of thread for unknown actions	Rich Felker
	Asynchronous termination of a thread outside of the userspace thread library's knowledge is an unsafe operation that leaves the process in an inconsistent, corrupt, and possibly unrecoverable state. In order to make new actions that may be added in the future safe on kernels not aware of them, change the default action from SECCOMP_RET_KILL_THREAD to SECCOMP_RET_KILL_PROCESS. Signed-off-by: Rich Felker <dalias@libc.org> Link: https://lore.kernel.org/r/20200829015609.GA32566@brightrain.aerifal.cx [kees: Fixed up coredump selection logic to match] Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	seccomp: don't leave dangling ->notif if file allocation fails	Tycho Andersen
	Christian and Kees both pointed out that this is a bit sloppy to open-code both places, and Christian points out that we leave a dangling pointer to ->notif if file allocation fails. Since we check ->notif for null in order to determine if it's ok to install a filter, this means people won't be able to install a filter if the file allocation fails for some reason, even if they subsequently should be able to. To fix this, let's hoist this free+null into its own little helper and use it. Reported-by: Kees Cook <keescook@chromium.org> Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200902140953.1201956-1-tycho@tycho.pizza Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	mailmap, MAINTAINERS: move to tycho.pizza	Tycho Andersen
	I've changed my e-mail address to tycho.pizza, so let's reflect that in these files. Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200902014017.934315-2-tycho@tycho.pizza Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08	seccomp: don't leak memory when filter install races	Tycho Andersen
	In seccomp_set_mode_filter() with TSYNC \| NEW_LISTENER, we first initialize the listener fd, then check to see if we can actually use it later in seccomp_may_assign_mode(), which can fail if anyone else in our thread group has installed a filter and caused some divergence. If we can't, we partially clean up the newly allocated file: we put the fd, put the file, but don't actually clean up the memory that was allocated at filter->notif. Let's clean that up too. To accomplish this, let's hoist the actual "detach a notifier from a filter" code to its own helper out of seccomp_notify_release(), so that in case anyone adds stuff to init_listener(), they only have to add the cleanup code in one spot. This does a bit of extra locking and such on the failure path when the filter is not attached, but it's a slow failure path anyway. Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together") Reported-by: syzbot+3ad9614a12f80994c32e@syzkaller.appspotmail.com Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200902014017.934315-1-tycho@tycho.pizza Signed-off-by: Kees Cook <keescook@chromium.org>
2020-08-23	Linux 5.9-rc2v5.9-rc2	Linus Torvalds

2020-08-23	Merge tag 'powerpc-5.9-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Add perf support for emitting extended registers for power10. - A fix for CPU hotplug on pseries, where on large/loaded systems we may not wait long enough for the CPU to be offlined, leading to crashes. - Addition of a raw cputable entry for Power10, which is not required to boot, but is required to make our PMU setup work correctly in guests. - Three fixes for the recent changes on 32-bit Book3S to move modules into their own segment for strict RWX. - A fix for a recent change in our powernv PCI code that could lead to crashes. - A change to our perf interrupt accounting to avoid soft lockups when using some events, found by syzkaller. - A change in the way we handle power loss events from the hypervisor on pseries. We no longer immediately shut down if we're told we're running on a UPS. - A few other minor fixes. Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz, Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth, Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann, Vaidyanathan Srinivasan, Vasant Hegde. * tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000 powerpc/pseries: Do not initiate shutdown when system is running on UPS powerpc/perf: Fix soft lockups due to missed interrupt accounting powerpc/powernv/pci: Fix possible crash when releasing DMA resources powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 powerpc/fixmap: Fix the size of the early debug area powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled powerpc/kernel: Cleanup machine check function declarations powerpc: Add POWER10 raw mode cputable entry powerpc/perf: Add extended regs support for power10 platform powerpc/perf: Add support for outputting extended regs in perf intr_regs powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
2020-08-23	Merge tag 'x86-urgent-2020-08-23' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for x86 which removes the RDPID usage from the paranoid entry path and unconditionally uses LSL to retrieve the CPU number. RDPID depends on MSR_TSX_AUX. KVM has an optmization to avoid expensive MRS read/writes on VMENTER/EXIT. It caches the MSR values and restores them either when leaving the run loop, on preemption or when going out to user space. MSR_TSX_AUX is part of that lazy MSR set, so after writing the guest value and before the lazy restore any exception using the paranoid entry will read the guest value and use it as CPU number to retrieve the GSBASE value for the current CPU when FSGSBASE is enabled. As RDPID is only used in that particular entry path, there is no reason to burden VMENTER/EXIT with two extra MSR writes. Remove the RDPID optimization, which is not even backed by numbers from the paranoid entry path instead" * tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM
2020-08-23	Merge tag 'perf-urgent-2020-08-23' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Thomas Gleixner: "A single update for perf on x86 which has support for the broken down bandwith counters" * tag 'perf-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add BW counters for GT, IA and IO breakdown
2020-08-23	Merge tag 'efi-urgent-2020-08-23' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Thomas Gleixner: - Enforce NX on RO data in mixed EFI mode - Destroy workqueue in an error handling path to prevent UAF - Stop argument parser at '--' which is the delimiter for init - Treat a NULL command line pointer as empty instead of dereferncing it unconditionally. - Handle an unterminated command line correctly - Cleanup the 32bit code leftovers and remove obsolete documentation * tag 'efi-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: efi: remove description of efi=old_map efi/x86: Move 32-bit code into efi_32.c efi/libstub: Handle unterminated cmdline efi/libstub: Handle NULL cmdline efi/libstub: Stop parsing arguments at "--" efi: add missed destroy_workqueue when efisubsys_init fails efi/x86: Mark kernel rodata non-executable for mixed mode
2020-08-23	Merge tag 'core-urgent-2020-08-23' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull entry fix from Thomas Gleixner: "A single bug fix for the common entry code. The transcription of the x86 version messed up the reload of the syscall number from pt_regs after ptrace and seccomp which breaks syscall number rewriting" * tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: core/entry: Respect syscall number rewrites
2020-08-23	Merge tag 'edac_urgent_for_v5.9_rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: "A single fix correcting a reversed error severity determination check which lead to a recoverable error getting marked as fatal, by Tony Luck" * tag 'edac_urgent_for_v5.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/{i7core,sb,pnd2,skx}: Fix error event severity
2020-08-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Linus Torvalds
	Pull networking fixes from David Miller: "Nothing earth shattering here, lots of small fixes (f.e. missing RCU protection, bad ref counting, missing memset(), etc.) all over the place: 1) Use get_file_rcu() in task_file iterator, from Yonghong Song. 2) There are two ways to set remote source MAC addresses in macvlan driver, but only one of which validates things properly. Fix this. From Alvin Šipraga. 3) Missing of_node_put() in gianfar probing, from Sumera Priyadarsini. 4) Preserve device wanted feature bits across multiple netlink ethtool requests, from Maxim Mikityanskiy. 5) Fix rcu_sched stall in task and task_file bpf iterators, from Yonghong Song. 6) Avoid reset after device destroy in ena driver, from Shay Agroskin. 7) Missing memset() in netlink policy export reallocation path, from Johannes Berg. 8) Fix info leak in __smc_diag_dump(), from Peilin Ye. 9) Decapsulate ECN properly for ipv6 in ipv4 tunnels, from Mark Tomlinson. 10) Fix number of data stream negotiation in SCTP, from David Laight. 11) Fix double free in connection tracker action module, from Alaa Hleihel. 12) Don't allow empty NHA_GROUP attributes, from Nikolay Aleksandrov" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits) net: nexthop: don't allow empty NHA_GROUP bpf: Fix two typos in uapi/linux/bpf.h net: dsa: b53: check for timeout tipc: call rcu_read_lock() in tipc_aead_encrypt_done() net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow net: sctp: Fix negotiation of the number of data streams. dt-bindings: net: renesas, ether: Improve schema validation gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit() hv_netvsc: Remove "unlikely" from netvsc_select_queue bpf: selftests: global_funcs: Check err_str before strstr bpf: xdp: Fix XDP mode when no mode flags specified selftests/bpf: Remove test_align leftovers tools/resolve_btfids: Fix sections with wrong alignment net/smc: Prevent kernel-infoleak in __smc_diag_dump() sfc: fix build warnings on 32-bit net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified" libbpf: Fix map index used in error message net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe() net: atlantic: Use readx_poll_timeout() for large timeout ...
2020-08-22	Merge branch 'work.epoll' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull epoll fixes from Al Viro: "Fix reference counting and clean up exit paths" * 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do_epoll_ctl(): clean the failure exits up a bit epoll: Keep a reference on files added to the check list
2020-08-22	do_epoll_ctl(): clean the failure exits up a bit	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-22	epoll: Keep a reference on files added to the check list	Marc Zyngier
	When adding a new fd to an epoll, and that this new fd is an epoll fd itself, we recursively scan the fds attached to it to detect cycles, and add non-epool files to a "check list" that gets subsequently parsed. However, this check list isn't completely safe when deletions can happen concurrently. To sidestep the issue, make sure that a struct file placed on the check list sees its f_count increased, ensuring that a concurrent deletion won't result in the file disapearing from under our feet. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-22	net: nexthop: don't allow empty NHA_GROUP	Nikolay Aleksandrov
	Currently the nexthop code will use an empty NHA_GROUP attribute, but it requires at least 1 entry in order to function properly. Otherwise we end up derefencing null or random pointers all over the place due to not having any nh_grp_entry members allocated, nexthop code relies on having at least the first member present. Empty NHA_GROUP doesn't make any sense so just disallow it. Also add a WARN_ON for any future users of nexthop_create_group(). BUG: kernel NULL pointer dereference, address: 0000000000000080 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 0 PID: 558 Comm: ip Not tainted 5.9.0-rc1+ #93 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:fib_check_nexthop+0x4a/0xaa Code: 0f 84 83 00 00 00 48 c7 02 80 03 f7 81 c3 40 80 fe fe 75 12 b8 ea ff ff ff 48 85 d2 74 6b 48 c7 02 40 03 f7 81 c3 48 8b 40 10 <48> 8b 80 80 00 00 00 eb 36 80 78 1a 00 74 12 b8 ea ff ff ff 48 85 RSP: 0018:ffff88807983ba00 EFLAGS: 00010213 RAX: 0000000000000000 RBX: ffff88807983bc00 RCX: 0000000000000000 RDX: ffff88807983bc00 RSI: 0000000000000000 RDI: ffff88807bdd0a80 RBP: ffff88807983baf8 R08: 0000000000000dc0 R09: 000000000000040a R10: 0000000000000000 R11: ffff88807bdd0ae8 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88807bea3100 R15: 0000000000000001 FS: 00007f10db393700(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000080 CR3: 000000007bd0f004 CR4: 00000000003706f0 Call Trace: fib_create_info+0x64d/0xaf7 fib_table_insert+0xf6/0x581 ? __vma_adjust+0x3b6/0x4d4 inet_rtm_newroute+0x56/0x70 rtnetlink_rcv_msg+0x1e3/0x20d ? rtnl_calcit.isra.0+0xb8/0xb8 netlink_rcv_skb+0x5b/0xac netlink_unicast+0xfa/0x17b netlink_sendmsg+0x334/0x353 sock_sendmsg_nosec+0xf/0x3f ____sys_sendmsg+0x1a0/0x1fc ? copy_msghdr_from_user+0x4c/0x61 ___sys_sendmsg+0x63/0x84 ? handle_mm_fault+0xa39/0x11b5 ? sockfd_lookup_light+0x72/0x9a __sys_sendmsg+0x50/0x6e do_syscall_64+0x54/0xbe entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f10dacc0bb7 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 9a 4b 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 f2 2a 00 f7 d8 64 89 02 48 RSP: 002b:00007ffcbe628bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffcbe628f80 RCX: 00007f10dacc0bb7 RDX: 0000000000000000 RSI: 00007ffcbe628c60 RDI: 0000000000000003 RBP: 000000005f41099c R08: 0000000000000001 R09: 0000000000000008 R10: 00000000000005e9 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffcbe628d70 R15: 0000563a86c6e440 Modules linked in: CR2: 0000000000000080 CC: David Ahern <dsahern@gmail.com> Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Reported-by: syzbot+a61aa19b0c14c8770bd9@syzkaller.appspotmail.com Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-22	Merge tag 'kbuild-fixes-v5.9' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - move -Wsign-compare warning from W=2 to W=3 - fix the keyword _restrict to __restrict in genksyms - fix more bugs in qconf * tag 'kbuild-fixes-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: qconf: replace deprecated QString::sprintf() with QTextStream kconfig: qconf: remove redundant help in the info view kconfig: qconf: remove qInfo() to get back Qt4 support kconfig: qconf: remove unused colNr kconfig: qconf: fix the popup menu in the ConfigInfoView window kconfig: qconf: fix signal connection to invalid slots genksyms: keywords: Use __restrict not _restrict kbuild: remove redundant patterns in filter/filter-out extract-cert: add static to local data Makefile.extrawarn: Move sign-compare from W=2 to W=3
2020-08-22	Merge tag 'arm64-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Allow booting of late secondary CPUs affected by erratum 1418040 (currently they are parked if none of the early CPUs are affected by this erratum). - Add the 32-bit vdso Makefile to the vdso_install rule so that 'make vdso_install' installs the 32-bit compat vdso when it is compiled. - Print a warning that untrusted guests without a CPU erratum workaround (Cortex-A57 832075) may deadlock the affected system. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ARM64: vdso32: Install vdso32 from vdso_install KVM: arm64: Print warning when cpu erratum can cause guests to deadlock arm64: Allow booting of late CPUs affected by erratum 1418040 arm64: Move handling of erratum 1418040 into C code
2020-08-22	Merge tag 's390-5.9-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - a couple of fixes for storage key handling relevant for debugging - add cond_resched into potentially slow subchannels scanning loop - fixes for PF/VF linking and to ignore stale PCI configuration request events * tag 's390-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: fix PF/VF linking on hot plug s390/pci: re-introduce zpci_remove_device() s390/pci: fix zpci_bus_link_virtfn() s390/ptrace: fix storage key handling s390/runtime_instrumentation: fix storage key handling s390/pci: ignore stale configuration request event s390/cio: add cond_resched() in the slow_eval_known_fn() loop
2020-08-22	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: - PAE and PKU bugfixes for x86 - selftests fix for new binutils - MMU notifier fix for arm64 * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode KVM: x86: fix access code passed to gva_to_gpa selftests: kvm: Use a shorter encoding to clear RAX
2020-08-22	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "23 fixes in 5 drivers (qla2xxx, ufs, scsi_debug, fcoe, zfcp). The bulk of the changes are in qla2xxx and ufs and all are mostly small and definitely don't impact the core" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits) Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe" Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command" scsi: qla2xxx: Fix null pointer access during disconnect from subsystem scsi: qla2xxx: Check if FW supports MQ before enabling scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set anytime scsi: qla2xxx: Reduce noisy debug message scsi: qla2xxx: Fix login timeout scsi: qla2xxx: Indicate correct supported speeds for Mezz card scsi: qla2xxx: Flush I/O on zone disable scsi: qla2xxx: Flush all sessions on zone disable scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values scsi: scsi_debug: Fix scp is NULL errors scsi: zfcp: Fix use-after-free in request timeout handlers scsi: ufs: No need to send Abort Task if the task in DB was cleared scsi: ufs: Clean up completed request without interrupt notification scsi: ufs: Improve interrupt handling for shared interrupts scsi: ufs: Fix interrupt error message for shared interrupts scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL scsi: ufs-mediatek: Fix incorrect time to wait link status ...
2020-08-22	Merge tag 'devicetree-fixes-for-5.9-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: "Another set of DT fixes: - restore range parsing error check - workaround PCI range parsing with missing 'device_type' now required - correct description of 'phy-connection-type' - fix erroneous matching on 'snps,dw-pcie' by 'intel,lgm-pcie' schema - a couple of grammar and whitespace fixes - update Shawn Guo's email" * tag 'devicetree-fixes-for-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: vendor-prefixes: Remove trailing whitespace dt-bindings: net: correct description of phy-connection-type dt-bindings: PCI: intel,lgm-pcie: Fix matching on all snps,dw-pcie instances of: address: Work around missing device_type property in pcie nodes dt: writing-schema: Miscellaneous grammar fixes dt-bindings: Use Shawn Guo's preferred e-mail for i.MX bindings of/address: check for invalid range.cpu_addr
2020-08-21	dt-bindings: vendor-prefixes: Remove trailing whitespace	Geert Uytterhoeven
	Fixes: f516fb704d02fff2 ("dt-bindings: Whitespace clean-ups in schema files") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20200819092058.1526-1-geert+renesas@glider.be Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-21	KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set	Will Deacon
	When an MMU notifier call results in unmapping a range that spans multiple PGDs, we end up calling into cond_resched_lock() when crossing a PGD boundary, since this avoids running into RCU stalls during VM teardown. Unfortunately, if the VM is destroyed as a result of OOM, then blocking is not permitted and the call to the scheduler triggers the following BUG(): \| BUG: sleeping function called from invalid context at arch/arm64/kvm/mmu.c:394 \| in_atomic(): 1, irqs_disabled(): 0, non_block: 1, pid: 36, name: oom_reaper \| INFO: lockdep is turned off. \| CPU: 3 PID: 36 Comm: oom_reaper Not tainted 5.8.0 #1 \| Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 \| Call trace: \| dump_backtrace+0x0/0x284 \| show_stack+0x1c/0x28 \| dump_stack+0xf0/0x1a4 \| ___might_sleep+0x2bc/0x2cc \| unmap_stage2_range+0x160/0x1ac \| kvm_unmap_hva_range+0x1a0/0x1c8 \| kvm_mmu_notifier_invalidate_range_start+0x8c/0xf8 \| __mmu_notifier_invalidate_range_start+0x218/0x31c \| mmu_notifier_invalidate_range_start_nonblock+0x78/0xb0 \| __oom_reap_task_mm+0x128/0x268 \| oom_reap_task+0xac/0x298 \| oom_reaper+0x178/0x17c \| kthread+0x1e4/0x1fc \| ret_from_fork+0x10/0x30 Use the new 'flags' argument to kvm_unmap_hva_range() to ensure that we only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is set in the notifier flags. Cc: <stable@vger.kernel.org> Fixes: 8b3405e345b5 ("kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd") Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-3-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21	KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()	Will Deacon
	The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21	dt-bindings: net: correct description of phy-connection-type	Madalin Bucur
	The phy-connection-type parameter is described in ePAPR 1.1: Specifies interface type between the Ethernet device and a physical layer (PHY) device. The value of this property is specific to the implementation. Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://lore.kernel.org/r/1597917724-11127-1-git-send-email-madalin.bucur@oss.nxp.com Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-21	Merge tag 'io_uring-5.9-2020-08-21' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull io_uring fixes from Jens Axboe: - Make sure the head link cancelation includes async work - Get rid of kiocb_wait_page_queue_init(), makes no sense to have it as a separate function since you moved it into io_uring itself - io_import_iovec cleanups (Pavel, me) - Use system_unbound_wq for ring exit work, to avoid spawning tons of these if we have tons of rings exiting at the same time - Fix req->flags overflow flag manipulation (Pavel) * tag 'io_uring-5.9-2020-08-21' of git://git.kernel.dk/linux-block: io_uring: kill extra iovec=NULL in import_iovec() io_uring: comment on kfree(iovec) checks io_uring: fix racy req->flags modification io_uring: use system_unbound_wq for ring exit work io_uring: cleanup io_import_iovec() of pre-mapped request io_uring: get rid of kiocb_wait_page_queue_init() io_uring: find and cancel head link async work on files exit
2020-08-21	dt-bindings: PCI: intel,lgm-pcie: Fix matching on all snps,dw-pcie instances	Rob Herring
	The intel,lgm-pcie binding is matching on all snps,dw-pcie instances which is wrong. Add a custom 'select' entry to fix this. Fixes: e54ea45a4955 ("dt-bindings: PCI: intel: Add YAML schemas for the PCIe RC controller") Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Reviewed-by: Dilip Kota <eswara.kota@linux.intel.com> Signed-off-by: Rob Herring <robh@kernel.org>