linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-09-29	ftrace: Move RCU is watching check after recursion check	Steven Rostedt (VMware)
	The first thing that the ftrace function callback helper functions should do is to check for recursion. Peter Zijlstra found that when "rcu_is_watching()" had its notrace removed, it caused perf function tracing to crash. This is because the call of rcu_is_watching() is tested before function recursion is checked and and if it is traced, it will cause an infinite recursion loop. rcu_is_watching() should still stay notrace, but to prevent this should never had crashed in the first place. The recursion prevention must be the first thing done in callback functions. Link: https://lore.kernel.org/r/20200929112541.GM2628@hirez.programming.kicks-ass.net Cc: stable@vger.kernel.org Cc: Paul McKenney <paulmck@kernel.org> Fixes: c68c0fa293417 ("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-29	tracing: Fix trace_find_next_entry() accounting of temp buffer size	Steven Rostedt (VMware)
	The temp buffer size variable for trace_find_next_entry() was incorrectly being updated when the size did not change. The temp buffer size should only be updated when it is reallocated. This is mostly an issue when used with ftrace_dump(). That's because ftrace_dump() can not allocate a new buffer, and instead uses a temporary buffer with a fix size. But the variable that keeps track of that size is incorrectly updated with each call, and it could fall into the path that would try to reallocate the buffer and produce a warning. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1601 at kernel/trace/trace.c:3548 trace_find_next_entry+0xd0/0xe0 Modules linked in [..] CPU: 1 PID: 1601 Comm: bash Not tainted 5.9.0-rc5-test+ #521 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:trace_find_next_entry+0xd0/0xe0 Code: 40 21 00 00 4c 89 e1 31 d2 4c 89 ee 48 89 df e8 c6 9e ff ff 89 ab 54 21 00 00 5b 5d 41 5c 41 5d c3 48 63 d5 eb bf 31 c0 eb f0 <0f> 0b 48 63 d5 eb b4 66 0f 1f 84 00 00 00 00 00 53 48 8d 8f 60 21 RSP: 0018:ffff95a4f2e8bd70 EFLAGS: 00010046 RAX: ffffffff96679fc0 RBX: ffffffff97910de0 RCX: ffffffff96679fc0 RDX: ffff95a4f2e8bd98 RSI: ffff95a4ee321098 RDI: ffffffff97913000 RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000046 R12: ffff95a4f2e8bd98 R13: 0000000000000000 R14: ffff95a4ee321098 R15: 00000000009aa301 FS: 00007f8565484740(0000) GS:ffff95a55aa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055876bd43d90 CR3: 00000000b76e6003 CR4: 00000000001706e0 Call Trace: trace_print_lat_context+0x58/0x2d0 ? cpumask_next+0x16/0x20 print_trace_line+0x1a4/0x4f0 ftrace_dump.cold+0xad/0x12c __handle_sysrq.cold+0x51/0x126 write_sysrq_trigger+0x3f/0x4a proc_reg_write+0x53/0x80 vfs_write+0xca/0x210 ksys_write+0x70/0xf0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f8565579487 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffd40707948 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f8565579487 RDX: 0000000000000002 RSI: 000055876bd74de0 RDI: 0000000000000001 RBP: 000055876bd74de0 R08: 000000000000000a R09: 0000000000000001 R10: 000055876bdec280 R11: 0000000000000246 R12: 0000000000000002 R13: 00007f856564a500 R14: 0000000000000002 R15: 00007f856564a700 irq event stamp: 109958 ---[ end trace 7aab5b7e51484b00 ]--- Not only fix the updating of the temp buffer, but also do not free the temp buffer before a new buffer is allocated (there's no reason to not continue to use the current temp buffer if an allocation fails). Cc: stable@vger.kernel.org Fixes: 8e99cf91b99bb ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic") Reported-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-29	Merge tag 'phy-fixes-2-5.9' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy into usb-linus Vinod writes: phy: Second round of fixes for 5.9 ) Fix of leak in TI phy driver tag 'phy-fixes-2-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: ti: am654: Fix a leak in serdes_am654_probe()
2020-09-29	arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option	Will Deacon
	The PR_SPEC_DISABLE_NOEXEC option to the PR_SPEC_STORE_BYPASS prctl() allows the SSB mitigation to be enabled only until the next execve(), at which point the state will revert back to PR_SPEC_ENABLE and the mitigation will be disabled. Add support for PR_SPEC_DISABLE_NOEXEC on arm64. Reported-by: Anthony Steinhauser <asteinhauser@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Pull in task_stack_page() to Spectre-v4 mitigation code	Will Deacon
	The kbuild robot reports that we're relying on an implicit inclusion to get a definition of task_stack_page() in the Spectre-v4 mitigation code, which is not always in place for some configurations: \| arch/arm64/kernel/proton-pack.c:329:2: error: implicit declaration of function 'task_stack_page' [-Werror,-Wimplicit-function-declaration] \| task_pt_regs(task)->pstate \|= val; \| ^ \| arch/arm64/include/asm/processor.h:268:36: note: expanded from macro 'task_pt_regs' \| ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1) \| ^ \| arch/arm64/kernel/proton-pack.c:329:2: note: did you mean 'task_spread_page'? Add the missing include to fix the build error. Fixes: a44acf477220 ("arm64: Move SSBD prctl() handler alongside other spectre mitigation code") Reported-by: Anthony Steinhauser <asteinhauser@google.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/202009260013.Ul7AD29w%lkp@intel.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled	Will Deacon
	Patching the EL2 exception vectors is integral to the Spectre-v2 workaround, where it can be necessary to execute CPU-specific sequences to nobble the branch predictor before running the hypervisor text proper. Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors to be patched even when KASLR is not enabled. Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Get rid of arm64_ssbd_state	Marc Zyngier
	Out with the old ghost, in with the new... Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()	Marc Zyngier
	Convert the KVM WA2 code to using the Spectre infrastructure, making the code much more readable. It also allows us to take SSBS into account for the mitigation. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Get rid of kvm_arm_have_ssbd()	Marc Zyngier
	kvm_arm_have_ssbd() is now completely unused, get rid of it. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Simplify handling of ARCH_WORKAROUND_2	Marc Zyngier
	Owing to the fact that the host kernel is always mitigated, we can drastically simplify the WA2 handling by keeping the mitigation state ON when entering the guest. This means the guest is either unaffected or not mitigated. This results in a nice simplification of the mitigation space, and the removal of a lot of code that was never really used anyway. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Rewrite Spectre-v4 mitigation code	Will Deacon
	Rewrite the Spectre-v4 mitigation handling code to follow the same approach as that taken by Spectre-v2. For now, report to KVM that the system is vulnerable (by forcing 'ssbd_state' to ARM64_SSBD_UNKNOWN), as this will be cleared up in subsequent steps. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Move SSBD prctl() handler alongside other spectre mitigation code	Will Deacon
	As part of the spectre consolidation effort to shift all of the ghosts into their own proton pack, move all of the horrible SSBD prctl() code out of its own 'ssbd.c' file. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4	Will Deacon
	In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't _entirely_ accurate, as we also need to take into account the interaction with SSBS, but that will be taken care of in subsequent patches. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Treat SSBS as a non-strict system feature	Will Deacon
	If all CPUs discovered during boot have SSBS, then spectre-v4 will be considered to be "mitigated". However, we still allow late CPUs without SSBS to be onlined, albeit with a "SANITY CHECK" warning. This is problematic for userspace because it means that the system can quietly transition to "Vulnerable" at runtime. Avoid this by treating SSBS as a non-strict system feature: if all of the CPUs discovered during boot have SSBS, then late arriving secondaries better have it as well. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Group start_thread() functions together	Will Deacon
	The is_ttbrX_addr() functions have somehow ended up in the middle of the start_thread() functions, so move them out of the way to keep the code readable. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2	Marc Zyngier
	If the system is not affected by Spectre-v2, then advertise to the KVM guest that it is not affected, without the need for a safelist in the guest. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Rewrite Spectre-v2 mitigation code	Will Deacon
	The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain. This is largely due to it being written hastily, without much clue as to how things would pan out, and also because it ends up mixing policy and state in such a way that it is very difficult to figure out what's going on. Rewrite the Spectre-v2 mitigation so that it clearly separates state from policy and follows a more structured approach to handling the mitigation. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Introduce separate file for spectre mitigations and reporting	Will Deacon
	The spectre mitigation code is spread over a few different files, which makes it both hard to follow, but also hard to remove it should we want to do that in future. Introduce a new file for housing the spectre mitigations, and populate it with the spectre-v1 reporting code to start with. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2	Will Deacon
	For better or worse, the world knows about "Spectre" and not about "Branch predictor hardening". Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2 as part of moving all of the Spectre mitigations into their own little corner. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Simplify install_bp_hardening_cb()	Will Deacon
	Use is_hyp_mode_available() to detect whether or not we need to patch the KVM vectors for branch hardening, which avoids the need to take the vector pointers as parameters. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE	Will Deacon
	The removal of CONFIG_HARDEN_BRANCH_PREDICTOR means that CONFIG_KVM_INDIRECT_VECTORS is synonymous with CONFIG_RANDOMIZE_BASE, so replace it. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Remove Spectre-related CONFIG_* options	Will Deacon
	The spectre mitigations are too configurable for their own good, leading to confusing logic trying to figure out when we should mitigate and when we shouldn't. Although the plethora of command-line options need to stick around for backwards compatibility, the default-on CONFIG options that depend on EXPERT can be dropped, as the mitigations only do anything if the system is vulnerable, a mitigation is available and the command-line hasn't disabled it. Remove CONFIG_HARDEN_BRANCH_PREDICTOR and CONFIG_ARM64_SSBD in favour of enabling this code unconditionally. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs	Marc Zyngier
	Commit 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") changed the way we deal with per-CPU errata by only calling the .matches() callback until one CPU is found to be affected. At this point, .matches() stop being called, and .cpu_enable() will be called on all CPUs. This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be mitigated. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") Cc: <stable@vger.kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	bpf, powerpc: Fix misuse of fallthrough in bpf_jit_comp()	He Zhe
	The user defined label following "fallthrough" is not considered by GCC and causes build failure. kernel-source/include/linux/compiler_attributes.h:208:41: error: attribute 'fallthrough' not preceding a case label or default label [-Werror] 208 define fallthrough _attribute((fallthrough_)) ^~~~~~~~~~~~~ Fixes: df561f6688fe ("treewide: Use fallthrough pseudo-keyword") Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/bpf/20200928090023.38117-1-zhe.he@windriver.com
2020-09-29	blk-mq: call commit_rqs while list empty but error happen	yangerkun
	Blk-mq should call commit_rqs once 'bd.last != true' and no more request will come(so virtscsi can kick the virtqueue, e.g.). We already do that in 'blk_mq_dispatch_rq_list/blk_mq_try_issue_list_directly' while list not empty and 'queued > 0'. However, we can seen the same scene once the last request in list call queue_rq and return error like BLK_STS_IOERR which will not requeue the request, and lead that list empty but need call commit_rqs too(Or the request for virtscsi will stay timeout until other request kick virtqueue). We found this problem by do fsstress test with offline/online virtscsi device repeat quickly. Fixes: d666ba98f849 ("blk-mq: add mq_ops->commit_rqs()") Reported-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: yangerkun <yangerkun@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-29	io_uring: fix async buffered reads when readahead is disabled	Hao Xu
	The async buffered reads feature is not working when readahead is turned off. There are two things to concern: - when doing retry in io_read, not only the IOCB_WAITQ flag but also the IOCB_NOWAIT flag is still set, which makes it goes to would_block phase in generic_file_buffered_read() and then return -EAGAIN. After that, the io-wq thread work is queued, and later doing the async reads in the old way. - even if we remove IOCB_NOWAIT when doing retry, the feature is still not running properly, since in generic_file_buffered_read() it goes to lock_page_killable() after calling mapping->a_ops->readpage() to do IO, and thus causing process to sleep. Fixes: 1a0a7853b901 ("mm: support async buffered reads in generic_file_buffered_read()") Fixes: 3b2a4439e0ae ("io_uring: get rid of kiocb_wait_page_queue_init()") Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-29	efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure	Ard Biesheuvel
	Currently, on arm64, we abort on any failure from efi_get_random_bytes() other than EFI_NOT_FOUND when it comes to setting the physical seed for KASLR, but ignore such failures when obtaining the seed for virtual KASLR or for early seeding of the kernel's entropy pool via the config table. This is inconsistent, and may lead to unexpected boot failures. So let's permit any failure for the physical seed, and simply report the error code if it does not equal EFI_NOT_FOUND. Cc: <stable@vger.kernel.org> # v5.8+ Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-29	Merge tag 'gpio-fixes-for-v5.9-rc7' of ↵	Linus Walleij
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into fixes gpio: fixes for v5.9-rc7 - fix uninitialized variable in gpio-pca953x - enable all 160 lines and fix interrupt configuration in gpio-aspeed-gpio - fix ast2600 bank properties in gpio-aspeed
2020-09-29	lockdep: Optimize the memory usage of circular queue	Boqun Feng
	Qian Cai reported a BFS_EQUEUEFULL warning [1] after read recursive deadlock detection merged into tip tree recently. Unlike the previous lockep graph searching, which iterate every lock class (every node in the graph) exactly once, the graph searching for read recurisve deadlock detection needs to iterate every lock dependency (every edge in the graph) once, as a result, the maximum memory cost of the circular queue changes from O(V), where V is the number of lock classes (nodes or vertices) in the graph, to O(E), where E is the number of lock dependencies (edges), because every lock class or dependency gets enqueued once in the BFS. Therefore we hit the BFS_EQUEUEFULL case. However, actually we don't need to enqueue all dependencies for the BFS, because every time we enqueue a dependency, we almostly enqueue all other dependencies in the same dependency list ("almostly" is because we currently check before enqueue, so if a dependency doesn't pass the check stage we won't enqueue it, however, we can always do in reverse ordering), based on this, we can only enqueue the first dependency from a dependency list and every time we want to fetch a new dependency to work, we can either: 1) fetch the dependency next to the current dependency in the dependency list or 2) if the dependency in 1) doesn't exist, fetch the dependency from the queue. With this approach, the "max bfs queue depth" for a x86_64_defconfig + lockdep and selftest config kernel can get descreased from: max bfs queue depth: 201 to (after apply this patch) max bfs queue depth: 61 While I'm at it, clean up the code logic a little (e.g. directly return other than set a "ret" value and goto the "exit" label). [1]: https://lore.kernel.org/lkml/17343f6f7f2438fc376125384133c5ba70c2a681.camel@redhat.com/ Reported-by: Qian Cai <cai@redhat.com> Reported-by: syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200917080210.108095-1-boqun.feng@gmail.com
2020-09-28	ethtool: mark netlink family as __ro_after_init	Jakub Kicinski
	Like all genl families ethtool_genl_family needs to not be a straight up constant, because it's modified/initialized by genl_register_family(). After init, however, it's only passed to genlmsg_put() & co. therefore we can mark it as __ro_after_init. Since genl_family structure contains function pointers mark this as a fix. Fixes: 2b4a8990b7df ("ethtool: introduce ethtool netlink interface") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	genetlink: add missing kdoc for validation flags	Jakub Kicinski
	Validation flags are missing kdoc, add it. Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: usb: ax88179_178a: add MCT usb 3.0 adapter	Wilken Gottwalt
	Adds the driver_info and usb ids of the AX88179 based MCT U3-A9003 USB 3.0 ethernet adapter. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: usb: ax88179_178a: fix missing stop entry in driver_info	Wilken Gottwalt
	Adds the missing .stop entry in the Belkin driver_info structure. Fixes: e20bd60bf62a ("net: usb: asix88179_178a: Add support for the Belkin B2B128") Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	Input: i8042 - add nopnp quirk for Acer Aspire 5 A515	Jiri Kosina
	Touchpad on this laptop is not detected properly during boot, as PNP enumerates (wrongly) AUX port as disabled on this machine. Fix that by adding this board (with admittedly quite funny DMI identifiers) to nopnp quirk list. Reported-by: Andrés Barrantes Silman <andresbs2000@protonmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2009252337340.3336@cbobk.fhfr.pm Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-09-28	Input: trackpoint - enable Synaptics trackpoints	Vincent Huang
	Add Synaptics IDs in trackpoint_start_protocol() to mark them as valid. Signed-off-by: Vincent Huang <vincent.huang@tw.synaptics.com> Fixes: 6c77545af100 ("Input: trackpoint - add new trackpoint variant IDs") Reviewed-by: Harry Cutts <hcutts@chromium.org> Tested-by: Harry Cutts <hcutts@chromium.org> Link: https://lore.kernel.org/r/20200924053013.1056953-1-vincent.huang@tw.synaptics.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-09-28	net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks	Manivannan Sadhasivam
	The rcu read locks are needed to avoid potential race condition while dereferencing radix tree from multiple threads. The issue was identified by syzbot. Below is the crash report: ============================= WARNING: suspicious RCU usage 5.7.0-syzkaller #0 Not tainted ----------------------------- include/linux/radix-tree.h:176 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by kworker/u4:1/21: #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: spin_unlock_irq include/linux/spinlock.h:403 [inline] #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: process_one_work+0x6df/0xfd0 kernel/workqueue.c:2241 #1: ffffc90000dd7d80 ((work_completion)(&qrtr_ns.work)){+.+.}-{0:0}, at: process_one_work+0x71e/0xfd0 kernel/workqueue.c:2243 stack backtrace: CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.7.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: qrtr_ns_handler qrtr_ns_worker Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1e9/0x30e lib/dump_stack.c:118 radix_tree_deref_slot include/linux/radix-tree.h:176 [inline] ctrl_cmd_new_lookup net/qrtr/ns.c:558 [inline] qrtr_ns_worker+0x2aff/0x4500 net/qrtr/ns.c:674 process_one_work+0x76e/0xfd0 kernel/workqueue.c:2268 worker_thread+0xa7f/0x1450 kernel/workqueue.c:2414 kthread+0x353/0x380 kernel/kthread.c:268 Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Reported-and-tested-by: syzbot+0f84f6eed90503da72fc@syzkaller.appspotmail.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	arm64: mte: Fix typo in memory tagging ABI documentation	Will Deacon
	We offer both PTRACE_PEEKMTETAGS and PTRACE_POKEMTETAGS requests via ptrace(). Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	Merge branch 'net-core-fix-a-lockdep-splat-in-the-dev_addr_list'	David S. Miller
	Taehee Yoo says: ==================== net: core: fix a lockdep splat in the dev_addr_list. This patchset is to avoid lockdep splat. When a stacked interface graph is changed, netif_addr_lock() is called recursively and it internally calls spin_lock_nested(). The parameter of spin_lock_nested() is 'dev->lower_level', this is called subclass. The problem of 'dev->lower_level' is that while 'dev->lower_level' is being used as a subclass of spin_lock_nested(), its value can be changed. So, spin_lock_nested() would be called recursively with the same subclass value, the lockdep understands a deadlock. In order to avoid this, a new variable is needed and it is going to be used as a parameter of spin_lock_nested(). The first and second patch is a preparation patch for the third patch. In the third patch, the problem will be fixed. The first patch is to add __netdev_upper_dev_unlink(). An existed netdev_upper_dev_unlink() is renamed to __netdev_upper_dev_unlink(). and netdev_upper_dev_unlink() is added as an wrapper of this function. The second patch is to add the netdev_nested_priv structure. netdev_walk_all_{ upper \| lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. The third patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock. Due to this variable, it can avoid lockdep splat. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: add nested_level variable in net_device	Taehee Yoo
	This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock. netif_addr_lock() can be called recursively so spin_lock_nested() is used instead of spin_lock() and dev->lower_level is used as a parameter of spin_lock_nested(). But, dev->lower_level value can be updated while it is being used. So, lockdep would warn a possible deadlock scenario. When a stacked interface is deleted, netif_{uc \| mc}_sync() is called recursively. So, spin_lock_nested() is called recursively too. At this moment, the dev->lower_level variable is used as a parameter of it. dev->lower_level value is updated when interfaces are being unlinked/linked immediately. Thus, After unlinking, dev->lower_level shouldn't be a parameter of spin_lock_nested(). A (macvlan) \| B (vlan) \| C (bridge) \| D (macvlan) \| E (vlan) \| F (bridge) A->lower_level : 6 B->lower_level : 5 C->lower_level : 4 D->lower_level : 3 E->lower_level : 2 F->lower_level : 1 When an interface 'A' is removed, it releases resources. At this moment, netif_addr_lock() would be called. Then, netdev_upper_dev_unlink() is called recursively. Then dev->lower_level is updated. There is no problem. But, when the bridge module is removed, 'C' and 'F' interfaces are removed at once. If 'F' is removed first, a lower_level value is like below. A->lower_level : 5 B->lower_level : 4 C->lower_level : 3 D->lower_level : 2 E->lower_level : 1 F->lower_level : 1 Then, 'C' is removed. at this moment, netif_addr_lock() is called recursively. The ordering is like this. C(3)->D(2)->E(1)->F(1) At this moment, the lower_level value of 'E' and 'F' are the same. So, lockdep warns a possible deadlock scenario. In order to avoid this problem, a new variable 'nested_level' is added. This value is the same as dev->lower_level - 1. But this value is updated in rtnl_unlock(). So, this variable can be used as a parameter of spin_lock_nested() safely in the rtnl context. Test commands: ip link add br0 type bridge vlan_filtering 1 ip link add vlan1 link br0 type vlan id 10 ip link add macvlan2 link vlan1 type macvlan ip link add br3 type bridge vlan_filtering 1 ip link set macvlan2 master br3 ip link add vlan4 link br3 type vlan id 10 ip link add macvlan5 link vlan4 type macvlan ip link add br6 type bridge vlan_filtering 1 ip link set macvlan5 master br6 ip link add vlan7 link br6 type vlan id 10 ip link add macvlan8 link vlan7 type macvlan ip link set br0 up ip link set vlan1 up ip link set macvlan2 up ip link set br3 up ip link set vlan4 up ip link set macvlan5 up ip link set br6 up ip link set vlan7 up ip link set macvlan8 up modprobe -rv bridge Splat looks like: [ 36.057436][ T744] WARNING: possible recursive locking detected [ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted [ 36.059959][ T744] -------------------------------------------- [ 36.061391][ T744] ip/744 is trying to acquire lock: [ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30 [ 36.064922][ T744] [ 36.064922][ T744] but task is already holding lock: [ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.068851][ T744] [ 36.068851][ T744] other info that might help us debug this: [ 36.070731][ T744] Possible unsafe locking scenario: [ 36.070731][ T744] [ 36.072497][ T744] CPU0 [ 36.073238][ T744] ---- [ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.076590][ T744] [ 36.076590][ T744] * DEADLOCK * [ 36.076590][ T744] [ 36.078515][ T744] May be due to missing lock nesting notation [ 36.078515][ T744] [ 36.080491][ T744] 3 locks held by ip/744: [ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490 [ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80 [ 36.088400][ T744] [ 36.088400][ T744] stack backtrace: [ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728 [ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 36.093630][ T744] Call Trace: [ 36.094416][ T744] dump_stack+0x77/0x9b [ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40 [ 36.096522][ T744] lock_acquire+0xb4/0x3b0 [ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30 [ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0 [ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20 [ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0 [ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70 [ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.105461][ T744] dev_set_rx_mode+0x19/0x30 [ 36.106532][ T744] dev_set_promiscuity+0x36/0x50 [ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50 [ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge] [ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge] [ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50 [ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90 [ 36.116249][ T744] dev_uc_sync+0x70/0x80 [ 36.117244][ T744] dev_uc_add+0x50/0x60 [ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan] [ 36.119470][ T744] __dev_open+0xd6/0x170 [ 36.120470][ T744] __dev_change_flags+0x181/0x1d0 [ 36.121644][ T744] dev_change_flags+0x23/0x60 [ 36.122741][ T744] do_setlink+0x30a/0x11e0 [ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0 [ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0 [ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850 [ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0 [ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60 [ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380 [ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.139164][ T744] rtnl_newlink+0x47/0x70 [ ... ] Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: introduce struct netdev_nested_priv for nested interface ↵	Taehee Yoo
	infrastructure Functions related to nested interface infrastructure such as netdev_walk_all_{ upper \| lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. In the following patch, a new member variable will be added into this struct to fix the lockdep issue. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: add __netdev_upper_dev_unlink()	Taehee Yoo
	The netdev_upper_dev_unlink() has to work differently according to flags. This idea is the same with __netdev_upper_dev_link(). In the following patches, new flags will be added. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	arm64: cpufeature: Export symbol read_sanitised_ftr_reg()	Jean-Philippe Brucker
	The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to share CPU page tables with devices. Allow the driver to be built as module by exporting the read_sanitized_ftr_reg() cpufeature symbol. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200918101852.582559-7-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: mm: Pin down ASIDs for sharing mm with devices	Jean-Philippe Brucker
	To enable address space sharing with the IOMMU, introduce arm64_mm_context_get() and arm64_mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Export the symbols to let the modular SMMUv3 driver use them. Pinning is necessary because a device constantly needs a valid ASID, unlike tasks that only require one when running. Without pinning, we would need to notify the IOMMU when we're about to use a new ASID for a task, and it would get complicated when a new task is assigned a shared ASID. Consider the following scenario with no ASID pinned: 1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1) 2. Task t2 is scheduled on CPUx, gets ASID (1, 2) 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1) We would now have to immediately generate a new ASID for t1, notify the IOMMU, and finally enable task tn. We are holding the lock during all that time, since we can't afford having another CPU trigger a rollover. The IOMMU issues invalidation commands that can take tens of milliseconds. It gets needlessly complicated. All we wanted to do was schedule task tn, that has no business with the IOMMU. By letting the IOMMU pin tasks when needed, we avoid stalling the slow path, and let the pinning fail when we're out of shareable ASIDs. After a rollover, the allocator expects at least one ASID to be available in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS - 1) is the maximum number of ASIDs that can be shared with the IOMMU. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove _sdei_event_unregister()	Gavin Shan
	_sdei_event_unregister() is called by sdei_event_unregister() and sdei_device_freeze(). _sdei_event_unregister() covers the shared and private events, but sdei_device_freeze() only covers the shared events. So the logic to cover the private events isn't needed by sdei_device_freeze(). sdei_event_unregister sdei_device_freeze _sdei_event_unregister sdei_unregister_shared _sdei_event_unregister This removes _sdei_event_unregister(). Its logic is moved to its callers accordingly. This shouldn't cause any logical changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-14-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove _sdei_event_register()	Gavin Shan
	The function _sdei_event_register() is called by sdei_event_register() and sdei_device_thaw() as the following functional call chain shows. _sdei_event_register() covers the shared and private events, but sdei_device_thaw() only covers the shared events. So the logic to cover the private events in _sdei_event_register() isn't needed by sdei_device_thaw(). Similarly, sdei_reregister_event_llocked() covers the shared and private events in the regard of reenablement. The logic to cover the private events isn't needed by sdei_device_thaw() either. sdei_event_register sdei_device_thaw _sdei_event_register sdei_reregister_shared sdei_reregister_event_llocked _sdei_event_register This removes _sdei_event_register() and sdei_reregister_event_llocked(). Their logic is moved to sdei_event_register() and sdei_reregister_shared(). This shouldn't cause any logical changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-13-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Introduce sdei_do_local_call()	Gavin Shan
	During the CPU hotplug, the private events are registered, enabled or unregistered on the specific CPU. It repeats the same steps: initializing cross call argument, make function call on local CPU, check the returned error. This introduces sdei_do_local_call() to cover the first steps. The other benefit is to make CROSSCALL_INIT and struct sdei_crosscall_args are only visible to sdei_do_{cross, local}_call(). Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-12-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Cleanup on cross call function	Gavin Shan
	This applies cleanup on the cross call functions, no functional changes are introduced: * Wrap the code block of CROSSCALL_INIT inside "do { } while (0)" as linux kernel usually does. Otherwise, scripts/checkpatch.pl reports warning regarding this. * Use smp_call_func_t for @fn argument in sdei_do_cross_call() as the function is called on target CPU(s). * Remove unnecessary space before @event in sdei_do_cross_call() Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-11-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove while loop in sdei_event_unregister()	Gavin Shan
	This removes the unnecessary while loop in sdei_event_unregister() because of the following two reasons. This shouldn't cause any functional changes. * The while loop is executed for once, meaning it's not needed in theory. * With the while loop removed, the nested statements can be avoid to make the code a bit cleaner. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200922130423.10173-10-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove while loop in sdei_event_register()	Gavin Shan
	This removes the unnecessary while loop in sdei_event_register() because of the following two reasons. This shouldn't cause any functional changes. * The while loop is executed for once, meaning it's not needed in theory. * With the while loop removed, the nested statements can be avoid to make the code a bit cleaner. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200922130423.10173-9-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove redundant error message in sdei_probe()	Gavin Shan
	This removes the redundant error message in sdei_probe() because the case can be identified from the errno in next error message. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-8-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>