linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-11-19	Merge branch 'kcsan.2020.11.06a' into HEAD	Paul E. McKenney
	kcsan.2020.11.06a: Kernel concurrency sanitizer (KCSAN) updates.
2020-11-19	Merge branches 'cpuinfo.2020.11.06a', 'doc.2020.11.06a', ↵	Paul E. McKenney
	'fixes.2020.11.19b', 'lockdep.2020.11.02a', 'tasks.2020.11.06a' and 'torture.2020.11.06a' into HEAD cpuinfo.2020.11.06a: Speedups for /proc/cpuinfo. doc.2020.11.06a: Documentation updates. fixes.2020.11.19b: Miscellaneous fixes. lockdep.2020.11.02a: Lockdep-RCU updates to avoid "unused variable". tasks.2020.11.06a: Tasks-RCU updates. torture.2020.11.06a': Torture-test updates.
2020-11-19	srcu: Take early exit on memory-allocation failure	Paul E. McKenney
	It turns out that init_srcu_struct() can be invoked from usermode tasks, and that fatal signals received by these tasks can cause memory-allocation failures. These failures are not handled well by init_srcu_struct(), so much so that NULL pointer dereferences can result. This commit therefore causes init_srcu_struct() to take an early exit upon detection of memory-allocation failure. Link: https://lore.kernel.org/lkml/20200908144306.33355-1-aik@ozlabs.ru/ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu/tree: Defer kvfree_rcu() allocation to a clean context	Uladzislau Rezki (Sony)
	The current memmory-allocation interface causes the following difficulties for kvfree_rcu(): a) If built with CONFIG_PROVE_RAW_LOCK_NESTING, the lockdep will complain about violation of the nesting rules, as in "BUG: Invalid wait context". This Kconfig option checks for proper raw_spinlock vs. spinlock nesting, in particular, it is not legal to acquire a spinlock_t while holding a raw_spinlock_t. This is a problem because kfree_rcu() uses raw_spinlock_t whereas the "page allocator" internally deals with spinlock_t to access to its zones. The code also can be broken from higher level of view: <snip> raw_spin_lock(&some_lock); kfree_rcu(some_pointer, some_field_offset); <snip> b) If built with CONFIG_PREEMPT_RT, spinlock_t is converted into sleeplock. This means that invoking the page allocator from atomic contexts results in "BUG: scheduling while atomic". c) Please note that call_rcu() is already invoked from raw atomic context, so it is only reasonable to expaect that kfree_rcu() and kvfree_rcu() will also be called from atomic raw context. This commit therefore defers page allocation to a clean context using the combination of an hrtimer and a workqueue. The hrtimer stage is required in order to avoid deadlocks with the scheduler. This deferred allocation is required only when kvfree_rcu()'s per-CPU page cache is empty. Link: https://lore.kernel.org/lkml/20200630164543.4mdcf6zb4zfclhln@linutronix.de/ Fixes: 3042f83f19be ("rcu: Support reclaim for head-less object") Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Do not report strict GPs for outgoing CPUs	Paul E. McKenney
	An outgoing CPU is marked offline in a stop-machine handler and most of that CPU's services stop at that point, including IRQ work queues. However, that CPU must take another pass through the scheduler and through a number of CPU-hotplug notifiers, many of which contain RCU readers. In the past, these readers were not a problem because the outgoing CPU has interrupts disabled, so that rcu_read_unlock_special() would not be invoked, and thus RCU would never attempt to queue IRQ work on the outgoing CPU. This changed with the advent of the CONFIG_RCU_STRICT_GRACE_PERIOD Kconfig option, in which rcu_read_unlock_special() is invoked upon exit from almost all RCU read-side critical sections. Worse yet, because interrupts are disabled, rcu_read_unlock_special() cannot immediately report a quiescent state and will therefore attempt to defer this reporting, for example, by queueing IRQ work. Which fails with a splat because the CPU is already marked as being offline. But it turns out that there is no need to report this quiescent state because rcu_report_dead() will do this job shortly after the outgoing CPU makes its final dive into the idle loop. This commit therefore makes rcu_read_unlock_special() refrain from queuing IRQ work onto outgoing CPUs. Fixes: 44bad5b3cca2 ("rcu: Do full report for .need_qs for strict GPs") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Jann Horn <jannh@google.com>
2020-11-19	rcu: Fix a typo in rcu_blocking_is_gp() header comment	Zhouyi Zhou
	This commit fixes a typo in the rcu_blocking_is_gp() function's header comment. Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Prevent lockdep-RCU splats on lock acquisition/release	Paul E. McKenney
	The rcu_cpu_starting() and rcu_report_dead() functions transition the current CPU between online and offline state from an RCU perspective. Unfortunately, this means that the rcu_cpu_starting() function's lock acquisition and the rcu_report_dead() function's lock releases happen while the CPU is offline from an RCU perspective, which can result in lockdep-RCU splats about using RCU from an offline CPU. And this situation can also result in too-short grace periods, especially in guest OSes that are subject to vCPU preemption. This commit therefore uses sequence-count-like synchronization to forgive use of RCU while RCU thinks a CPU is offline across the full extent of the rcu_cpu_starting() and rcu_report_dead() function's lock acquisitions and releases. One approach would have been to use the actual sequence-count primitives provided by the Linux kernel. Unfortunately, the resulting code looks completely broken and wrong, and is likely to result in patches that break RCU in an attempt to address this appearance of broken wrongness. Plus there is no net savings in lines of code, given the additional explicit memory barriers required. Therefore, this sequence count is instead implemented by a new ->ofl_seq field in the rcu_node structure. If this counter's value is an odd number, RCU forgives RCU read-side critical sections on other CPUs covered by the same rcu_node structure, even if those CPUs are offline from an RCU perspective. In addition, if a given leaf rcu_node structure's ->ofl_seq counter value is an odd number, rcu_gp_init() delays starting the grace period until that counter value changes. [ paulmck: Apply Peter Zijlstra feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu/tree: nocb: Avoid raising softirq for offloaded ready-to-execute CBs	Joel Fernandes (Google)
	Testing showed that rcu_pending() can return 1 when offloaded callbacks are ready to execute. This invokes RCU core processing, for example, by raising RCU_SOFTIRQ, eventually resulting in a call to rcu_core(). However, rcu_core() explicitly avoids in any way manipulating offloaded callbacks, which are instead handled by the rcuog and rcuoc kthreads, which work independently of rcu_core(). One exception to this independence is that rcu_core() invokes do_nocb_deferred_wakeup(), however, rcu_pending() also checks rcu_nocb_need_deferred_wakeup() in order to correctly handle this case, invoking rcu_core() when needed. This commit therefore avoids needlessly invoking RCU core processing by checking rcu_segcblist_ready_cbs() only on non-offloaded CPUs. This reduces overhead, for example, by reducing softirq activity. This change passed 30 minute tests of TREE01 through TREE09 each. On TREE08, there is at most 150us from the time that rcu_pending() chose not to invoke RCU core processing to the time when the ready callbacks were invoked by the rcuoc kthread. This provides further evidence that there is no need to invoke rcu_core() for offloaded callbacks that are ready to invoke. Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu,ftrace: Fix ftrace recursion	Peter Zijlstra
	Kim reported that perf-ftrace made his box unhappy. It turns out that commit: ff5c4f5cad33 ("rcu/tree: Mark the idle relevant functions noinstr") removed one too many notrace qualifiers, probably due to there not being a helpful comment. This commit therefore reinstates the notrace and adds a comment to avoid losing it again. [ paulmck: Apply Steven Rostedt's feedback on the comment. ] Fixes: ff5c4f5cad33 ("rcu/tree: Mark the idle relevant functions noinstr") Reported-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu/tree: Make struct kernel_param_ops definitions const	Joe Perches
	These should be const, so make it so. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu/tree: Add a warning if CPU being onlined did not report QS already	Joel Fernandes (Google)
	Currently, rcu_cpu_starting() checks to see if the RCU core expects a quiescent state from the incoming CPU. However, the current interaction between RCU quiescent-state reporting and CPU-hotplug operations should mean that the incoming CPU never needs to report a quiescent state. First, the outgoing CPU reports a quiescent state if needed. Second, the race where the CPU is leaving just as RCU is initializing a new grace period is handled by an explicit check for this condition. Third, the CPU's leaf rcu_node structure's ->lock serializes these checks. This means that if rcu_cpu_starting() ever feels the need to report a quiescent state, then there is a bug somewhere in the CPU hotplug code or the RCU grace-period handling code. This commit therefore adds a WARN_ON_ONCE() to bring that bug to everyone's attention. Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Clarify nocb kthreads naming in RCU_NOCB_CPU config	Neeraj Upadhyay
	This commit clarifies that the "p" and the "s" in the in the RCU_NOCB_CPU config-option description refer to the "x" in the "rcuox/N" kthread name. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> [ paulmck: While in the area, update description and advice. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Fix single-CPU check in rcu_blocking_is_gp()	Neeraj Upadhyay
	Currently, for CONFIG_PREEMPTION=n kernels, rcu_blocking_is_gp() uses num_online_cpus() to determine whether there is only one CPU online. When there is only a single CPU online, the simple fact that synchronize_rcu() could be legally called implies that a full grace period has elapsed. Therefore, in the single-CPU case, synchronize_rcu() simply returns immediately. Unfortunately, num_online_cpus() is unreliable while a CPU-hotplug operation is transitioning to or from single-CPU operation because: 1. num_online_cpus() uses atomic_read(&__num_online_cpus) to locklessly sample the number of online CPUs. The hotplug locks are not held, which means that an incoming CPU can concurrently update this count. This in turn means that an RCU read-side critical section on the incoming CPU might observe updates prior to the grace period, but also that this critical section might extend beyond the end of the optimized synchronize_rcu(). This breaks RCU's fundamental guarantee. 2. In addition, num_online_cpus() does no ordering, thus providing another way that RCU's fundamental guarantee can be broken by the current code. 3. The most probable failure mode happens on outgoing CPUs. The outgoing CPU updates the count of online CPUs in the CPUHP_TEARDOWN_CPU stop-machine handler, which is fine in and of itself due to preemption being disabled at the call to num_online_cpus(). Unfortunately, after that stop-machine handler returns, the CPU takes one last trip through the scheduler (which has RCU readers) and, after the resulting context switch, one final dive into the idle loop. During this time, RCU needs to keep track of two CPUs, but num_online_cpus() will say that there is only one, which in turn means that the surviving CPU will incorrectly ignore the outgoing CPU's RCU read-side critical sections. This problem is illustrated by the following litmus test in which P0() corresponds to synchronize_rcu() and P1() corresponds to the incoming CPU. The herd7 tool confirms that the "exists" clause can be satisfied, thus demonstrating that this breakage can happen according to the Linux kernel memory model. { int x = 0; atomic_t numonline = ATOMIC_INIT(1); } P0(int x, atomic_t numonline) { int r0; WRITE_ONCE(x, 1); r0 = atomic_read(numonline); if (r0 == 1) { smp_mb(); } else { synchronize_rcu(); } WRITE_ONCE(x, 2); } P1(int x, atomic_t numonline) { int r0; int r1; atomic_inc(numonline); smp_mb(); rcu_read_lock(); r0 = READ_ONCE(x); smp_rmb(); r1 = READ_ONCE(x); rcu_read_unlock(); } locations [x;numonline;] exists (1:r0=0 /\ 1:r1=2) It is important to note that these problems arise only when the system is transitioning to or from single-CPU operation. One solution would be to hold the CPU-hotplug locks while sampling num_online_cpus(), which was in fact the intent of the (redundant) preempt_disable() and preempt_enable() surrounding this call to num_online_cpus(). Actually blocking CPU hotplug would not only result in excessive overhead, but would also unnecessarily impede CPU-hotplug operations. This commit therefore follows long-standing RCU tradition by maintaining a separate RCU-specific set of CPU-hotplug books. This separate set of books is implemented by a new ->n_online_cpus field in the rcu_state structure that maintains RCU's count of the online CPUs. This count is incremented early in the CPU-online process, so that the critical transition away from single-CPU operation will occur when there is only a single CPU. Similarly for the critical transition to single-CPU operation, the counter is decremented late in the CPU-offline process, again while there is only a single CPU. Because there is only ever a single CPU when the ->n_online_cpus field undergoes the critical 1->2 and 2->1 transitions, full memory ordering and mutual exclusion is provided implicitly and, better yet, for free. In the case where the CPU is coming online, nothing will happen until the current CPU helps it come online. Therefore, the new CPU will see all accesses prior to the optimized grace period, which means that RCU does not need to further delay this new CPU. In the case where the CPU is going offline, the outgoing CPU is totally out of the picture before the optimized grace period starts, which means that this outgoing CPU cannot see any of the accesses following that grace period. Again, RCU needs no further interaction with the outgoing CPU. This does mean that synchronize_rcu() will unnecessarily do a few grace periods the hard way just before the second CPU comes online and just after the second-to-last CPU goes offline, but it is not worth optimizing this uncommon case. Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Implement rcu_segcblist_is_offloaded() config dependent	Frederic Weisbecker
	This commit simplifies the use of the rcu_segcblist_is_offloaded() API so that its callers no longer need to check the RCU_NOCB_CPU Kconfig option. Note that rcu_segcblist_is_offloaded() is defined in the header file, which means that the generated code should be just as efficient as before. Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	list.h: Update comment to explicitly note circular lists	Asif Rasheed
	The students in the Operating System Lecture Section at the American University of Sharjah were confused by the header comment in include/linux/list.h, which says "Simple doubly linked list implementation". This comment means "simple" as in "not complex", but "simple" is often used in this context to mean "not circular". This commit therefore avoids this ambiguity by explicitly calling out "circular". Signed-off-by: Asif Rasheed <b00073877@aus.edu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Panic after fixed number of stalls	chao
	Some stalls are transient, so that system fully recovers. This commit therefore allows users to configure the number of stalls that must happen in order to trigger kernel panic. Signed-off-by: chao <chao@eero.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	x86/smpboot: Move rcu_cpu_starting() earlier	Paul E. McKenney
	The call to rcu_cpu_starting() in mtrr_ap_init() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows: ============================= WARNING: suspicious RCU usage 5.9.0+ #268 Not tainted ----------------------------- kernel/kprobes.c:300 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0+ #268 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x77/0x97 __is_insn_slot_addr+0x15d/0x170 kernel_text_address+0xba/0xe0 ? get_stack_info+0x22/0xa0 __kernel_text_address+0x9/0x30 show_trace_log_lvl+0x17d/0x380 ? dump_stack+0x77/0x97 dump_stack+0x77/0x97 __lock_acquire+0xdf7/0x1bf0 lock_acquire+0x258/0x3d0 ? vprintk_emit+0x6d/0x2c0 _raw_spin_lock+0x27/0x40 ? vprintk_emit+0x6d/0x2c0 vprintk_emit+0x6d/0x2c0 printk+0x4d/0x69 start_secondary+0x1c/0x100 secondary_startup_64_no_verify+0xb8/0xbb This is avoided by moving the call to rcu_cpu_starting up near the beginning of the start_secondary() function. Note that the raw_smp_processor_id() is required in order to avoid calling into lockdep before RCU has declared the CPU to be watched for readers. Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/ Reported-by: Qian Cai <cai@redhat.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	rcu: Allow rcu_irq_enter_check_tick() from NMI	Peter Zijlstra
	Eugenio managed to tickle #PF from NMI context which resulted in hitting a WARN in RCU through irqentry_enter() -> __rcu_irq_enter_check_tick(). However, this situation is perfectly sane and does not warrant an WARN. The #PF will (necessarily) be atomic and not require messing with the tick state, so early return is correct. This commit therefore removes the WARN. Fixes: aaf2bc50df1f ("rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()") Reported-by: "Eugenio Pérez" <eupm90@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-11-19	scsi: ufs: Fix race between shutdown and runtime resume flow	Stanley Chu
	If UFS host device is in runtime-suspended state while UFS shutdown callback is invoked, UFS device shall be resumed for register accesses. Currently only UFS local runtime resume function will be invoked to wake up the host. This is not enough because if someone triggers runtime resume from block layer, then race may happen between shutdown and runtime resume flow, and finally lead to unlocked register access. To fix this, in ufshcd_shutdown(), use pm_runtime_get_sync() instead of resuming UFS device by ufshcd_runtime_resume() "internally" to let runtime PM framework manage the whole resume flow. Link: https://lore.kernel.org/r/20201119062916.12931-1-stanley.chu@mediatek.com Fixes: 57d104c153d3 ("ufs: add UFS power management support") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-19	drm/i915: Do not call hsw_set_frame_start_delay for dsi	Manasi Navare
	This should fix the boot oops for dsi v2: * Fix indent (Manasi) v3: * Remove redundant condition (Matt Roper) Fixes: 4e3cdb4535e7 ("drm/i915/dp: Master/Slave enable/disable sequence for bigjoiner") Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201119232615.23231-1-manasi.d.navare@intel.com
2020-11-20	Merge tag 'drm-intel-fixes-2020-11-19' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix tgl power gating issue (Rodrigo) - Memory leak fixes (Tvrtko, Chris) - Selftest fixes (Zhang) - Display bpc fix (Ville) - Fix TGL MOCS for PTE tracking (Chris) GVT Fixes: It temporarily disables VFIO edid feature on BXT/APL until its virtual display is really fixed to make it work properly. And fixes for DPCD 1.2 and error return in taking module reference. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201119203417.GA1795798@intel.com
2020-11-20	Merge tag 'drm-misc-fixes-2020-11-19' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes two patches to fix dw-hdmi bind and detection code, and one fix for sun4i shared with arm-soc Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20201119083939.ddj3saipyg5iwvb4@gilmour
2020-11-19	xfs: revert "xfs: fix rmap key and record comparison functions"	Darrick J. Wong
	This reverts commit 6ff646b2ceb0eec916101877f38da0b73e3a5b7f. Your maintainer committed a major braino in the rmap code by adding the attr fork, bmbt, and unwritten extent usage bits into rmap record key comparisons. While XFS uses the usage bits in the rmap records for cross-referencing metadata in xfs_scrub and xfs_repair, it only needs the owner and offset information to distinguish between reverse mappings of the same physical extent into the data fork of a file at multiple offsets. The other bits are not important for key comparisons for index lookups, and never have been. Eric Sandeen reports that this causes regressions in generic/299, so undo this patch before it does more damage. Reported-by: Eric Sandeen <sandeen@sandeen.net> Fixes: 6ff646b2ceb0 ("xfs: fix rmap key and record comparison functions") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-11-19	Merge tag 'net-5.10-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.10-rc5, including fixes from the WiFi (mac80211), can and bpf (including the strncpy_from_user fix). Current release - regressions: - mac80211: fix memory leak of filtered powersave frames - mac80211: free sta in sta_info_insert_finish() on errors to avoid sleeping in atomic context - netlabel: fix an uninitialized variable warning added in -rc4 Previous release - regressions: - vsock: forward all packets to the host when no H2G is registered, un-breaking AWS Nitro Enclaves - net: Exempt multicast addresses from five-second neighbor lifetime requirement, decreasing the chances neighbor tables fill up - net/tls: fix corrupted data in recvmsg - qed: fix ILT configuration of SRC block - can: m_can: process interrupt only when not runtime suspended Previous release - always broken: - page_frag: Recover from memory pressure by not recycling pages allocating from the reserves - strncpy_from_user: Mask out bytes after NUL terminator - ip_tunnels: Set tunnel option flag only when tunnel metadata is present, always setting it confuses Open vSwitch - bpf, sockmap: - Fix partial copy_page_to_iter so progress can still be made - Fix socket memory accounting and obeying SO_RCVBUF - net: Have netpoll bring-up DSA management interface - net: bridge: add missing counters to ndo_get_stats64 callback - tcp: brr: only postpone PROBE_RTT if RTT is < current min_rtt - enetc: Workaround MDIO register access HW bug - net/ncsi: move netlink family registration to a subsystem init, instead of tying it to driver probe - net: ftgmac100: unregister NC-SI when removing driver to avoid crash - lan743x: - prevent interrupt storm on open - fix freeing skbs in the wrong context - net/mlx5e: Fix socket refcount leak on kTLS RX resync - net: dsa: mv88e6xxx: Avoid VLAN database corruption on 6097 - fix 21 unset return codes and other mistakes on error paths, mostly detected by the Hulk Robot" * tag 'net-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (115 commits) fail_function: Remove a redundant mutex unlock selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL lib/strncpy_from_user.c: Mask out bytes after NUL terminator. net/smc: fix direct access to ib_gid_addr->ndev in smc_ib_determine_gid() net/smc: fix matching of existing link groups ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module libbpf: Fix VERSIONED_SYM_COUNT number parsing net/mlx4_core: Fix init_hca fields offset atm: nicstar: Unmap DMA on send error page_frag: Recover from memory pressure net: dsa: mv88e6xxx: Wait for EEPROM done after HW reset mlxsw: core: Use variable timeout for EMAD retries mlxsw: Fix firmware flashing net: Have netpoll bring-up DSA management interface atl1e: fix error return code in atl1e_probe() atl1c: fix error return code in atl1c_probe() ah6: fix error return code in ah6_input() net: usb: qmi_wwan: Set DTR quirk for MR400 can: m_can: process interrupt only when not runtime suspended can: flexcan: flexcan_chip_start(): fix erroneous flexcan_transceiver_enable() during bus-off recovery ...
2020-11-19	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Linus Torvalds
	Pull rdma fixes from Jason Gunthorpe: "The last two weeks have been quiet here, just the usual smattering of long standing bug fixes. A collection of error case bug fixes: - Improper nesting of spinlock types in cm - Missing error codes and kfree() - Ensure dma_virt_ops users have the right kconfig symbols to work properly - Compilation failure of tools/testing" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: tools/testing/scatterlist: Fix test to compile and run IB/hfi1: Fix error return code in hfi1_init_dd() RMDA/sw: Don't allow drivers using dma_virt_ops on highmem configs RDMA/pvrdma: Fix missing kfree() in pvrdma_register_device() RDMA/cm: Make the local_id_table xarray non-irq
2020-11-19	EDAC/i10nm: Add Intel Sapphire Rapids server support	Qiuxu Zhuo
	The Sapphire Rapids CPU model shares the same memory controller architecture with Ice Lake server. There are some configurations different from Ice Lake server as below: - The device ID for configuration agent. - The size for per channel memory-mapped I/O. - The DDR5 memory support. So add the above configurations and the Sapphire Rapids CPU model ID for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	EDAC: Add DDR5 new memory type	Qiuxu Zhuo
	Add a new entry to 'enum mem_type' and a new string to 'edac_mem_types[]' for DDR5 new memory type. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	EDAC/i10nm: Use readl() to access MMIO registers	Qiuxu Zhuo
	Instead of raw access, use readl() to access MMIO registers of memory controller to avoid possible compiler re-ordering. Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors") Cc: <stable@vger.kernel.org> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	MAINTAINERS: Add entry for Intel IGEN6 EDAC driver	Tony Luck
	New driver for "client" system on chip CPUs. Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	EDAC/igen6: Add debugfs interface for Intel client SoC EDAC driver	Qiuxu Zhuo
	Add debugfs support to fake memory correctable errors to test the error reporting path and the error address decoding logic in the igen6_edac driver. Please note that the fake errors are also reported to EDAC core and then the CE counter in EDAC sysfs is also increased. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-17-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-16-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: tmio: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-15-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: txx9ndfmc: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-14-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-13-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-12-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: lpc32xx_slc: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Vladimir Zapolskiy <vz@mleia.com> Cc: Sylvain Lemieux <slemieux.tyco@gmail.com>
2020-11-19	mtd: rawnand: lpc32xx_mlc: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Vladimir Zapolskiy <vz@mleia.com> Cc: Sylvain Lemieux <slemieux.tyco@gmail.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-10-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: fsmc: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2020-11-19	mtd: rawnand: diskonchip: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-8-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: davinci: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-7-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: cs553x: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2020-11-19	EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC	Qiuxu Zhuo
	This driver supports Intel client SoC with integrated memory controller using In-Band ECC(IBECC). The memory correctable and uncorrectable errors are reported via NMIs. The driver handles the NMIs and decodes the memory error address to platform specific address. The first IBECC-supported SoC is Elkhart Lake. [Tony: s/#include <linux/nmi.h>/#include <asm/nmi.h>/ to fix randconfig build] Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2020-11-19	ext4: drop fast_commit from /proc/mounts	Theodore Ts'o
	The options in /proc/mounts must be valid mount options --- and fast_commit is not a mount option. Otherwise, command sequences like this will fail: # mount /dev/vdc /vdc # mkdir -p /vdc/phoronix_test_suite /pts # mount --bind /vdc/phoronix_test_suite /pts # mount -o remount,nodioread_nolock /pts mount: /pts: mount point not mounted or bad option. And in the system logs, you'll find: EXT4-fs (vdc): Unrecognized mount option "fast_commit" or missing value Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-19	mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-5-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(). Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-4-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(), a NAND controller hook. Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-3-miquel.raynal@bootlin.com
2020-11-19	mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()	Miquel Raynal
	The probe function is only supposed to initialize the controller hardware but not the ECC engine. Indeed, we don't know anything about the NAND chip(s) at this stage. Let's move the logic initializing the ECC engine, even pretty simple, to the ->attach_chip() hook which gets called during nand_scan() routine, after the NAND chip discovery. As the previously mentioned logic is supposed to parse the DT for us, it is likely that the chip->ecc.* entries be overwritten. So let's avoid this by moving these lines to ->attach_chip(), a NAND controller hook. Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/linux-mtd/20201113123424.32233-2-miquel.raynal@bootlin.com
2020-11-19	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Jakub Kicinski
	Alexei Starovoitov says: ==================== 1) libbpf should not attempt to load unused subprogs, from Andrii. 2) Make strncpy_from_user() mask out bytes after NUL terminator, from Daniel. 3) Relax return code check for subprograms in the BPF verifier, from Dmitrii. 4) Fix several sockmap issues, from John. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: fail_function: Remove a redundant mutex unlock selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL lib/strncpy_from_user.c: Mask out bytes after NUL terminator. libbpf: Fix VERSIONED_SYM_COUNT number parsing bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list bpf, sockmap: Handle memory acct if skb_verdict prog redirects to self bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self bpf, sockmap: Use truesize with sk_rmem_schedule() bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made selftests/bpf: Fix error return code in run_getsockopt_test() bpf: Relax return code check for subprograms tools, bpftool: Add missing close before bpftool net attach exit MAINTAINERS/bpf: Update Andrii's entry. selftests/bpf: Fix unused attribute usage in subprogs_unused test bpf: Fix unsigned 'datasec_id' compared with zero in check_pseudo_btf_id bpf: Fix passing zero to PTR_ERR() in bpf_btf_printf_prepare libbpf: Don't attempt to load unused subprog as an entry-point BPF program ==================== Link: https://lore.kernel.org/r/20201119200721.288-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19	drm/i915/gt: Fixup tgl mocs for PTE tracking	Chris Wilson
	Forcing mocs:1 [used for our winsys follows-pte mode] to be cached caused display glitches. Though it is documented as deprecated (and so likely behaves as uncached) use the follow-pte bit and force it out of L3 cache. Testcase: igt/kms_frontbuffer_tracking Testcase: igt/kms_big_fb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk (cherry picked from commit a04ac827366594c7244f60e9be79fcb404af69f0) Fixes: 849c0fe9e831 ("drm/i915/gt: Initialize reserved and unspecified MOCS indices") Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo: Updated Fixes tag]