git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-07-29	mlxsw: spectrum: Use different trap group for externally routed packets	Ido Schimmel
	Cited commit mistakenly removed the trap group for externally routed packets (e.g., via the management interface) and grouped locally routed and externally routed packet traps under the same group, thereby subjecting them to the same policer. This can result in problems, for example, when FRR is restarted and suddenly all transient traffic is trapped to the CPU because of a default route through the management interface. Locally routed packets required to re-establish a BGP connection will never reach the CPU and the routing tables will not be re-populated. Fix this by using a different trap group for externally routed packets. Fixes: 8110668ecd9a ("mlxsw: spectrum_trap: Register layer 3 control traps") Reported-by: Alex Veber <alexve@mellanox.com> Tested-by: Alex Veber <alexve@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29	IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE	Mike Marciniszyn
	The lookaside count is improperly initialized to the size of the Receive Queue with the additional +1. In the traces below, the RQ size is 384, so the count was set to 385. The lookaside count is then rarely refreshed. Note the high and incorrect count in the trace below: rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385 rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1 The head,tail indicate there is only one RWQE posted although the count says 385 and we correctly return the element 0. The next call to rvt_get_rwqe with the decremented count: rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9058 wr_id 0 qpn c qpt 2 pid 3018 num_sge 0 head 1 tail 1, count 384 rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1 Note that the RQ is empty (head == tail) yet we return the RWQE at tail 1, which is not valid because of the bogus high count. Best case, the RWQE has never been posted and the rc logic sees an RWQE that is too small (all zeros) and puts the QP into an error state. In the worst case, a server slow at posting receive buffers might fool rvt_get_rwqe() into fetching an old RWQE and corrupt memory. Fix by deleting the faulty initialization code and creating an inline to fetch the posted count and convert all callers to use new inline. Fixes: f592ae3c999f ("IB/rdmavt: Fracture single lock used for posting and processing RWQEs") Link: https://lore.kernel.org/r/20200728183848.22226.29132.stgit@awfm-01.aw.intel.com Reported-by: Zhaojuan Guo <zguo@redhat.com> Cc: <stable@vger.kernel.org> # 5.4.x Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Honggang Li <honli@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29	Merge tag 'drm-fixes-2020-07-29' of git://anongit.freedesktop.org/drm/drm ↵	Linus Torvalds
	into master Pull drm fixes from Dave Airlie: "The nouveau fixes missed the last pull by a few hours, and we had a few arm driver/panel/bridge fixes come in. This is possibly a bit more than I'm comfortable sending at this stage, but I've looked at each patch, the core + nouveau patches fix regressions, and the arm related ones are all around screens turning on and working, and are mostly trivial patches, the line count is mostly in comments. core: - fix possible use-after-free drm_fb_helper: - regression fix to use memcpy_io on bochs' sparc64 nouveau: - format modifiers fixes - HDA regression fix - turing modesetting race fix of: - fix a double free dbi: - fix SPI Type 1 transfer mcde: - fix screen stability crash panel: - panel: fix display noise on auo,kd101n80-45na - panel: delay HPD checks for boe_nv133fhm_n61 bridge: - bridge: drop connector check in nwl-dsi bridge - bridge: set proper bridge type for adv7511" * tag 'drm-fixes-2020-07-29' of git://anongit.freedesktop.org/drm/drm: drm: hold gem reference until object is no longer accessed drm/dbi: Fix SPI Type 1 (9-bit) transfer drm/drm_fb_helper: fix fbdev with sparc64 drm/mcde: Fix stability issue drm/bridge: nwl-dsi: Drop DRM_BRIDGE_ATTACH_NO_CONNECTOR check. drm/panel: Fix auo, kd101n80-45na horizontal noise on edges of panel drm: panel: simple: Delay HPD checking on boe_nv133fhm_n61 for 15 ms drm/bridge/adv7511: set the bridge type properly drm: of: Fix double-free bug drm/nouveau/fbcon: zero-initialise the mode_cmd2 structure drm/nouveau/fbcon: fix module unload when fbcon init has failed for some reason drm/nouveau/kms/tu102: wait for core update to complete when assigning windows drm/nouveau/kms/gf100: use correct format modifiers drm/nouveau/disp/gm200-: fix regression from HDA SOR selection changes
2020-07-29	netfilter: Replace HTTP links with HTTPS ones	Alexander A. Klimov
	Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w\|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-29	RDMA/include: Replace license text with SPDX tags	Leon Romanovsky
	The header files in RDMA subsystem are dual licensed and can be described by simple SPDX tag, so replace all of them at once together with making them use the same coding style for header guard defines. Link: https://lore.kernel.org/r/20200719072521.135260-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29	random32: update the net random state on interrupt and activity	Willy Tarreau
	This modifies the first 32 bits out of the 128 bits of a random CPU's net_rand_state on interrupt or CPU activity to complicate remote observations that could lead to guessing the network RNG's internal state. Note that depending on some network devices' interrupt rate moderation or binding, this re-seeding might happen on every packet or even almost never. In addition, with NOHZ some CPUs might not even get timer interrupts, leaving their local state rarely updated, while they are running networked processes making use of the random state. For this reason, we also perform this update in update_process_times() in order to at least update the state when there is user or system activity, since it's the only case we care about. Reported-by: Amit Klein <aksecurity@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-29	cpuidle: change enter_s2idle() prototype	Neal Liu
	Control Flow Integrity(CFI) is a security mechanism that disallows changes to the original control flow graph of a compiled binary, making it significantly harder to perform such attacks. init_state_node() assign same function callback to different function pointer declarations. static int init_state_node(struct cpuidle_state idle_state, const struct of_device_id matches, struct device_node state_node) { ... idle_state->enter = match_id->data; ... idle_state->enter_s2idle = match_id->data; } Function declarations: struct cpuidle_state { ... int (enter) (struct cpuidle_device dev, struct cpuidle_driver drv, int index); void (enter_s2idle) (struct cpuidle_device dev, struct cpuidle_driver *drv, int index); }; In this case, either enter() or enter_s2idle() would cause CFI check failed since they use same callee. Align function prototype of enter() since it needs return value for some use cases. The return value of enter_s2idle() is no need currently. Signed-off-by: Neal Liu <neal.liu@mediatek.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-29	serial: 8250: Add 8250 port clock update method	Serge Semin
	Some platforms can be designed in a way so the UART port reference clock might be asynchronously changed at some point. In Baikal-T1 SoC this may happen due to the reference clock being shared between two UART ports, on the Allwinner SoC the reference clock is derived from the CPU clock, so any CPU frequency change should get to be known/reflected by/in the UART controller as well. But it's not enough to just update the uart_port->uartclk field of the corresponding UART port, the 8250 controller reference clock divisor should be altered so to preserve current baud rate setting. All of these things is done in a coherent way by calling the serial8250_update_uartclk() method provided in this patch. Though note that it isn't supposed to be called from within the UART port callbacks because the locks using to the protect the UART port data are already taken in there. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Link: https://lore.kernel.org/r/20200723003357.26897-2-Sergey.Semin@baikalelectronics.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29	nvmem: core: add support to auto devid	Srinivas Kandagatla
	For nvmem providers which have multiple instances, it is required to suffix the provider name with proper id, so that they do not confict for the same name. Currently the core does not handle this case properly eventhough core already has logic to generate the id. This patch add new devid type NVMEM_DEVID_AUTO for providers to be able to allow core to assign id and append it to provier name. Reported-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Tested-by: Shawn Guo <shawn.guo@linaro.org> Link: https://lore.kernel.org/r/20200722100705.7772-8-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29	nvmem: core: Add nvmem_cell_read_u8()	Andreas Färber
	Complement the u16, u32 and u64 helpers with a u8 variant to ease accessing byte-sized values. This helper will be useful for Realtek Digital Home Center platforms, which store some byte and sub-byte sized values in non-volatile memory. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200722100705.7772-7-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-29	seqcount: More consistent seqprop names	Peter Zijlstra
	Attempt uniformity and brevity. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-07-29	seqcount: Compress SEQCNT_LOCKNAME_ZERO()	Peter Zijlstra
	Less is more. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-07-29	seqlock: Fold seqcount_LOCKNAME_init() definition	Peter Zijlstra
	Manual repetition is boring and error prone. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-07-29	seqlock: Fold seqcount_LOCKNAME_t definition	Peter Zijlstra
	Manual repetition is boring and error prone. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-07-29	seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g	Peter Zijlstra
	__SEQ_LOCKDEP() is an expression gate for the seqcount_LOCKNAME_t::lock member. Rename it to be about the member, not the gate condition. Later (PREEMPT_RT) patches will make the member available for !LOCKDEP configs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-07-29	hrtimer: Use sequence counter with associated raw spinlock	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_raw_spinlock_t data type, which allows to associate a raw spinlock with the sequence counter. This enables lockdep to verify that the raw spinlock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-25-a.darwish@linutronix.de
2020-07-29	kvm/eventfd: Use sequence counter with associated spinlock	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_spinlock_t data type, which allows to associate a spinlock with the sequence counter. This enables lockdep to verify that the spinlock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20200720155530.1173732-24-a.darwish@linutronix.de
2020-07-29	vfs: Use sequence counter with associated spinlock	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_spinlock_t data type, which allows to associate a spinlock with the sequence counter. This enables lockdep to verify that the spinlock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-19-a.darwish@linutronix.de
2020-07-29	netfilter: conntrack: Use sequence counter with associated spinlock	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_spinlock_t data type, which allows to associate a spinlock with the sequence counter. This enables lockdep to verify that the spinlock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-15-a.darwish@linutronix.de
2020-07-29	sched: tasks: Use sequence counter with associated spinlock	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. A plain seqcount_t does not contain the information of which lock must be held when entering a write side critical section. Use the new seqcount_spinlock_t data type, which allows to associate a spinlock with the sequence counter. This enables lockdep to verify that the spinlock used for writer serialization is held when the write side critical section is entered. If lockdep is disabled this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-14-a.darwish@linutronix.de
2020-07-29	dma-buf: Use sequence counter with associated wound/wait mutex	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. If the serialization primitive is not disabling preemption implicitly, preemption has to be explicitly disabled before entering the sequence counter write side critical section. The dma-buf reservation subsystem uses plain sequence counters to manage updates to reservations. Writer serialization is accomplished through a wound/wait mutex. Acquiring a wound/wait mutex does not disable preemption, so this needs to be done manually before and after the write side critical section. Use the newly-added seqcount_ww_mutex_t instead: - It associates the ww_mutex with the sequence count, which enables lockdep to validate that the write side critical section is properly serialized. - It removes the need to explicitly add preempt_disable/enable() around the write side critical section because the write_begin/end() functions for this new data type automatically do this. If lockdep is disabled this ww_mutex lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lkml.kernel.org/r/20200720155530.1173732-13-a.darwish@linutronix.de
2020-07-29	dma-buf: Remove custom seqcount lockdep class key	Ahmed S. Darwish
	Commit 3c3b177a9369 ("reservation: add support for read-only access using rcu") introduced a sequence counter to manage updates to reservations. Back then, the reservation object initializer reservation_object_init() was always inlined. Having the sequence counter initialization inlined meant that each of the call sites would have a different lockdep class key, which would've broken lockdep's deadlock detection. The aforementioned commit thus introduced, and exported, a custom seqcount lockdep class key and name. The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...") transformed the reservation object initializer to a normal non-inlined C function. seqcount_init(), which automatically defines the seqcount lockdep class key and must be called non-inlined, can now be safely used. Remove the seqcount custom lockdep class key, name, and export. Use seqcount_init() inside the dma reservation object initializer. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lkml.kernel.org/r/20200720155530.1173732-12-a.darwish@linutronix.de
2020-07-29	seqlock: Align multi-line macros newline escapes at 72 columns	Ahmed S. Darwish
	Parent commit, "seqlock: Extend seqcount API with associated locks", introduced a big number of multi-line macros that are newline-escaped at 72 columns. For overall cohesion, align the earlier-existing macros similarly. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-11-a.darwish@linutronix.de
2020-07-29	seqlock: Extend seqcount API with associated locks	Ahmed S. Darwish
	A sequence counter write side critical section must be protected by some form of locking to serialize writers. If the serialization primitive is not disabling preemption implicitly, preemption has to be explicitly disabled before entering the write side critical section. There is no built-in debugging mechanism to verify that the lock used for writer serialization is held and preemption is disabled. Some usage sites like dma-buf have explicit lockdep checks for the writer-side lock, but this covers only a small portion of the sequence counter usage in the kernel. Add new sequence counter types which allows to associate a lock to the sequence counter at initialization time. The seqcount API functions are extended to provide appropriate lockdep assertions depending on the seqcount/lock type. For sequence counters with associated locks that do not implicitly disable preemption, preemption protection is enforced in the sequence counter write side functions. This removes the need to explicitly add preempt_disable/enable() around the write side critical sections: the write_begin/end() functions for these new sequence counter types automatically do this. Introduce the following seqcount types with associated locks: seqcount_spinlock_t seqcount_raw_spinlock_t seqcount_rwlock_t seqcount_mutex_t seqcount_ww_mutex_t Extend the seqcount read and write functions to branch out to the specific seqcount_LOCKTYPE_t implementation at compile-time. This avoids kernel API explosion per each new seqcount_LOCKTYPE_t added. Add such compile-time type detection logic into a new, internal, seqlock header. Document the proper seqcount_LOCKTYPE_t usage, and rationale, at Documentation/locking/seqlock.rst. If lockdep is disabled, this lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-10-a.darwish@linutronix.de
2020-07-29	seqlock: lockdep assert non-preemptibility on seqcount_t write	Ahmed S. Darwish
	Preemption must be disabled before entering a sequence count write side critical section. Failing to do so, the seqcount read side can preempt the write side section and spin for the entire scheduler tick. If that reader belongs to a real-time scheduling class, it can spin forever and the kernel will livelock. Assert through lockdep that preemption is disabled for seqcount writers. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-9-a.darwish@linutronix.de
2020-07-29	lockdep: Add preemption enabled/disabled assertion APIs	Ahmed S. Darwish
	Asserting that preemption is enabled or disabled is a critical sanity check. Developers are usually reluctant to add such a check in a fastpath as reading the preemption count can be costly. Extend the lockdep API with macros asserting that preemption is disabled or enabled. If lockdep is disabled, or if the underlying architecture does not support kernel preemption, this assert has no runtime overhead. References: f54bb2ec02c8 ("locking/lockdep: Add IRQs disabled/enabled assertion APIs: ...") Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-8-a.darwish@linutronix.de
2020-07-29	seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()	Ahmed S. Darwish
	raw_seqcount_begin() has the same code as raw_read_seqcount(), with the exception of masking the sequence counter's LSB before returning it to the caller. Note, raw_seqcount_begin() masks the counter's LSB before returning it to the caller so that read_seqcount_retry() can fail if the counter is odd -- without the overhead of an extra branching instruction. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-7-a.darwish@linutronix.de
2020-07-29	seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs	Ahmed S. Darwish
	seqlock.h is now included by kernel's RST documentation, but a small number of the the exported seqlock.h functions are kernel-doc annotated. Add kernel-doc for all seqlock.h exported APIs. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-6-a.darwish@linutronix.de
2020-07-29	seqlock: Reorder seqcount_t and seqlock_t API definitions	Ahmed S. Darwish
	The seqlock.h seqcount_t and seqlock_t API definitions are presented in the chronological order of their development rather than the order that makes most sense to readers. This makes it hard to follow and understand the header file code. Group and reorder all of the exported seqlock.h functions according to their function. First, group together the seqcount_t standard read path functions: - __read_seqcount_begin() - raw_read_seqcount_begin() - read_seqcount_begin() since each function is implemented exactly in terms of the one above it. Then, group the special-case seqcount_t readers on their own as: - raw_read_seqcount() - raw_seqcount_begin() since the only difference between the two functions is that the second one masks the sequence counter LSB while the first one does not. Note that raw_seqcount_begin() can actually be implemented in terms of raw_read_seqcount(), which will be done in a follow-up commit. Then, group the seqcount_t write path functions, instead of injecting unrelated seqcount_t latch functions between them, and order them as: - raw_write_seqcount_begin() - raw_write_seqcount_end() - write_seqcount_begin_nested() - write_seqcount_begin() - write_seqcount_end() - raw_write_seqcount_barrier() - write_seqcount_invalidate() which is the expected natural order. This also isolates the seqcount_t latch functions into their own area, at the end of the sequence counters section, and before jumping to the next one: sequential locks (seqlock_t). Do a similar grouping and reordering for seqlock_t "locking" readers vs. the "conditionally locking or lockless" ones. No implementation code was changed in any of the reordering above. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-5-a.darwish@linutronix.de
2020-07-29	seqlock: seqcount_t latch: End read sections with read_seqcount_retry()	Ahmed S. Darwish
	The seqcount_t latch reader example at the raw_write_seqcount_latch() kernel-doc comment ends the latch read section with a manual smp memory barrier and sequence counter comparison. This is technically correct, but it is suboptimal: read_seqcount_retry() already contains the same logic of an smp memory barrier and sequence counter comparison. End the latch read critical section example with read_seqcount_retry(). Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-4-a.darwish@linutronix.de
2020-07-29	seqlock: Properly format kernel-doc code samples	Ahmed S. Darwish
	Align the code samples and note sections inside kernel-doc comments with tabs. This way they can be properly parsed and rendered by Sphinx. It also makes the code samples easier to read from text editors. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-3-a.darwish@linutronix.de
2020-07-29	Documentation: locking: Describe seqlock design and usage	Ahmed S. Darwish
	Proper documentation for the design and usage of sequence counters and sequential locks does not exist. Complete the seqlock.h documentation as follows: - Divide all documentation on a seqcount_t vs. seqlock_t basis. The description for both mechanisms was intermingled, which is incorrect since the usage constrains for each type are vastly different. - Add an introductory paragraph describing the internal design of, and rationale for, sequence counters. - Document seqcount_t writer non-preemptibility requirement, which was not previously documented anywhere, and provide a clear rationale. - Provide template code for seqcount_t and seqlock_t initialization and reader/writer critical sections. - Recommend using seqlock_t by default. It implicitly handles the serialization and non-preemptibility requirements of writers. At seqlock.h: - Remove references to brlocks as they've long been removed from the kernel. - Remove references to gcc-3.x since the kernel's minimum supported gcc version is 4.9. References: 0f6ed63b1707 ("no need to keep brlock macros anymore...") References: 6ec4476ac825 ("Raise gcc version requirement to 4.9") Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-2-a.darwish@linutronix.de
2020-07-29	Merge branch 'locking/header'	Peter Zijlstra

2020-07-29	locking/qspinlock: Do not include atomic.h from qspinlock_types.h	Herbert Xu
	This patch breaks a header loop involving qspinlock_types.h. The issue is that qspinlock_types.h includes atomic.h, which then eventually includes kernel.h which could lead back to the original file via spinlock_types.h. As ATOMIC_INIT is now defined by linux/types.h, there is no longer any need to include atomic.h from qspinlock_types.h. This also allows the CONFIG_PARAVIRT hack to be removed since it was trying to prevent exactly this loop. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20200729123316.GC7047@gondor.apana.org.au
2020-07-29	locking/atomic: Move ATOMIC_INIT into linux/types.h	Herbert Xu
	This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h. This allows users of atomic_t to use ATOMIC_INIT without having to include atomic.h as that way may lead to header loops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
2020-07-29	Merge remote-tracking branch 'spi/for-5.9' into spi-next	Mark Brown

2020-07-29	kexec_file: Allow archs to handle special regions while locating memory hole	Hari Bathini
	Some architectures may have special memory regions, within the given memory range, which can't be used for the buffer in a kexec segment. Implement weak arch_kexec_locate_mem_hole() definition which arch code may override, to take care of special regions, while trying to locate a memory hole. Also, add the missing declarations for arch overridable functions and and drop the __weak descriptors in the declarations to avoid non-weak definitions from becoming weak. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: Pingfan Liu <piliu@redhat.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/159602273603.575379.17665852963340380839.stgit@hbathini
2020-07-29	ocxl: Address kernel doc errors & warnings	Alastair D'Silva
	This patch addresses warnings and errors from the kernel doc scripts for the OpenCAPI driver. It also makes minor tweaks to make the docs more consistent. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200415012343.919255-3-alastair@d-silva.org
2020-07-29	ocxl: Remove unnecessary externs	Alastair D'Silva
	Function declarations don't need externs, remove the existing ones so they are consistent with newer code Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200415012343.919255-2-alastair@d-silva.org
2020-07-29	Merge branches 'arm/renesas', 'arm/qcom', 'arm/mediatek', 'arm/omap', ↵	Joerg Roedel
	'arm/exynos', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'core' into next
2020-07-29	RDMA/efa: User/kernel compatibility handshake mechanism	Gal Pressman
	Introduce a mechanism that performs an handshake between the userspace provider and kernel driver which verifies that the user supports all required features in order to operate correctly. The handshake verifies the needed functionality by comparing the reported device caps and the provider caps. If the device reports a non-zero capability the appropriate comp mask is required from the userspace provider in order to allocate the context. Link: https://lore.kernel.org/r/20200722140312.3651-4-galpress@amazon.com Reviewed-by: Shadi Ammouri <sammouri@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29	RDMA/efa: Expose minimum SQ size	Gal Pressman
	The device reports the minimum SQ size required for creation. This patch queries the min SQ size and reports it back to the userspace library. Link: https://lore.kernel.org/r/20200722140312.3651-3-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Shadi Ammouri <sammouri@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29	RDMA/efa: Expose maximum TX doorbell batch	Gal Pressman
	The device reports the maximum number of bytes to be written before ringing the doorbell (zero means unlimited). This patch queries the max batch size and reports it back to the userspace library. Link: https://lore.kernel.org/r/20200722140312.3651-2-galpress@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas JahJah <firasj@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29	Merge tag 'usb-ci-v5.9-rc1' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-next Peter writes: ENDIAN issue fix and one query controller role API is introduced. * tag 'usb-ci-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb: usb: chipidea: imx: get available runtime dr mode for wakeup setting usb: chipidea: add query_available_role interface Documentation: ABI: usb: chipidea: Update Li Jun's e-mail usb: chipidea: udc: fix the ENDIAN issue
2020-07-29	sched/uclamp: Add a new sysctl to control RT default boost value	Qais Yousef
	RT tasks by default run at the highest capacity/performance level. When uclamp is selected this default behavior is retained by enforcing the requested uclamp.min (p->uclamp_req[UCLAMP_MIN]) of the RT tasks to be uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum value. This is also referred to as 'the default boost value of RT tasks'. See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks"). On battery powered devices, it is desired to control this default (currently hardcoded) behavior at runtime to reduce energy consumed by RT tasks. For example, a mobile device manufacturer where big.LITTLE architecture is dominant, the performance of the little cores varies across SoCs, and on high end ones the big cores could be too power hungry. Given the diversity of SoCs, the new knob allows manufactures to tune the best performance/power for RT tasks for the particular hardware they run on. They could opt to further tune the value when the user selects a different power saving mode or when the device is actively charging. The runtime aspect of it further helps in creating a single kernel image that can be run on multiple devices that require different tuning. Keep in mind that a lot of RT tasks in the system are created by the kernel. On Android for instance I can see over 50 RT tasks, only a handful of which created by the Android framework. To control the default behavior globally by system admins and device integrator, introduce the new sysctl_sched_uclamp_util_min_rt_default to change the default boost value of the RT tasks. I anticipate this to be mostly in the form of modifying the init script of a particular device. To avoid polluting the fast path with unnecessary code, the approach taken is to synchronously do the update by traversing all the existing tasks in the system. This could race with a concurrent fork(), which is dealt with by introducing sched_post_fork() function which will ensure the racy fork will get the right update applied. Tested on Juno-r2 in combination with the RT capacity awareness [1]. By default an RT task will go to the highest capacity CPU and run at the maximum frequency, which is particularly energy inefficient on high end mobile devices because the biggest core[s] are 'huge' and power hungry. With this patch the RT task can be controlled to run anywhere by default, and doesn't cause the frequency to be maximum all the time. Yet any task that really needs to be boosted can easily escape this default behavior by modifying its requested uclamp.min value (p->uclamp_req[UCLAMP_MIN]) via sched_setattr() syscall. [1] 804d402fb6f6: ("sched/rt: Make RT capacity-aware") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200716110347.19553-2-qais.yousef@arm.com
2020-07-29	nvmet: add passthru code to process commands	Logan Gunthorpe
	Add passthru command handling capability for the NVMeOF target and export passthru APIs which are used to integrate passthru code with nvmet-core. The new file passthru.c handles passthru cmd parsing and execution. In the passthru mode, we create a block layer request from the nvmet request and map the data on to the block layer request. Admin commands and features are on an allow list as there are a number of each that don't make too much sense with passthrough. We use an allow list such that new commands can be considered before being blindly passed through. In both cases, vendor specific commands are always allowed. We also reject reservation IO commands as the underlying device cannot differentiate between multiple hosts behind a fabric. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29	nvme-fc: drop a duplicated word in a comment	Randy Dunlap
	Drop the repeated word "a" in a comment. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29	Merge tag 'drm-misc-fixes-2020-07-28' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes * drm: fix possible use-after-free * dbi: fix SPI Type 1 transfer * drm_fb_helper: use memcpy_io on bochs' sparc64 * mcde: fix stability * panel: fix display noise on auo,kd101n80-45na * panel: delay HPD checks for boe_nv133fhm_n61 * bridge: drop connector check in nwl-dsi bridge * bridge: set proper bridge type for adv7511 * of: fix a double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200728110446.GA8076@linux-uq9g
2020-07-28	scsi: target: tcmu: Implement tmr_notify callback	Bodo Stroesser
	This patch implements the tmr_notify callback for tcmu. When the callback is called, tcmu checks the list of aborted commands it received as parameter: - aborted commands in the qfull_queue are removed from the queue and target_complete_command is called - from the cmd_ids of aborted commands currently uncompleted in cmd ring it creates a list of aborted cmd_ids. Finally a TMR notification is written to cmd ring containing TMR type and cmd_id list. If there is no space in ring, the TMR notification is queued on a TMR specific queue. The TMR specific queue 'tmr_queue' can be seen as a extension of the cmd ring. At the end of each iexecution of tcmu_complete_commands() we check whether tmr_queue contains TMRs and try to move them onto the ring. If tmr_queue is not empty after that, we don't call run_qfull_queue() because commands must not overtake TMRs. This way we guarantee that cmd_ids in TMR notification received by userspace either match an active, not yet completed command or are no longer valid due to userspace having complete some cmd_ids meanwhile. New commands that were assigned to an aborted cmd_id will always appear on the cmd ring _after_ the TMR. Link: https://lore.kernel.org/r/20200726153510.13077-8-bstroesser@ts.fujitsu.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-28	scsi: target: Add tmr_notify backend function	Bodo Stroesser
	Target core is modified to call an optional backend callback function if a TMR is received or commands are aborted implicitly after a PR command was received. The backend function takes as parameters the se_dev, the type of the TMR, and the list of aborted commands. If no commands were aborted, an empty list is supplied. Link: https://lore.kernel.org/r/20200726153510.13077-3-bstroesser@ts.fujitsu.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>