git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-06-08	doc: Take tail recursion into account in RCU requirements	Paul E. McKenney
	This commit classifies tail recursion as an alternative way to write a loop, with similar limitations. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Document auto-expediting requirement	Paul E. McKenney
	This commit documents the auto-expediting requirement satisfied by commits 2da4b2a7fd8d ("srcu: Expedite first synchronize_srcu() when idle") and 22607d66bbc3 ("srcu: Specify auto-expedite holdoff time"). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Add "git diff" output to testid.txt file	Paul E. McKenney
	Currently, when running from a git archive, the testid.txt file contains only the branch name, the output of "git status", and the SHA-1 of the current HEAD. This is useful, but does not uniquely identify the source code that was built. This commit therefore adds the output of "git diff HEAD", which means that if two testid.txt files compare equal, they correspond to exactly the same source code. Give or take the possibility of SHA-1 collisions, that is. ;-) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Add writer_holdoff boot parameter	Paul E. McKenney
	This commit adds a writer_holdoff boot parameter to rcuperf, which is intended to be used to test Tree SRCU's auto-expediting. This boot parameter is in microseconds, and defaults to zero (that is, disabled). Set it to a bit larger than srcutree.exp_holdoff, keeping the nanosecond/microsecond conversion, to force Tree SRCU to auto-expedite more aggressively. This commit also adds documentation for this parameter, and fixes some alphabetization while in the neighborhood. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu-cbmc: Use /usr/bin/awk instead of /bin/awk	Priyalee Kushwaha
	Most OS distribution have awk in /usr/bin not in /bin Without this patch, kernel-devsrc fails to build as runtime dependency for srcu-cbmc script /bin/awk is not found. Signed-off-by: Kushwaha, Priyalee <priyalee.kushwaha@intel.com> Acked-by: Lance Roy <ldr709@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Set more user-friendly defaults	Paul E. McKenney
	Common-case use of rcuperf must set rcuperf.nreaders=0 and if not built as a module, rcuperf.shutdown. This commit therefore sets the default for rcuperf.nreaders to zero and sets the default for rcuperf.shutdown to zero if rcuperf is built as a module and to one otherwise. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Shrink Tiny SRCU a bit more	Paul E. McKenney
	This commit rearranges Tiny SRCU's srcu_struct structure, substitutes u8 for bool, and shrinks counters down to short. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Reduce CPUs dedicated to testing Classic SRCU	Paul E. McKenney
	Given that the plan is to retire Classic SRCU in the near future, this commit reduces the number of CPUs dedicated to testing Classic SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Make Classic and Tree SRCU announce themselves at bootup	Paul E. McKenney
	Currently, the only way to tell whether a given kernel is running Classic, Tiny, or Tree SRCU is to look at the .config file, which can easily be lost or associated with the wrong kernel. This commit therefore has Classic and Tree SRCU identify themselves at boot time. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Add the ability to test tiny RCU flavors	Paul E. McKenney
	This commit adds a TINY rcuperf test scenario, which allows performance testing of Tiny RCU and Tiny SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	docs: Fix typo in Documentation/memory-barriers.txt	Stan Drozd
	This commit changes "architecure" to the correct spelling, "architecture". Signed-off-by: Stan Drozd <drozdziak1@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	atomics: Add header comment so spin_unlock_wait()	Paul E. McKenney
	There is material describing the ordering guarantees provided by spin_unlock_wait(), but it is not necessarily easy to find. This commit therefore adds a docbook header comment to this function informally describing its semantics. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org>
2017-06-08	doc/atomic_ops: Clarify smp_mb__{before,after}_atomic()	Paul E. McKenney
	This commit explicitly states that surrounding a non-value-returning atomic read-modify atomic operations provides full ordering, just as is provided by value-returning atomic read-modify-write operations. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Add test for dynamically initialized srcu_struct	Paul E. McKenney
	This commit adds a perf_type of "srcud", which species that rcuperf test SRCU on a dynamically initialized srcu_struct. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	checkpatch: Remove checks for expedited grace periods	Paul E. McKenney
	There was a time when the expedited grace-period primitives (synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(), and synchronize_sched_expedited()) used rather antisocial kernel facilities like try_stop_cpus(). However, they have since been housebroken to use only single-CPU IPIs, and typically cause less disturbance than a scheduling-clock interrupt. Furthermore, this disturbance can be eliminated entirely using NO_HZ_FULL on the one hand or the rcupdate.rcu_normal boot parameter on the other. This commit therefore removes checkpatch's complaints about use of the expedited RCU primitives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcu: Make sync_rcu_preempt_exp_done() return bool	Paul E. McKenney
	The sync_rcu_preempt_exp_done() function returns a logical expression, but its return type is nevertheless int. This commit therefore changes the return type to bool. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Add a Kconfig-fragment file for Classic SRCU	Paul E. McKenney
	This commit adds a Kconfig-fragment file for Classic SRCU to ease performance comparisons with Tree SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Add ability to performance-test call_rcu() and friends	Paul E. McKenney
	This commit upgrades rcuperf so that it can do performance testing on asynchronous grace-period primitives such as call_srcu(). There is a new rcuperf.gp_async module parameter that specifies this new behavior, with the pre-existing rcuperf.gp_exp testing expedited grace periods such as synchronize_rcu_expedited, and with the default being to test synchronous non-expedited grace periods such as synchronize_rcu(). There is also a new rcuperf.gp_async_max module parameter that specifies the maximum number of outstanding callbacks per writer kthread, defaulting to 1,000. When this limit is exceeded, the writer thread invokes the appropriate flavor of rcu_barrier() to wait for callbacks to drain. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Removed the redundant initialization noted by Arnd Bergmann. ]
2017-06-08	rcu: Remove obsolete reference to synchronize_kernel()	Paul E. McKenney
	The synchronize_kernel() primitive was removed in favor of synchronize_sched() more than a decade ago, and it seems likely that rather few kernel hackers are familiar with it. Its continued presence is therefore providing more confusion than enlightenment. This commit therefore removes the reference from the synchronize_sched() header comment, and adds the corresponding information to the synchronize_rcu(0 header comment. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Remove conflicting Kconfig options	Paul E. McKenney
	The TREE and TREE54 rcuperf scenarios' Kconfig fragment files specified conflicting values for CONFIG_RCU_TRACE. This commit therefore removes the =n line in favor of the =y line. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcuperf: Defer expedited/normal check to end of test	Paul E. McKenney
	Current rcuperf startup checks to see if the user asked to measure only expedited grace periods, yet constrained all grace periods to be normal, or if the user asked to measure only normal grace periods, yet constrained all grace periods to be expedited. Useless tests of this sort are aborted. Unfortunately, making RCU work through the mid-boot dead zone [1] puts RCU into expedited-only mode during that zone. Which happens to also be the exact time that rcuperf carries out the aforementioned check. So if the user asks rcuperf to measure only normal grace periods (the default), rcuperf will now always complain and terminate the test. This commit therefore moves the checks to rcu_perf_cleanup(). This has the disadvantage of failing to abort useless tests, but avoids the need to create yet another kthread and the need to do fiddly checks involving the holdoff time. (Yes, another approach is to do the checks in a late-stage init function, but that would require some way to communicate badness to rcuperf's kthreads, and seems not worth the bother.) [1] https://lwn.net/Articles/716148/ Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcu: Complain if blocking in preemptible RCU read-side critical section	Paul E. McKenney
	Although preemptible RCU allows its read-side critical sections to be preempted, general blocking is forbidden. The reason for this is that excessive preemption times can be handled by CONFIG_RCU_BOOST=y, but a voluntarily blocked task doesn't care how high you boost its priority. Because preemptible RCU is a global mechanism, one ill-behaved reader hurts everyone. Hence the prohibition against general blocking in RCU-preempt read-side critical sections. Preemption yes, blocking no. This commit enforces this prohibition. There is a special exception for the -rt patchset (which they kindly volunteered to implement): It is OK to block (as opposed to merely being preempted) within an RCU-preempt read-side critical section, but only if the blocking is subject to priority inheritance. This exception permits CONFIG_RCU_BOOST=y to get -rt RCU readers out of trouble. Why doesn't this exception also apply to mainline's rt_mutex? Because of the possibility that someone does general blocking while holding an rt_mutex. Yes, the priority boosting will affect the rt_mutex, but it won't help with the task doing general blocking while holding that rt_mutex. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Eliminate possibility of destructive counter overflow	Paul E. McKenney
	Earlier versions of Tree SRCU were subject to a counter overflow bug that could theoretically result in too-short grace periods. This commit eliminates this problem by adding an update-side memory barrier. The short explanation is that if the updater sums the unlock counts too late to see a given __srcu_read_unlock() increment, that CPU's next __srcu_read_lock() must see the new value of ->srcu_idx, thus incrementing the other bank of counters. This eliminates the possibility of destructive counter overflow as long as the srcu_read_lock() nesting level does not exceed floor(ULONG_MAX/NR_CPUS/2), which should be an eminently reasonable nesting limit, especially on 64-bit systems. Reported-by: Lance Roy <ldr709@gmail.com> Suggested-by: Lance Roy <ldr709@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Update test scenarios based on new Kconfig dependencies	Paul E. McKenney
	A number of the rcutorture test scenarios were not using the desired Kconfig options because dependencies were preventing the selections in the Kconfig-fragment files from being honored. This commit therefore updates the Kconfig-fragment files to account for these changes in dependencies. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Correctly handle CONFIG_RCU_TORTURE_TEST_* options	Paul E. McKenney
	The rcutorture scripting handles the CONFIG__TORTURE_TEST Kconfig options specially, and therefore greps them out of the Kconfig-fragment files. Unfortunately, a poor choice of grep pattern means that the CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP, CONFIG_RCU_TORTURE_TEST_SLOW_INIT, and CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT Kconfig options are also grepped out, preventing rcutorture from using them. This commit therefore fixes the offending grep pattern to focus only on the CONFIG__TORTURE_TEST Kconfig options. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcu: Prevent rcu_barrier() from starting needless grace periods	Paul E. McKenney
	Currently rcu_barrier() uses call_rcu() to enqueue new callbacks on each CPU with a non-empty callback list. This works, but means that rcu_barrier() forces grace periods that are not otherwise needed. The key point is that rcu_barrier() never needs to wait for a grace period, but instead only for all pre-existing callbacks to be invoked. This means that rcu_barrier()'s new callbacks should be placed in the callback-list segment containing the last pre-existing callback. This commit makes this change using the new rcu_segcblist_entrain() function. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Add a scenario for Classic SRCU	Paul E. McKenney
	A robust combination of paranoia and cowardice has resulted in retaining Classic SRCU (CONFIG_CLASSIC_SRCU) as a backup for the shiny new Tiny and Tree SRCU implementations. If it is to be a viable backup, it of course needs to be tested. This commit therefore adds an rcutorture scenario named SRCU-C for Classic SRCU. This commit also adds this scenario to the set that are run by default. Once sufficient good experience has accumulated for Tiny and Tree SRCU, this test will be removed, along with the Classic SRCU implementation itself. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Add a scenario for Tiny SRCU	Paul E. McKenney
	This commit adds an SRCU-t rcutorture scenario for the new Tiny SRCU implementation, removing the need to pass the --bootargs parameter to kvm.sh to run Tiny SRCU tests. This commit also adds SRCU-t to the set of scenarios that are run by default. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Fix bug in reporting Kconfig mis-settings	Paul E. McKenney
	Kconfig "select" clauses can defeat Kconfig-fragment file attempts to clear a given Kconfig variable, and dependencies can defeat attempts to set a given Kconfig variable. Because "select" clauses and dependencies can be added at any time, there needs to be a way to verify that the Kconfig-fragment file's requests were honored. And there is, except that it is buggy. This commit therefore provides the needed fix. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Add three-level tree test for Tree SRCU	Paul E. McKenney
	This commit adds a test for a three-level srcu_node tree for Tree SRCU in the existing SRCU-P scenario. This requires enabling CONFIG_RCU_EXPERT, so the CONFIG_RCU_EXPERT=n scenario is now SRCU-N. The reason for using SRCU-P for the tall tree is that preemption raises the possibility of locating more bugs than does the non-preemptive SRCU-N. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Allow use of Classic SRCU from both process and interrupt context	Paolo Bonzini
	Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit 5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Cc: stable@vger.kernel.org Fixes: 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	rcutorture: Add lockdep to one of the SRCU scenarios	Paul E. McKenney
	Back when SRCU was simpler, there wasn't much need for lockdep. However, with Tree SRCU, it is needed. This commit therefore adds CONFIG_PROVE_LOCKING to the SRCU-P scenario. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08	srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context	Paolo Bonzini
	Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit 5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Unlike Classic and Tree SRCU, Tiny SRCU does increments and decrements on a single variable. Therefore, as Peter Zijlstra pointed out, Tiny SRCU's implementation already supports mixed-context use of srcu_read_lock() and srcu_read_unlock(), at least as long as uses of srcu_read_lock() and srcu_read_unlock() in each handler are nested and paired properly. In other words, it is still illegal to (say) invoke srcu_read_lock() in an interrupt handler and to invoke the matching srcu_read_unlock() in a softirq handler. Therefore, the only change required for Tiny SRCU is to its comments. Fixes: 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-08	xfs: fix spurious spin_is_locked() assert failures on non-smp kernels	Brian Foster
	The 0-day kernel test robot reports assertion failures on !CONFIG_SMP kernels due to failed spin_is_locked() checks. As it turns out, spin_is_locked() is hardcoded to return zero on !CONFIG_SMP kernels and so this function cannot be relied on to verify spinlock state in this configuration. To avoid this problem, replace the associated asserts with lockdep variants that do the right thing regardless of kernel configuration. Drop the one assert that checks for an unlocked lock as there is no suitable lockdep variant for that case. This moves the spinlock checks from XFS debug code to lockdep, but generally provides the same level of protection. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-06-08	net: ipv6: Release route when device is unregistering	David Ahern
	Roopa reported attempts to delete a bond device that is referenced in a multipath route is hanging: $ ifdown bond2 # ifupdown2 command that deletes virtual devices unregister_netdevice: waiting for bond2 to become free. Usage count = 2 Steps to reproduce: echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown ip link add dev bond12 type bond ip link add dev bond13 type bond ip addr add 2001:db8:2::0/64 dev bond12 ip addr add 2001:db8:3::0/64 dev bond13 ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2 ip link del dev bond12 ip link del dev bond13 The root cause is the recent change to keep routes on a linkdown. Update the check to detect when the device is unregistering and release the route for that case. Fixes: a1a22c12060e4 ("net: ipv6: Keep nexthop of multipath route on admin down") Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	net: Zero ifla_vf_info in rtnl_fill_vfinfo()	Mintz, Yuval
	Some of the structure's fields are not initialized by the rtnetlink. If driver doesn't set those in ndo_get_vf_config(), they'd leak memory to user. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb	Mateusz Jurczyk
	Verify that the length of the socket buffer is sufficient to cover the nlmsghdr structure before accessing the nlh->nlmsg_len field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	Revert "decnet: dn_rtmsg: Improve input length sanitization in ↵	David S. Miller
	dnrmg_receive_user_skb" This reverts commit 85eac2ba35a2dbfbdd5767c7447a4af07444a5b4. There is an updated version of this fix which we should use instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	net: emac: fix and unify emac_mdio functions	Christian Lamparter
	emac_mdio_read_link() was not copying the requested phy settings back into the emac driver's own phy api. This has caused a link speed mismatch issue for the AR8035 as the emac driver kept trying to connect with 10/100MBps on a 1GBit/s link. This patch also unifies shared code between emac_setup_aneg() and emac_mdio_setup_forced(). And furthermore it removes a chunk of emac_mdio_init_phy(), that was copying the same data into itself. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	net: emac: fix reset timeout with AR8035 phy	Christian Lamparter
	This patch fixes a problem where the AR8035 PHY can't be detected on an Cisco Meraki MR24, if the ethernet cable is not connected on boot. Russell Senior provided steps to reproduce the issue: \|Disconnect ethernet cable, apply power, wait until device has booted, \|plug in ethernet, check for interfaces, no eth0 is listed. \| \|This appears to be a problem during probing of the AR8035 Phy chip. \|When ethernet has no link, the phy detection fails, and eth0 is not \|created. Plugging ethernet later has no effect, because there is no \|interface as far as the kernel is concerned. The relevant part of \|the boot log looks like this: \|this is the failing case: \| \|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode \|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout \|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY! \|and the succeeding case: \| \|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode \|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:.. \|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01) Based on the comment and the commit message of commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT"). This is because the AR8035 PHY doesn't provide the TX Clock, if the ethernet cable is not attached. This causes the reset to timeout and the PHY detection code in emac_init_phy() is unable to detect the AR8035 PHY. As a result, the emac driver bails out early and the user left with no ethernet. In order to stay compatible with existing configurations, the driver tries the current reset approach at first. Only if the first attempt timed out, it does perform one more retry with the clock temporarily switched to the internal source for just the duration of the reset. LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687> Cc: Chris Blake <chrisrblake93@gmail.com> Reported-by: Russell Senior <russell@personaltelco.net> Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT") Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	loop: support 4k physical blocksize	Hannes Reinecke
	When generating bootable VM images certain systems (most notably s390x) require devices with 4k blocksize. This patch implements a new flag 'LO_FLAGS_BLOCKSIZE' which will set the physical blocksize to that of the underlying device, and allow to change the logical blocksize for up to the physical blocksize. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-08	loop: Remove unused 'bdev' argument from loop_set_capacity	Hannes Reinecke
	Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-08	decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb	Mateusz Jurczyk
	Verify that the length of the socket buffer is sufficient to cover the entire nlh->nlmsg_len field before accessing that field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	Merge tag 'kvm-s390-master-4.12-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix for master (4.12) - The newly created AIS capability enables the feature unconditionally and ignores the cpu model
2017-06-08	Merge branch 'nvme-4.12' of git://git.infradead.org/nvme into for-linus	Jens Axboe
	Christoph writes: "A few NVMe fixes for 4.12-rc, PCIe reset fixes and APST fixes, a RDMA reconnect fix, two FC fixes and a general controller removal fix."
2017-06-08	hsi: Fix build regression due to netdev destructor fix.	David S. Miller
	> ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor' Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	net: s390: fix up for "Fix inconsistent teardown and release of private ↵	Stephen Rothwell
	netdev state" Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08	drm/i915: fix warning for unused variable	Jani Nikula
	drivers/gpu/drm/i915/intel_engine_cs.c: In function ‘intel_engine_is_idle’: drivers/gpu/drm/i915/intel_engine_cs.c:1103:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable] struct drm_i915_private *dev_priv = engine->i915; ^~~~~~~~ Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-06-08	Fix loop device flush before configure v3	James Wang
	While installing SLES-12 (based on v4.4), I found that the installer will stall for 60+ seconds during LVM disk scan. The root cause was determined to be the removal of a bound device check in loop_flush() by commit b5dd2f6047ca ("block: loop: improve performance via blk-mq"). Restoring this check, examining ->lo_state as set by loop_set_fd() eliminates the bad behavior. Test method: modprobe loop max_loop=64 dd if=/dev/zero of=disk bs=512 count=200K for((i=0;i<4;i++))do losetup -f disk; done mkfs.ext4 -F /dev/loop0 for((i=0;i<4;i++))do mkdir t$i; mount /dev/loop$i t$i;done for f in `ls /dev/loop[0-9]*\|sort`; do \ echo $f; dd if=$f of=/dev/null bs=512 count=1; \ done Test output: stock patched /dev/loop0 18.1217e-05 8.3842e-05 /dev/loop1 6.1114e-05 0.000147979 /dev/loop10 0.414701 0.000116564 /dev/loop11 0.7474 6.7942e-05 /dev/loop12 0.747986 8.9082e-05 /dev/loop13 0.746532 7.4799e-05 /dev/loop14 0.480041 9.3926e-05 /dev/loop15 1.26453 7.2522e-05 Note that from loop10 onward, the device is not mounted, yet the stock kernel consumes several orders of magnitude more wall time than it does for a mounted device. (Thanks for Mike Galbraith <efault@gmx.de>, give a changelog review.) Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: James Wang <jnwang@suse.com> Fixes: b5dd2f6047ca ("block: loop: improve performance via blk-mq") Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-08	s390: update defconfig	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>