summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-06-10sock_diag: fix filter code sent to userspaceNicolas Dichtel
Filters need to be translated to real BPF code for userland, like SO_GETFILTER. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Merge tag 'renesas-dt-for-v3.11' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/dt From Simon Horman: Renesas ARM-based SoC DT updates for v3.11 * Armadillo800eva reference DT - bring up armadillo800eva baord using DT as much as possible * Remove unused GIC dtsi entries for r8a7790 and r8a73a4 * Add AS3711 and CPUFreq DT bindings for kzm9g-reference * Add irqpin DT support for marzen-reference * tag 'renesas-dt-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: marzen-reference: add irqpin support in DT ARM: shmobile: kzm9g-reference: add AS3711 and CPUFreq DT bindings ARM: shmobile: armadillo800eva: Reference DT implementation ARM: shmobile: Remove unused r8a7790 GIC CPU interface DT bits ARM: shmobile: Remove unused r8a73a4 GIC CPU interface DT bits ARM: shmobile: r8a7740: Prepare for reference DT setup ARM: shmobile: r8a7740: Add OF support to initialze the GIC Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-10Merge tag 'renesas-defconfig-for-v3.11' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/boards From Simon Horman: Renesas ARM based SoC defconfig updates for v3.11 kzm9g: Enable AS3711 PMIC bockw: Enable, USB, PM_RUNTIME, I2C and SDHI armadillo800eva: Correct SERIAL_SH_SCI_NR_UARTS * tag 'renesas-defconfig-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: kzm9g: enable AS3711 PMIC in defconfig ARM: shmobile: bockw: enable USB in defconfig ARM: shmobile: bockw: enable CONFIG_PM_RUNTIME in defconfig ARM: shmobile: bockw: enable I2C in defconfig ARM: shmobile: bockw: enable SDHI on defconfig ARM: shmobile: armadillo800eva: Fix maximum number of SCIF Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-10Merge tag 'pcie_kw-3.11-2' of git://git.infradead.org/users/jcooper/linux ↵Olof Johansson
into next/soc From Jason Cooper: mvebu pcie driver (kirkwood) for v3.11 (round 2) - kirkwood - migrate Netgear ReadyNAS Duo v2 to pcie DT init * tag 'pcie_kw-3.11-2' of git://git.infradead.org/users/jcooper/linux: arm: kirkwood: NETGEAR ReadyNAS Duo v2 init PCIe via DT Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-10Merge tag 'dt-3.11-4' of git://git.infradead.org/users/jcooper/linux into ↵Olof Johansson
next/dt From Jason Cooper: mvebu dt changes for v3.11 (round 4) - kirkwood - reshuffle nodes from kirkwood.dtsi to -6281.dtsi, etc - add i2c-gpio for km_kirkwood - add cpu node so pending cpufreq driver will init * tag 'dt-3.11-4' of git://git.infradead.org/users/jcooper/linux: ARM: Kirkwood add cpus definition needed by cpufreq driver to dtsi ARM: kirkwood: add i2c-gpio controller for km_kirkwood ARM: kirkwood: refactor dtsi to largest common nodes Signed-off-by: Olof Johansson <olof@lixom.net>
2013-06-10Input: add missing dependencies on CONFIG_HAS_IOMEMBen Hutchings
Several drivers don't build on s390 with CONFIG_PCI disabled as they require MMIO functions. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-06-10Fix lockup related to stop_machine being stuck in __do_softirq.Ben Greear
The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck in __do_softirq. The reason __do_softirq can hang is that it has a bail-out based on jiffies timeout, but in the lockup case, jiffies itself is not incremented. To work around this, re-add the max_restart counter in __do_irq and stop processing irqs after 10 restarts. Thanks to Tejun Heo and Rusty Russell and others for helping me track this down. This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce latencies"). It may be worth looking into ath9k to see if it has issues with its irq handler at a later date. The hang stack traces look something like this: ------------[ cut here ]------------ WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7() Watchdog detected hard LOCKUP on cpu 2 Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11 Call Trace: <NMI> warn_slowpath_common+0x85/0x9f warn_slowpath_fmt+0x46/0x48 watchdog_overflow_callback+0x9c/0xa7 __perf_event_overflow+0x137/0x1cb perf_event_overflow+0x14/0x16 intel_pmu_handle_irq+0x2dc/0x359 perf_event_nmi_handler+0x19/0x1b nmi_handle+0x7f/0xc2 do_nmi+0xbc/0x304 end_repeat_nmi+0x1e/0x2e <<EOE>> cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 ---[ end trace 4947dfa9b0a4cec3 ]--- BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17] Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc] irq event stamp: 835637905 hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257 hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80 softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257 softirqs last disabled at (5654725): irq_exit+0x5f/0xbb CPU 1 Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M. RIP: tasklet_hi_action+0xf0/0xf0 Process migration/1 Call Trace: <IRQ> __do_softirq+0x117/0x257 irq_exit+0x5f/0xbb smp_apic_timer_interrupt+0x8a/0x98 apic_timer_interrupt+0x72/0x80 <EOI> printk+0x4d/0x4f stop_machine_cpu_stop+0x22c/0x274 cpu_stopper_thread+0xae/0x162 smpboot_thread_fn+0x258/0x260 kthread+0xc7/0xcf ret_from_fork+0x7c/0xb0 Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Pekka Riikonen <priikone@iki.fi> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-10Merge tag '9p-3.10-bug-fix-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull net/9p bug fix from Eric Van Hensbergen: "zero copy error fix" * tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: Handle error in zero copy request correctly for 9p2000.u
2013-06-11netfilter: xt_TCPOPTSTRIP: don't use tcp_hdr()Pablo Neira Ayuso
In (bc6bcb5 netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary), the use of tcp_hdr was introduced. However, we cannot assume that skb->transport_header is set for non-local packets. Cc: Florian Westphal <fw@strlen.de> Reported-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-11Merge branch 'gma500-fixes' of git://github.com/patjak/drm-gma500 into drm-fixesDave Airlie
Patrik writes: Two fixes for memory leaks split into Cedarview and Poulsbo versions, and a fix for properly setting the pipe base when using fbdev. It's on my todo-list to start unifying the chips since they are very similar, but until then I'd like to split them up in case there are side-effects on Cedarview that I cannot currently test. airled: Verified pull from github matches what I expected. * 'gma500-fixes' of git://github.com/patjak/drm-gma500: drm/gma500/cdv: Fix cursor gem obj referencing on cdv drm/gma500/psb: Fix cursor gem obj referencing on psb drm/gma500/cdv: Unpin framebuffer on crtc disable drm/gma500/psb: Unpin framebuffer on crtc disable drm/gma500: Add fb gtt offset to fb base
2013-06-10clk: exynos5250: Add sclk_mpll to the parent list of mout_cpu clockTushar Behera
'mout_mpll' is added the list of parent clocks for 'mout_cpu'. 'mout_mpll' is an alias to the clock 'sclk_mpll'. Hence 'sclk_mpll' should be added to the list of parent clocks. This results in an error when cpufreq driver for EXYNOS5250 tries to set 'mout_mpll' as a parent for 'mout_cpu'. clk_set_parent: clk sclk_mpll can not be parent of clk mout_cpu Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Mike Turquette <mturquette@linaro.org>
2013-06-10clk: exynos5250: Update cpufreq related clocks for EXYNOS5250Tushar Behera
cpufreq driver for EXYNOS5250 is not a platform driver, hence we cannot currently pass the clock names through a device tree node. Instead, we need to make them available through a global alias. cpufreq driver for EXYNOS5250 requires four clocks - 'armclk', 'mout_cpu', 'mout_mpll' and 'mout_apll'. 'armclk' has already been defined with an alias, 'mout_cpu', 'mout_mpll' and 'mout_apll' are now defined with an alias. Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Mike Turquette <mturquette@linaro.org>
2013-06-10USB: serial: ports: add minor and port numberGreg Kroah-Hartman
The usb_serial_port structure had the number field, which was the minor number for the port, which almost no one really cared about. They really wanted the number of the port within the device, which you had to subtract from the minor of the parent usb_serial_device structure. To clean this up, provide the real minor number of the port, and the number of the port within the serial device separately, as these numbers might not be related in the future. Bonus is that this cleans up a lot of logic in the drivers, and saves lines overall. Tested-by: Tobias Winter <tobias@linuxdingsda.de> Reviewed-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -- drivers/staging/serqt_usb2/serqt_usb2.c | 21 +++-------- drivers/usb/serial/ark3116.c | 2 - drivers/usb/serial/bus.c | 6 +-- drivers/usb/serial/console.c | 2 - drivers/usb/serial/cp210x.c | 2 - drivers/usb/serial/cypress_m8.c | 4 +- drivers/usb/serial/digi_acceleport.c | 6 --- drivers/usb/serial/f81232.c | 5 +- drivers/usb/serial/garmin_gps.c | 6 +-- drivers/usb/serial/io_edgeport.c | 58 ++++++++++++-------------------- drivers/usb/serial/io_ti.c | 21 ++++------- drivers/usb/serial/keyspan.c | 29 +++++++--------- drivers/usb/serial/metro-usb.c | 4 +- drivers/usb/serial/mos7720.c | 37 +++++++++----------- drivers/usb/serial/mos7840.c | 52 +++++++++------------------- drivers/usb/serial/opticon.c | 2 - drivers/usb/serial/pl2303.c | 2 - drivers/usb/serial/quatech2.c | 7 +-- drivers/usb/serial/sierra.c | 2 - drivers/usb/serial/ti_usb_3410_5052.c | 10 ++--- drivers/usb/serial/usb-serial.c | 7 ++- drivers/usb/serial/usb_wwan.c | 2 - drivers/usb/serial/whiteheat.c | 20 +++++------ include/linux/usb/serial.h | 6 ++- 24 files changed, 133 insertions(+), 180 deletions(-)
2013-06-10tuntap: fix a possible race between queue selection and changing queuesJason Wang
Complier may generate codes that re-read the tun->numqueues during tun_select_queue(). This may be a race if vlan->numqueues were changed in the same time and can lead unexpected result (e.g. very huge value). We need prevent the compiler from generating such codes by adding an ACCESS_ONCE() to make sure tun->numqueues were only read once. Bug were introduced by commit c8d68e6be1c3b242f1c598595830890b65cea64a (tuntap: multiqueue support). Reported-by: Michael S. Tsirkin <mst@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10vhost_net: clear msg.control for non-zerocopy case during txJason Wang
When we decide not use zero-copy, msg.control should be set to NULL otherwise macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs wrongly. Bug were introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84 (vhost-net: skip head management if no outstanding). This solves the following warnings: WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]() Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun] CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566 Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011 ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48 ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0 ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58 Call Trace: [<ffffffff81796b73>] dump_stack+0x19/0x1e [<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20 [<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net] [<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net] [<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffff81061f46>] kthread+0xc6/0xd0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Modify UEFI anti-bricking codeMatthew Garrett
This patch reworks the UEFI anti-bricking code, including an effective reversion of cc5a080c and 31ff2f20. It turns out that calling QueryVariableInfo() from boot services results in some firmware implementations jumping to physical addresses even after entering virtual mode, so until we have 1:1 mappings for UEFI runtime space this isn't going to work so well. Reverting these gets us back to the situation where we'd refuse to create variables on some systems because they classify deleted variables as "used" until the firmware triggers a garbage collection run, which they won't do until they reach a lower threshold. This results in it being impossible to install a bootloader, which is unhelpful. Feedback from Samsung indicates that the firmware doesn't need more than 5KB of storage space for its own purposes, so that seems like a reasonable threshold. However, there's still no guarantee that a platform will attempt garbage collection merely because it drops below this threshold. It seems that this is often only triggered if an attempt to write generates a genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to create a variable larger than the remaining space. This should fail, but if it somehow succeeds we can then immediately delete it. I've tested this on the UEFI machines I have available, but I don't have a Samsung and so can't verify that it avoids the bricking problem. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ] Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-06-10Merge branches 'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', ↵Paul E. McKenney
'srcu.2013.06.10a' and 'tiny.2013.06.10a' into HEAD cbnum.2013.06.10a: Apply simplifications stemming from the new callback numbering. doc.2013.06.10a: Documentation updates. fixes.2013.06.10a: Miscellaneous fixes. srcu.2013.06.10a: Updates to SRCU. tiny.2013.06.10a: Eliminate TINY_PREEMPT_RCU.
2013-06-10rcu: Shrink TINY_RCU by reworking CPU-stall ifdefsPaul E. McKenney
TINY_RCU's reset_cpu_stall_ticks() and check_cpu_stalls() functions are defined unconditionally, and are empty functions if CONFIG_RCU_TRACE is disabled (which in turns disables detection of RCU CPU stalls). This commit saves a few lines of source code by defining these functions only if CONFIG_RCU_TRACE=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-06-10rcu: Shrink TINY_RCU by moving exit_rcu()Paul E. McKenney
Now that TINY_PREEMPT_RCU is no more, exit_rcu() is always an empty function. But if TINY_RCU is going to have an empty function, it should be in include/linux/rcutiny.h, where it does not bloat the kernel. This commit therefore moves exit_rcu() out of kernel/rcupdate.c to kernel/rcutree_plugin.h, and places a static inline empty function in include/linux/rcutiny.h in order to shrink TINY_RCU a bit. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove TINY_PREEMPT_RCU tracing documentationPaul E. McKenney
Because TINY_PREEMPT_RCU is no more, this commit removes its tracing formats from the documentation. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Consolidate rcutiny_plugin.h ifdefsPaul E. McKenney
This commit rearranges code in order to allow ifdefs to be consolidated in kernel/rcutiny_plugin.h, simplifying the code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove rcu_preempt_note_context_switch()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_note_context_switch() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.hPaul E. McKenney
Now that CONFIG_TINY_PREEMPT_RCU is no more, this commit removes the CONFIG_TINY_RCU ifdefs from include/linux/rcutiny.h in favor of unconditionally compiling the CONFIG_TINY_RCU legs of those ifdefs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Moved removal of #else to "Remove TINY_PREEMPT_RCU" as suggested by Josh Triplett. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove check_cpu_stall_preempt()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, check_cpu_stall_preempt() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Simplify RCU_TINY RCU callback invocationPaul E. McKenney
TINY_PREEMPT_RCU could use a kthread to handle RCU callback invocation, which required an API to abstract kthread vs. softirq invocation. Now that TINY_PREEMPT_RCU is no longer with us, this commit retires this API in favor of direct use of the relevant softirq primitives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove rcu_preempt_process_callbacks()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_process_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove rcu_preempt_remove_callbacks()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_remove_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove rcu_preempt_check_callbacks()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, rcu_preempt_check_callbacks() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove show_tiny_preempt_stats()Paul E. McKenney
With the removal of CONFIG_TINY_PREEMPT_RCU, show_tiny_preempt_stats() is now an empty function. This commit therefore eliminates it by inlining it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove TINY_PREEMPT_RCUPaul E. McKenney
TINY_PREEMPT_RCU adds significant code and complexity, but does not offer commensurate benefits. People currently using TINY_PREEMPT_RCU can get much better memory footprint with TINY_RCU, or, if they really need preemptible RCU, they can use TREE_PREEMPT_RCU with a relatively minor degradation in memory footprint. Please note that this move has been widely publicized on LKML (https://lkml.org/lkml/2012/11/12/545) and on LWN (http://lwn.net/Articles/541037/). This commit therefore removes TINY_PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Updated to eliminate #else in rcutiny.h as suggested by Josh ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10powerpc,kvm: fix imbalance srcu_read_[un]lock()Lai Jiangshan
At the point of up_out label in kvmppc_hv_setup_htab_rma(), srcu read lock is still held. We have to release it before return. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Alexander Graf <agraf@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: kvm@vger.kernel.org Cc: kvm-ppc@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().Paul E. McKenney
These interfaces never did get used, so this commit removes them, their rcutorture tests, and documentation referencing them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Apply Dave Jones's NOCB Kconfig help feedbackPaul E. McKenney
The Kconfig help text for the RCU_NOCB_CPU_NONE, RCU_NOCB_CPU_ZERO, and RCU_NOCB_CPU_ALL Kconfig options was unclear, so this commit adds a bit more detail. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-06-10rcu: Merge adjacent identical ifdefsPaul E. McKenney
Two ifdefs in kernel/rcupdate.c now have identical conditions with nothing between them, so the commit merges them into a single ifdef. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Drive quiescent-state-forcing delay from HZPaul E. McKenney
Systems with HZ=100 can have slow bootup times due to the default three-jiffy delays between quiescent-state forcing attempts. This commit therefore auto-tunes the RCU_JIFFIES_TILL_FORCE_QS value based on the value of HZ. However, this would break very large systems that require more time between quiescent-state forcing attempts. This commit therefore also ups the default delay by one jiffy for each 256 CPUs that might be on the system (based off of nr_cpu_ids at runtime, -not- NR_CPUS at build time). Updated to collapse #ifdefs for RCU_JIFFIES_TILL_FORCE_QS into a step-function definition as suggested by Josh Triplett. Reported-by: Paul Mackerras <paulus@au1.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-06-10rcu: Remove "Experimental" flagsPaul E. McKenney
After a release or two, features are no longer experimental. Therefore, this commit removes the "Experimental" tag from them. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10kthread: Add kworker kthreads to OS-jitter documentationPaul E. McKenney
The kworker workqueue kthreads can also contribute to OS jitter. The amount of jitter depends on their use, so this commit adds documentation on avoiding OS jitter due to workqueue use. Reported-by: Jonathan Clairembault <jonathan.clairembault@novasparks.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10nohz_full: Document additional restrictionsPaul E. McKenney
This commit calls out the potential for slowing the tick even when there are multiple runnable processes per CPU, It also points out that current mainlined version keeps the tick going on at least one CPU even when all CPUs are otherwise idle. Finally, it notes the need for a 1-HZ tick in order to calculate CPU load, maintain sched average, compute CFS entity vruntime, compute avenrun, and carry out load balancing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10nohz_full: Update based on Sedat Dilek reviewPaul E. McKenney
Make it more clear that there are three options, and give hints as to which of the three is most likely to be useful in different situations. Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Move redundant call to note_gp_changes() into called functionPaul E. McKenney
The __rcu_process_callbacks() invokes note_gp_changes() immediately before invoking rcu_check_quiescent_state(), which conditionally invokes that same function. This commit therefore eliminates the call to note_gp_changes() in __rcu_process_callbacks() in favor of making unconditional to call from rcu_check_quiescent_state() to note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Inline trivial wrapper function rcu_start_gp_per_cpu()Paul E. McKenney
Given the changes that introduce note_gp_change(), rcu_start_gp_per_cpu() is now a trivial wrapper function with only one caller. This commit therefore inlines it into its sole call site. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Eliminate check_for_new_grace_period() wrapper functionPaul E. McKenney
One of the calls to check_for_new_grace_period() is now redundant due to an immediately preceding call to note_gp_changes(). Eliminating this redundant call leaves a single caller, which is simpler if inlined. This commit therefore eliminates the redundant call and inlines the body of check_for_new_grace_period() into the single remaining call site. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Merge __rcu_process_gp_end() into __note_gp_changes()Paul E. McKenney
This commit eliminates some duplicated code by merging __rcu_process_gp_end() into __note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Switch callers from rcu_process_gp_end() to note_gp_changes()Paul E. McKenney
Because note_gp_changes() now incorporates rcu_process_gp_end() function, this commit switches to the former and eliminates the latter. In addition, this commit changes external calls from __rcu_process_gp_end() to __note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Convert rcutree_plugin.h printk callsPaul E. McKenney
This commit converts printk() calls to the corresponding pr_*() calls. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Rename note_new_gpnum() to note_gp_changes()Paul E. McKenney
Because note_new_gpnum() now also checks for the ends of old grace periods, this commit changes its name to note_gp_changes(). Later commits will merge rcu_process_gp_end() into note_gp_changes(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Make __note_new_gpnum() check for ends of prior grace periodsPaul E. McKenney
The current implementation can detect the beginning of a new grace period before noting the end of a previous grace period. Although the current implementation correctly handles this sort of nonsense, it would be good to reduce RCU's state space by making such nonsense unnecessary, which is now possible thanks to the fact that RCU's callback groups are now numbered. This commit therefore makes __note_new_gpnum() invoke __rcu_process_gp_end() in order to note the ends of prior grace periods before noting the beginnings of new grace periods. Of course, this now means that note_new_gpnum() notes both the beginnings and ends of grace periods, and could therefore be used in place of rcu_process_gp_end(). But that is a job for later commits. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Move code to apply callback-numbering simplificationsPaul E. McKenney
The addition of callback numbering allows combining the detection of the ends of old grace periods and the beginnings of new grace periods. This commit moves code to set the stage for this combining. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Convert rcutree.c printk callsPaul E. McKenney
This commit converts printk() calls to the corresponding pr_*() calls. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migrationPaul E. McKenney
In Steven Rostedt's words: > I've been debugging the last couple of days why my tests have been > locking up. One of my tracing tests, runs all available tracers. The > lockup always happened with the mmiotrace, which is used to trace > interactions between priority drivers and the kernel. But to do this > easily, when the tracer gets registered, it disables all but the boot > CPUs. The lockup always happened after it got done disabling the CPUs. > > Then I decided to try this: > > while :; do > for i in 1 2 3; do > echo 0 > /sys/devices/system/cpu/cpu$i/online > done > for i in 1 2 3; do > echo 1 > /sys/devices/system/cpu/cpu$i/online > done > done > > Well, sure enough, that locked up too, with the same users. Doing a > sysrq-w (showing all blocked tasks): > > [ 2991.344562] task PC stack pid father > [ 2991.344562] rcu_preempt D ffff88007986fdf8 0 10 2 0x00000000 > [ 2991.344562] ffff88007986fc98 0000000000000002 ffff88007986fc48 0000000000000908 > [ 2991.344562] ffff88007986c280 ffff88007986ffd8 ffff88007986ffd8 00000000001d3c80 > [ 2991.344562] ffff880079248a40 ffff88007986c280 0000000000000000 00000000fffd4295 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81541750>] schedule_timeout+0xbc/0xf9 > [ 2991.344562] [<ffffffff8154bec0>] ? ftrace_call+0x5/0x2f > [ 2991.344562] [<ffffffff81049513>] ? cascade+0xa8/0xa8 > [ 2991.344562] [<ffffffff815417ab>] schedule_timeout_uninterruptible+0x1e/0x20 > [ 2991.344562] [<ffffffff810c980c>] rcu_gp_kthread+0x502/0x94b > [ 2991.344562] [<ffffffff81062791>] ? __init_waitqueue_head+0x50/0x50 > [ 2991.344562] [<ffffffff810c930a>] ? rcu_gp_fqs+0x64/0x64 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81091e31>] ? lock_release_holdtime.part.23+0x4e/0x55 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] kworker/0:1 D ffffffff81a30680 0 47 2 0x00000000 > [ 2991.344562] Workqueue: events cpuset_hotplug_workfn > [ 2991.344562] ffff880078dbbb58 0000000000000002 0000000000000006 00000000000000d8 > [ 2991.344562] ffff880078db8100 ffff880078dbbfd8 ffff880078dbbfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880078db8100 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 2991.344562] [<ffffffff810af7e6>] rebuild_sched_domains_locked+0x6e/0x3a8 > [ 2991.344562] [<ffffffff810b0ec6>] rebuild_sched_domains+0x1c/0x2a > [ 2991.344562] [<ffffffff810b109b>] cpuset_hotplug_workfn+0x1c7/0x1d3 > [ 2991.344562] [<ffffffff810b0ed9>] ? cpuset_hotplug_workfn+0x5/0x1d3 > [ 2991.344562] [<ffffffff81058e07>] process_one_work+0x2d4/0x4d1 > [ 2991.344562] [<ffffffff81058d3a>] ? process_one_work+0x207/0x4d1 > [ 2991.344562] [<ffffffff8105964c>] worker_thread+0x2e7/0x3b5 > [ 2991.344562] [<ffffffff81059365>] ? rescuer_thread+0x332/0x332 > [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0 > [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58 > [ 2991.344562] bash D ffffffff81a4aa80 0 2618 2612 0x10000000 > [ 2991.344562] ffff8800379abb58 0000000000000002 0000000000000006 0000000000000c2c > [ 2991.344562] ffff880077fea140 ffff8800379abfd8 ffff8800379abfd8 00000000001d3c80 > [ 2991.344562] ffff8800779ca5c0 ffff880077fea140 ffffffff81541fcf 0000000000000000 > [ 2991.344562] Call Trace: > [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66 > [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24 > [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609 > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40 > [ 2991.344562] [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 2991.344562] [<ffffffff81091c99>] ? __lock_is_held+0x32/0x53 > [ 2991.344562] [<ffffffff81548912>] notifier_call_chain+0x6b/0x98 > [ 2991.344562] [<ffffffff810671fd>] __raw_notifier_call_chain+0xe/0x10 > [ 2991.344562] [<ffffffff8103cf64>] __cpu_notify+0x20/0x32 > [ 2991.344562] [<ffffffff8103cf8d>] cpu_notify_nofail+0x17/0x36 > [ 2991.344562] [<ffffffff815225de>] _cpu_down+0x154/0x259 > [ 2991.344562] [<ffffffff81522710>] cpu_down+0x2d/0x3a > [ 2991.344562] [<ffffffff81526351>] store_online+0x4e/0xe7 > [ 2991.344562] [<ffffffff8134d764>] dev_attr_store+0x20/0x22 > [ 2991.344562] [<ffffffff811b3c5f>] sysfs_write_file+0x108/0x144 > [ 2991.344562] [<ffffffff8114c5ef>] vfs_write+0xfd/0x158 > [ 2991.344562] [<ffffffff8114c928>] SyS_write+0x5c/0x83 > [ 2991.344562] [<ffffffff8154c494>] tracesys+0xdd/0xe2 > > As well as held locks: > > [ 3034.728033] Showing all locks held in the system: > [ 3034.728033] 1 lock held by rcu_preempt/10: > [ 3034.728033] #0: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff810c9471>] rcu_gp_kthread+0x167/0x94b > [ 3034.728033] 4 locks held by kworker/0:1/47: > [ 3034.728033] #0: (events){.+.+.+}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #1: (cpuset_hotplug_work){+.+.+.}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1 > [ 3034.728033] #2: (cpuset_mutex){+.+.+.}, at: [<ffffffff810b0ec1>] rebuild_sched_domains+0x17/0x2a > [ 3034.728033] #3: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50 > [ 3034.728033] 1 lock held by mingetty/2563: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2565: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2569: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2572: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 1 lock held by mingetty/2575: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > [ 3034.728033] 7 locks held by bash/2618: > [ 3034.728033] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8114bc3f>] file_start_write+0x2a/0x2c > [ 3034.728033] #1: (&buffer->mutex#2){+.+.+.}, at: [<ffffffff811b3b93>] sysfs_write_file+0x3c/0x144 > [ 3034.728033] #2: (s_active#54){.+.+.+}, at: [<ffffffff811b3c3e>] sysfs_write_file+0xe7/0x144 > [ 3034.728033] #3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff810217c2>] cpu_hotplug_driver_lock+0x17/0x19 > [ 3034.728033] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103d196>] cpu_maps_update_begin+0x17/0x19 > [ 3034.728033] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103cfd8>] cpu_hotplug_begin+0x2c/0x6d > [ 3034.728033] #6: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e > [ 3034.728033] 1 lock held by bash/2980: > [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8 > > Things looked a little weird. Also, this is a deadlock that lockdep did > not catch. But what we have here does not look like a circular lock > issue: > > Bash is blocked in rcu_cpu_notify(): > > 1961 /* Exclude any attempts to start a new grace period. */ > 1962 mutex_lock(&rsp->onoff_mutex); > > > kworker is blocked in get_online_cpus(), which makes sense as we are > currently taking down a CPU. > > But rcu_preempt is not blocked on anything. It is simply sleeping in > rcu_gp_kthread (really rcu_gp_init) here: > > 1453 #ifdef CONFIG_PROVE_RCU_DELAY > 1454 if ((prandom_u32() % (rcu_num_nodes * 8)) == 0 && > 1455 system_state == SYSTEM_RUNNING) > 1456 schedule_timeout_uninterruptible(2); > 1457 #endif /* #ifdef CONFIG_PROVE_RCU_DELAY */ > > And it does this while holding the onoff_mutex that bash is waiting for. > > Doing a function trace, it showed me where it happened: > > [ 125.940066] rcu_pree-10 3.... 28384115273: schedule_timeout_uninterruptible <-rcu_gp_kthread > [...] > [ 125.940066] rcu_pree-10 3d..3 28384202439: sched_switch: prev_comm=rcu_preempt prev_pid=10 prev_prio=120 prev_state=D ==> next_comm=watchdog/3 next_pid=38 next_prio=120 > > The watchdog ran, and then: > > [ 125.940066] watchdog-38 3d..3 28384692863: sched_switch: prev_comm=watchdog/3 prev_pid=38 prev_prio=120 prev_state=P ==> next_comm=modprobe next_pid=2848 next_prio=118 > > Not sure what modprobe was doing, but shortly after that: > > [ 125.940066] modprobe-2848 3d..3 28385041749: sched_switch: prev_comm=modprobe prev_pid=2848 prev_prio=118 prev_state=R+ ==> next_comm=migration/3 next_pid=40 next_prio=0 > > Where the migration thread took down the CPU: > > [ 125.940066] migratio-40 3d..3 28389148276: sched_switch: prev_comm=migration/3 prev_pid=40 prev_prio=0 prev_state=P ==> next_comm=swapper/3 next_pid=0 next_prio=120 > > which finally did: > > [ 125.940066] <idle>-0 3...1 28389282142: arch_cpu_idle_dead <-cpu_startup_entry > [ 125.940066] <idle>-0 3...1 28389282548: native_play_dead <-arch_cpu_idle_dead > [ 125.940066] <idle>-0 3...1 28389282924: play_dead_common <-native_play_dead > [ 125.940066] <idle>-0 3...1 28389283468: idle_task_exit <-play_dead_common > [ 125.940066] <idle>-0 3...1 28389284644: amd_e400_remove_cpu <-play_dead_common > > > CPU 3 is now offline, the rcu_preempt thread that ran on CPU 3 is still > doing a schedule_timeout_uninterruptible() and it registered it's > timeout to the timer base for CPU 3. You would think that it would get > migrated right? The issue here is that the timer migration happens at > the CPU notifier for CPU_DEAD. The problem is that the rcu notifier for > CPU_DOWN is blocked waiting for the onoff_mutex to be released, which is > held by the thread that just put itself into a uninterruptible sleep, > that wont wake up until the CPU_DEAD notifier of the timer > infrastructure is called, which wont happen until the rcu notifier > finishes. Here's our deadlock! This commit breaks this deadlock cycle by substituting a shorter udelay() for the previous schedule_timeout_uninterruptible(), while at the same time increasing the probability of the delay. This maintains the intensity of the testing. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Steven Rostedt <rostedt@goodmis.org>