linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-08-21	drm/nouveau/mc: fix race condition between constructor and request_irq()	Ben Skeggs
	Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-08-21	drm/nouveau: fix reclocking on nv40	Pali Rohár
	In commit 77145f1cbdf8d28b46ff8070ca749bad821e0774 was introduced error which cause that reclocking on nv40 not working anymore. There is missing assigment of return value from pll_calc to ret. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Martin Peres <martin.peres@labri.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-08-21	drm/nouveau/ltcg: fix allocating memory as free	Maarten Lankhorst
	Allocating type=0 marks the memory as free. This allows the ltcg memory to be allocated twice. Add a BUG_ON in core/mm.c to prevent this ever happening again. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-08-21	drm/nouveau/ltcg: fix ltcg memory initialization after suspend	Maarten Lankhorst
	Some registers were not initialized in init, this causes them to be uninitialized after suspend. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-08-21	drm/nouveau/fb: fix null derefs in nv49 and nv4e init	Ilia Mirkin
	Commit dceef5d87 (drm/nouveau/fb: initialise vram controller as pfb sub-object) moved some code around and introduced these null derefs. pfb->ram is set to the new ram object outside of this ctor. Reported-by: Ronald Uitermark <ronald645@gmail.com> Tested-by: Ronald Uitermark <ronald645@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-08-20	Merge branch 'for-davem' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Regarding the iwlwifi bits, Johannes says: "We revert an rfkill bugfix that unfortunately caused more bugs, shuffle some code to avoid touching the PCIe device before it's enabled and disconnect if firmware fails to do our bidding. I also have Stanislaw's fix to not crash in some channel switch scenarios." As for the mac80211 bits, Johannes says: "This time, I have one fix from Dan Carpenter for users of nl80211hdr_put(), and one fix from myself fixing a regression with the libertas driver." Along with the above... Dan Carpenter fixes some incorrectly placed "address of" operators in hostap that caused copying of junk data. Jussi Kivilinna corrects zd1201 to use an allocated buffer rather than the stack for a URB operation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	packet: restore packet statistics tp_packets to include drops	Willem de Bruijn
	getsockopt PACKET_STATISTICS returns tp_packets + tp_drops. Commit ee80fbf301 ("packet: account statistics only in tpacket_stats_u") cleaned up the getsockopt PACKET_STATISTICS code. This also changed semantics. Historically, tp_packets included tp_drops on return. The commit removed the line that adds tp_drops into tp_packets. This patch reinstates the old semantics. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	net: phy: rtl8211: fix interrupt on status link change	Giuseppe CAVALLARO
	This is to fix a problem in the rtl8211 where the driver wasn't properly enabled the interrupt on link change status. it has to enable the ineterrupt on the bit 10 in the register 18 (INER). Reported-by: Sharma Bhupesh <B45370@freescale.com> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-21	cpufreq: fix bad unlock balance on !CONFIG_SMP	Li Zhong
	This patch tries to fix lockdep complaint attached below. It seems that we should always read acquire the cpufreq_rwsem, whether CONFIG_SMP is enabled or not. And CONFIG_HOTPLUG_CPU depends on CONFIG_SMP, so it seems we don't need CONFIG_SMP for the code enabled by CONFIG_HOTPLUG_CPU. [ 0.504191] ===================================== [ 0.504627] [ BUG: bad unlock balance detected! ] [ 0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted [ 0.504627] ------------------------------------- [ 0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at: [ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] but there are no more locks to release! [ 0.504627] [ 0.504627] other info that might help us debug this: [ 0.504627] 1 lock held by swapper/1: [ 0.504627] #0: (subsys mutex#4){+.+.+.}, at: [<ffffffff8134a7bf>] subsys_interface_register+0x4f/0xe0 [ 0.504627] [ 0.504627] stack backtrace: [ 0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1 [ 0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 0.504627] ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8 [ 0.504627] ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a [ 0.504627] 00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006 [ 0.504627] Call Trace: [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff814c062b>] dump_stack+0x19/0x1b [ 0.504627] [<ffffffff81098bce>] print_unlock_imbalance_bug+0xfe/0x110 [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8109d0be>] lock_release_non_nested+0x1ee/0x310 [ 0.504627] [<ffffffff81099d0e>] ? mark_held_locks+0xae/0x120 [ 0.504627] [<ffffffff811510cb>] ? kfree+0xcb/0x1d0 [ 0.504627] [<ffffffff813d77ea>] ? cpufreq_policy_free+0x4a/0x60 [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8109d2a4>] lock_release+0xc4/0x250 [ 0.504627] [<ffffffff8106c9f3>] up_read+0x23/0x40 [ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8134a809>] subsys_interface_register+0x99/0xe0 [ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12 [ 0.504627] [<ffffffff813d7f0d>] cpufreq_register_driver+0x9d/0x1d0 [ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12 [ 0.504627] [<ffffffff81b1a039>] acpi_cpufreq_init+0xfe/0x1f8 [ 0.504627] [<ffffffff810002ba>] do_one_initcall+0xda/0x180 [ 0.504627] [<ffffffff81ae301e>] kernel_init_freeable+0x12c/0x1bb [ 0.504627] [<ffffffff81ae2841>] ? do_early_param+0x8c/0x8c [ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140 [ 0.504627] [<ffffffff814b4dde>] kernel_init+0xe/0xf0 [ 0.504627] [<ffffffff814d029a>] ret_from_fork+0x7a/0xb0 [ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-and-tested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	hid: roccat-pyra: convert class code to use bin_attrs in groups	Greg Kroah-Hartman
	Now that attribute groups support binary attributes, use them instead of the dev_bin_attrs field in struct class, as that is going away soon. Cc: Jiri Kosina <jkosina@suse.cz> Cc: Stefan Achatz <erazor_de@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge	David S. Miller
	Included change: - Check if the skb has been correctly prepared before going on
2013-08-20	sysfs.h: fix __BIN_ATTR_RW()	Greg Kroah-Hartman
	__BIN_ATTR_RW() wasn't passing in the _size field. As it would break the build if this macro was ever used, it's obvious no one had ever tried to use it before. Fix it so that it can be used. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20	r8169: remember WOL preferences on driver load	Peter Wu
	Do not clear Broadcast/Multicast/Unicast Wake Flag or LanWake in Config5. This is necessary to preserve WOL state when the driver is loaded. Although the r8168 vendor driver does not write Config5 (it has been commented out), Hayes Wang from Realtek said that masking bits like this is more sensible. Signed-off-by: Peter Wu <lekensteyn@gmail.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	staging: dgnc: removes ifdef HAVE_UNLOCKED_IOCTL conditionals	Lidza Louina
	This patch removes the HAVE_UNLOCKED_IOCTL conditional statements from driver.c, mgmt.c and mgmt.h. This was used to support older kernels. It isn't needed now. Signed-off-by: Lidza Louina <lidza.louina@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20	staging: dgnc: driver.c: fixes warning about assigning pointer	Lidza Louina
	This patch fixes a warning associated with assigining a pointer in the dgnc_mbuf function. Signed-off-by: Lidza Louina <lidza.louina@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-21	ACPI / PM: Hold acpi_scan_lock over system PM transitions	Rafael J. Wysocki
	Bad things happen if ACPI hotplug events are handled during system PM transitions, especially if devices are removed as a result. To prevent those bad things from happening, acquire acpi_scan_lock when a PM transition is started and release it when that transition is complete or has been aborted. This fixes resume lockup on my test-bed Acer Aspire S5 that happens when Thunderbolt devices are disconnected from the machine while suspended. Also fixes the analogous problem for Mika Westerberg on an Intel DZ77RE-75K board. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com>
2013-08-20	via-ircc: don't return zero if via_ircc_open() failed	Alexey Khoroshilov
	If via_ircc_open() fails, data structures of the driver left uninitialized, but probe (via_init_one()) returns zero. That can lead to null pointer dereference in via_remove_one(), since it does not check drvdata for NULL. The patch implements proper error code propagation. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	macvtap: Ignore tap features when VNET_HDR is off	Vlad Yasevich
	When the user turns off VNET_HDR support on the macvtap device, there is no way to provide any offload information to the user. So, it's safer to ignore offload setting then depend on the user setting them correctly. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	macvtap: Correctly set tap features when IFF_VNET_HDR is disabled.	Vlad Yasevich
	When the user turns off IFF_VNET_HDR flag, attempts to change offload features via TUNSETOFFLOAD do not work. This could cause GSO packets to be delivered to the user when the user is not prepared to handle them. To solve, allow processing of TUNSETOFFLOAD when IFF_VNET_HDR is disabled. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	macvtap: simplify usage of tap_features	Vlad Yasevich
	In macvtap, tap_features specific the features of that the user has specified via ioctl(). If we treat macvtap as a macvlan+tap then we could all the tap a pseudo-device and give it other features like SG and GSO. Then we can stop using the features of lower device (macvlan) when forwarding the traffic the tap. This solves the issue of possible checksum offload mismatch between tap feature and macvlan features. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	tcp: set timestamps for restored skb-s	Andrey Vagin
	When the repair mode is turned off, the write queue seqs are updated so that the whole queue is considered to be 'already sent. The "when" field must be set for such skb. It's used in tcp_rearm_rto for example. If the "when" field isn't set, the retransmit timeout can be calculated incorrectly and a tcp connected can stop for two minutes (TCP_RTO_MAX). Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	jiffies: Avoid undefined behavior from signed overflow	Paul E. McKenney
	According to the C standard 3.4.3p3, overflow of a signed integer results in undefined behavior. This commit therefore changes the definitions of time_after(), time_after_eq(), time_after64(), and time_after_eq64() to avoid this undefined behavior. The trick is that the subtraction is done using unsigned arithmetic, which according to 6.2.5p9 cannot overflow because it is defined as modulo arithmetic. This has the added (though admittedly quite small) benefit of shortening four lines of code by four characters each. Note that the C standard considers the cast from unsigned to signed to be implementation-defined, see 6.3.1.3p3. However, on a two's-complement system, an implementation that defines anything other than a reinterpretation of the bits is free to come to me, and I will be happy to act as a witness for its being committed to an insane asylum. (Although I have nothing against saturating arithmetic or signals in some cases, these things really should not be the default when compiling an operating-system kernel.) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Kevin Easton <kevin@guarana.org> [ paulmck: Included time_after64() and time_after_eq64(), as suggested by Eric Dumazet, also fixed commit message.] Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20	rcu: Simplify _rcu_barrier() processing	Paul E. McKenney
	This commit drops an unneeded ACCESS_ONCE() and simplifies an "our work is done" check in _rcu_barrier(). This applies feedback from Linus (https://lkml.org/lkml/2013/7/26/777) that he gave to similar code in an unrelated patch. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> [ paulmck: Fix comment to match code, reported by Lai Jiangshan. ]
2013-08-20	rcu: Make rcutorture emit online failures if verbose	Paul E. McKenney
	Although rcutorture counts CPU-hotplug online failures, it does not explicitly record which CPUs were having trouble coming online. This commit therefore emits a console message when online failure occurs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20	rcu: Remove unused variable from rcu_torture_writer()	Paul E. McKenney
	The oldbatch variable in rcu_torture_writer() is stored to, but never loaded from. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20	rcu: Sort rcutorture module parameters	Paul E. McKenney
	There are getting to be too many module parameters to permit the current semi-random order, so this patch orders them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20	rcu: Increase rcutorture test coverage	Paul E. McKenney
	Currently, rcutorture has separate torture_types to test synchronous, asynchronous, and expedited grace-period primitives. This has two disadvantages: (1) Three times the number of runs to cover the combinations and (2) Little testing of concurrent combinations of the three options. This commit therefore adds a pair of module parameters that control normal and expedited state, with the default being both types, randomly selected, by the fakewriter processes, thus reducing source-code size and increasing test coverage. In addtion, the writer task switches between asynchronous-normal and expedited grace-period primitives driven by the same pair of module parameters. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20	rcu: Add duplicate-callback tests to rcutorture	Paul E. McKenney
	This commit adds a object_debug option to rcutorture to allow the debug-object-based checks for duplicate call_rcu() invocations to be deterministically tested. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> [ paulmck: Banish mid-function ifdef, more or less per Josh Triplett. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org> [ paulmck: Improve duplicate-callback test, per Lai Jiangshan. ]
2013-08-20	MIPS: Handle OCTEON BBIT instructions in FPU emulator.	David Daney
	The branch emulation needs to handle the OCTEON BBIT instructions, otherwise we get SIGILL instead of emulation. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5726/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-08-20	xen/smp: initialize IPI vectors before marking CPU online	Chuck Anderson
	An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with: kernel BUG at drivers/xen/events.c:1328! RCU has detected that a CPU has not entered a quiescent state within the grace period. It needs to send the CPU a reschedule IPI if it is not offline. rcu_implicit_offline_qs() does this check: /* * If the CPU is offline, it is in a quiescent state. We can * trust its state not to change because interrupts are disabled. / if (cpu_is_offline(rdp->cpu)) { rdp->offline_fqs++; return 1; } Else the CPU is online. Send it a reschedule IPI. The CPU is in the middle of being hot-plugged and has been marked online (!cpu_is_offline()). See start_secondary(): set_cpu_online(smp_processor_id(), true); ... per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; start_secondary() then waits for the CPU bringing up the hot-plugged CPU to mark it as active: / * Wait until the cpu which brought this one up marked it * online before enabling interrupts. If we don't do that then * we can end up waking up the softirq thread before this cpu * reached the active state, which makes the scheduler unhappy * and schedule the softirq thread on the wrong cpu. This is * only observable with forced threaded interrupts, but in * theory it could also happen w/o them. It's just way harder * to achieve. / while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask)) cpu_relax(); / enable local interrupts / local_irq_enable(); The CPU being hot-plugged will be marked active after it has been fully initialized by the CPU managing the hot-plug. In the Xen PVHVM case xen_smp_intr_init() is called to set up the hot-plugged vCPU's XEN_RESCHEDULE_VECTOR. The hot-plugging CPU is marked online, not marked active and does not have its IPI vectors set up. rcu_implicit_offline_qs() sees the hot-plugging cpu is !cpu_is_offline() and tries to send it a reschedule IPI: This will lead to: kernel BUG at drivers/xen/events.c:1328! xen_send_IPI_one() xen_smp_send_reschedule() rcu_implicit_offline_qs() rcu_implicit_dynticks_qs() force_qs_rnp() force_quiescent_state() __rcu_process_callbacks() rcu_process_callbacks() __do_softirq() call_softirq() do_softirq() irq_exit() xen_evtchn_do_upcall() because xen_send_IPI_one() will attempt to use an uninitialized IRQ for the XEN_RESCHEDULE_VECTOR. There is at least one other place that has caused the same crash: xen_smp_send_reschedule() wake_up_idle_cpu() add_timer_on() clocksource_watchdog() call_timer_fn() run_timer_softirq() __do_softirq() call_softirq() do_softirq() irq_exit() xen_evtchn_do_upcall() xen_hvm_callback_vector() clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle a watchdog timer: / * Cycle through CPUs to check if the CPUs stay synchronized * to each other. / next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); if (next_cpu >= nr_cpu_ids) next_cpu = cpumask_first(cpu_online_mask); watchdog_timer.expires += WATCHDOG_INTERVAL; add_timer_on(&watchdog_timer, next_cpu); This resulted in an attempt to send an IPI to a hot-plugging CPU that had not initialized its reschedule vector. One option would be to make the RCU code check to not check for CPU offline but for CPU active. As becoming active is done after a CPU is online (in older kernels). But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has been completely reworked - in the online path, cpu_active is set before* cpu_online, and also, in the cpu offline path, the cpu_active bit is reset in the CPU_DYING notification instead of CPU_DOWN_PREPARE." Drilling in this the bring-up path: "[brought up CPU].. send out a CPU_STARTING notification, and in response to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask is better left to the scheduler alone, since it has the intelligence to use it judiciously." The conclusion was that: " 1. At the IPI sender side: It is incorrect to send an IPI to an offline CPU (cpu not present in the cpu_online_mask). There are numerous places where we check this and warn/complain. 2. At the IPI receiver side: It is incorrect to let the world know of our presence (by setting ourselves in global bitmasks) until our initialization steps are complete to such an extent that we can handle the consequences (such as receiving interrupts without crashing the sender etc.) " (from Srivatsa) As the native code enables the interrupts at some point we need to be able to service them. In other words a CPU must have valid IPI vectors if it has been marked online. It doesn't need to handle the IPI (interrupts may be disabled) but needs to have valid IPI vectors because another CPU may find it in cpu_online_mask and attempt to send it an IPI. This patch will change the order of the Xen vCPU bring-up functions so that Xen vectors have been set up before start_secondary() is called. It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() fails to initialize it. Orabug 13823853 Signed-off-by Chuck Anderson <chuck.anderson@oracle.com> Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-08-20	xen/events: mask events when changing their VCPU binding	David Vrabel
	When a event is being bound to a VCPU there is a window between the EVTCHNOP_bind_vpcu call and the adjustment of the local per-cpu masks where an event may be lost. The hypervisor upcalls the new VCPU but the kernel thinks that event is still bound to the old VCPU and ignores it. There is even a problem when the event is being bound to the same VCPU as there is a small window beween the clear_bit() and set_bit() calls in bind_evtchn_to_cpu(). When scanning for pending events, the kernel may read the bit when it is momentarily clear and ignore the event. Avoid this by masking the event during the whole bind operation. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> CC: stable@vger.kernel.org
2013-08-20	xen/events: initialize local per-cpu mask for all possible events	David Vrabel
	The sizeof() argument in init_evtchn_cpu_bindings() is incorrect resulting in only the first 64 (or 32 in 32-bit guests) ports having their bindings being initialized to VCPU 0. In most cases this does not cause a problem as request_irq() will set the irq affinity which will set the correct local per-cpu mask. However, if the request_irq() is called on a VCPU other than 0, there is a window between the unmasking of the event and the affinity being set were an event may be lost because it is not locally unmasked on any VCPU. If request_irq() is called on VCPU 0 then local irqs are disabled during the window and the race does not occur. Fix this by initializing all NR_EVENT_CHANNEL bits in the local per-cpu masks. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org
2013-08-20	x86/xen: do not identity map UNUSABLE regions in the machine E820	David Vrabel
	If there are UNUSABLE regions in the machine memory map, dom0 will attempt to map them 1:1 which is not permitted by Xen and the kernel will crash. There isn't anything interesting in the UNUSABLE region that the dom0 kernel needs access to so we can avoid making the 1:1 mapping and treat it as RAM. We only do this for dom0, as that is where tboot case shows up. A PV domU could have an UNUSABLE region in its pseudo-physical map and would need to be handled in another patch. This fixes a boot failure on hosts with tboot. tboot marks a region in the e820 map as unusable and the dom0 kernel would attempt to map this region and Xen does not permit unusable regions to be mapped by guests. (XEN) 0000000000000000 - 0000000000060000 (usable) (XEN) 0000000000060000 - 0000000000068000 (reserved) (XEN) 0000000000068000 - 000000000009e000 (usable) (XEN) 0000000000100000 - 0000000000800000 (usable) (XEN) 0000000000800000 - 0000000000972000 (unusable) tboot marked this region as unusable. (XEN) 0000000000972000 - 00000000cf200000 (usable) (XEN) 00000000cf200000 - 00000000cf38f000 (reserved) (XEN) 00000000cf38f000 - 00000000cf3ce000 (ACPI data) (XEN) 00000000cf3ce000 - 00000000d0000000 (reserved) (XEN) 00000000e0000000 - 00000000f0000000 (reserved) (XEN) 00000000fe000000 - 0000000100000000 (reserved) (XEN) 0000000100000000 - 0000000630000000 (usable) Signed-off-by: David Vrabel <david.vrabel@citrix.com> [v1: Altered the patch and description with domU's with UNUSABLE regions] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-08-20	cpufreq: Use cpufreq_policy_list for iterating over policies	Viresh Kumar
	To iterate over all policies we currently iterate over all online CPUs and then get the policy for each of them which is suboptimal. Use the newly created cpufreq_policy_list for this purpose instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	cpufreq: remove cpufreq_policy_cpu per-cpu variable	Viresh Kumar
	cpufreq_policy_cpu per-cpu variables are used for storing the ID of the CPU that manages the given CPU's policy. However, we also store a policy pointer for each cpu in cpufreq_cpu_data, so the cpufreq_policy_cpu information is simply redundant. It is better to use cpufreq_cpu_data to retrieve a policy and get policy->cpu from there, so make that happen everywhere and drop the cpufreq_policy_cpu per-cpu variables which aren't necessary any more. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	cpufreq: remove unnecessary check in __cpufreq_governor()	Viresh Kumar
	We don't need to check if event is CPUFREQ_GOV_POLICY_INIT and put governor module as we are sure event can only be START/STOP here. Remove the useless check. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	cpufreq: remove policy from cpufreq_policy_list during suspend	Viresh Kumar
	cpufreq_policy_list is a list of active policies. We do remove policies from this list when all CPUs belonging to that policy are removed. But during system suspend we don't really free a policy struct as it will be used again during resume, so we didn't remove it from cpufreq_policy_list as well.. However, this is incorrect. We are saying this policy isn't valid anymore and must not be referenced (though we haven't freed it), but it can still be used by code that iterates over cpufreq_policy_list. Remove policy from this list during system suspend as well. Of course, we must add it back whenever the first CPU belonging to that policy shows up. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	cpufreq: Fix white space in __cpufreq_remove_dev()	Viresh Kumar
	Align closing brace '}' of an if block. [rjw: Subject and changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-20	sata_fsl: save irqs while coalescing	Anthony Foiani
	Before this patch, I was seeing the following lockdep splat on my MPC8315 (PPC32) target: [ 9.086051] ================================= [ 9.090393] [ INFO: inconsistent lock state ] [ 9.094744] 3.9.7-ajf-gc39503d #1 Not tainted [ 9.099087] --------------------------------- [ 9.103432] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. [ 9.109431] scsi_eh_1/39 [HC1[1]:SC0[0]:HE0:SE1] takes: [ 9.114642] (&(&host->lock)->rlock){?.+...}, at: [<c02f4168>] sata_fsl_interrupt+0x50/0x250 [ 9.123137] {HARDIRQ-ON-W} state was registered at: [ 9.128004] [<c006cdb8>] lock_acquire+0x90/0xf4 [ 9.132737] [<c043ef04>] _raw_spin_lock+0x34/0x4c [ 9.137645] [<c02f3560>] fsl_sata_set_irq_coalescing+0x68/0x100 [ 9.143750] [<c02f36a0>] sata_fsl_init_controller+0xa8/0xc0 [ 9.149505] [<c02f3f10>] sata_fsl_probe+0x17c/0x2e8 [ 9.154568] [<c02acc90>] driver_probe_device+0x90/0x248 [ 9.159987] [<c02acf0c>] __driver_attach+0xc4/0xc8 [ 9.164964] [<c02aae74>] bus_for_each_dev+0x5c/0xa8 [ 9.170028] [<c02ac218>] bus_add_driver+0x100/0x26c [ 9.175091] [<c02ad638>] driver_register+0x88/0x198 [ 9.180155] [<c0003a24>] do_one_initcall+0x58/0x1b4 [ 9.185226] [<c05aeeac>] kernel_init_freeable+0x118/0x1c0 [ 9.190823] [<c0004110>] kernel_init+0x18/0x108 [ 9.195542] [<c000f6b8>] ret_from_kernel_thread+0x64/0x6c [ 9.201142] irq event stamp: 160 [ 9.204366] hardirqs last enabled at (159): [<c043f778>] _raw_spin_unlock_irq+0x30/0x50 [ 9.212469] hardirqs last disabled at (160): [<c000f414>] reenable_mmu+0x30/0x88 [ 9.219867] softirqs last enabled at (144): [<c002ae5c>] __do_softirq+0x168/0x218 [ 9.227435] softirqs last disabled at (137): [<c002b0d4>] irq_exit+0xa8/0xb4 [ 9.234481] [ 9.234481] other info that might help us debug this: [ 9.240995] Possible unsafe locking scenario: [ 9.240995] [ 9.246898] CPU0 [ 9.249337] ---- [ 9.251776] lock(&(&host->lock)->rlock); [ 9.255878] <Interrupt> [ 9.258492] lock(&(&host->lock)->rlock); [ 9.262765] [ 9.262765] * DEADLOCK * [ 9.262765] [ 9.268684] no locks held by scsi_eh_1/39. [ 9.272767] [ 9.272767] stack backtrace: [ 9.277117] Call Trace: [ 9.279589] [cfff9da0] [c0008504] show_stack+0x48/0x150 (unreliable) [ 9.285972] [cfff9de0] [c0447d5c] print_usage_bug.part.35+0x268/0x27c [ 9.292425] [cfff9e10] [c006ace4] mark_lock+0x2ac/0x658 [ 9.297660] [cfff9e40] [c006b7e4] __lock_acquire+0x754/0x1840 [ 9.303414] [cfff9ee0] [c006cdb8] lock_acquire+0x90/0xf4 [ 9.308745] [cfff9f20] [c043ef04] _raw_spin_lock+0x34/0x4c [ 9.314250] [cfff9f30] [c02f4168] sata_fsl_interrupt+0x50/0x250 [ 9.320187] [cfff9f70] [c0079ff0] handle_irq_event_percpu+0x90/0x254 [ 9.326547] [cfff9fc0] [c007a1fc] handle_irq_event+0x48/0x78 [ 9.332220] [cfff9fe0] [c007c95c] handle_level_irq+0x9c/0x104 [ 9.337981] [cfff9ff0] [c000d978] call_handle_irq+0x18/0x28 [ 9.343568] [cc7139f0] [c000608c] do_IRQ+0xf0/0x1a8 [ 9.348464] [cc713a20] [c000fc8c] ret_from_except+0x0/0x14 [ 9.353983] --- Exception: 501 at _raw_spin_unlock_irq+0x40/0x50 [ 9.353983] LR = _raw_spin_unlock_irq+0x30/0x50 [ 9.364839] [cc713af0] [c043db10] wait_for_common+0xac/0x188 [ 9.370513] [cc713b30] [c02ddee4] ata_exec_internal_sg+0x2b0/0x4f0 [ 9.376699] [cc713be0] [c02de18c] ata_exec_internal+0x68/0xa8 [ 9.382454] [cc713c20] [c02de4b8] ata_dev_read_id+0x158/0x594 [ 9.388205] [cc713ca0] [c02ec244] ata_eh_recover+0xd88/0x13d0 [ 9.393962] [cc713d20] [c02f2520] sata_pmp_error_handler+0xc0/0x8ac [ 9.400234] [cc713dd0] [c02ecdc8] ata_scsi_port_error_handler+0x464/0x5e8 [ 9.407023] [cc713e10] [c02ecfd0] ata_scsi_error+0x84/0xb8 [ 9.412528] [cc713e40] [c02c4974] scsi_error_handler+0xd8/0x47c [ 9.418457] [cc713eb0] [c004737c] kthread+0xa8/0xac [ 9.423355] [cc713f40] [c000f6b8] ret_from_kernel_thread+0x64/0x6c This fix was suggested by Bhushan Bharat <R65777@freescale.com>, and was discussed in email at: http://linuxppc.10917.n7.nabble.com/MPC8315-reboot-failure-lockdep-splat-possibly-related-tp75162.html Same patch successfully tested with 3.9.7. linux-next compiled but not tested on hardware. This patch is based off linux-next tag next-20130819 (which is commit 66a01bae29d11916c09f9f5a937cafe7d402e4a5 ) Signed-off-by: Anthony Foiani <anthony.foiani@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
2013-08-20	arm64: perf: fix event validation for software group leaders	Will Deacon
	This is a port of c95eb3184ea1 ("ARM: 7809/1: perf: fix event validation for software group leaders") to arm64, which fixes a panic in the arm64 perf backend found as a result of Vince's fuzzing tool. Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-08-20	arm64: perf: fix array out of bounds access in armpmu_map_hw_event()	Will Deacon
	This is a port of d9f966357b14 ("ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()") to arm64, which fixes an oops in the arm64 perf backend found as a result of Vince's fuzzing tool. Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-08-20	drivers/spi/spi-tegra114.c clean use of devm_ioremap_resource()	Laurent Navet
	Check of 'r' and calls to dev_err are already done in devm_ioremap_resource, so no need to do them twice. Signed-off-by: Laurent Navet <laurent.navet@gmail.com> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-08-20	spi: octeon: Convert to use bits_per_word_mask	Axel Lin
	Since commit 543bb25 "spi: add ability to validate xfer->bits_per_word in SPI core", the driver can set bits_per_word_mask for the master then the SPI core will reject transfers that attempt to use an unsupported bits_per_word value. So we can remove octeon_spi_validate_bpw() and let SPI core handle the checking. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-08-20	spi: octeon: Remove unused bits_per_word variable in octeon_spi_do_transfer	Axel Lin
	Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-08-20	spi: octeon: Remove empty octeon_spi_nop_transfer_hardware function	Axel Lin
	Both prepare_transfer_hardware and unprepare_transfer_hardware callbacks are optional, so we don't need to implement an empty function for them. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-08-20	x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC=y and more than 512G RAM	Yinghai Lu
	Dave Hansen reported that systems between 500G and 600G RAM crash early if DEBUG_PAGEALLOC is selected. > [ 0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff] > [ 0.000000] [mem 0x00000000-0x000fffff] page 4k > [ 0.000000] BRK [0x02086000, 0x02086fff] PGTABLE > [ 0.000000] BRK [0x02087000, 0x02087fff] PGTABLE > [ 0.000000] BRK [0x02088000, 0x02088fff] PGTABLE > [ 0.000000] init_memory_mapping: [mem 0xe80ee00000-0xe80effffff] > [ 0.000000] [mem 0xe80ee00000-0xe80effffff] page 4k > [ 0.000000] BRK [0x02089000, 0x02089fff] PGTABLE > [ 0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE > [ 0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory It turns out that we missed increasing needed pages in BRK to mapping initial 2M and [0,1M) when we switched to use the #PF handler to set memory mappings: > commit 8170e6bed465b4b0c7687f93e9948aca4358a33b > Author: H. Peter Anvin <hpa@zytor.com> > Date: Thu Jan 24 12:19:52 2013 -0800 > > x86, 64bit: Use a #PF handler to materialize early mappings on demand Before that, we had the maping from [0,512M) in head_64.S, and we can spare two pages [0-1M). After that change, we can not reuse pages anymore. When we have more than 512M ram, we need an extra page for pgd page with [512G, 1024g). Increase pages in BRK for page table to solve the boot crash. Reported-by: Dave Hansen <dave.hansen@intel.com> Bisected-by: Dave Hansen <dave.hansen@intel.com> Tested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: <stable@vger.kernel.org> # v3.9 and later Link: http://lkml.kernel.org/r/1376351004-4015-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-20	x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock	Yoshihiro YUNOMAE
	Prevent crash_kexec() from deadlocking on ioapic_lock. When crash_kexec() is executed on a CPU, the CPU will take ioapic_lock in disable_IO_APIC(). So if the cpu gets an NMI while locking ioapic_lock, a deadlock will happen. In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC(). You can reproduce this deadlock the following way: 1. Add mdelay(1000) after raw_spin_lock_irqsave() in native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c Although the deadlock can occur without this modification, it will increase the potential of the deadlock problem. 2. Build and install the kernel 3. Set up the OS which will run panic() and kexec when NMI is injected # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf # vim /etc/default/grub add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line # grub2-mkconfig 4. Reboot the OS 5. Run following command for each vcpu on the guest # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinitity; done; By running this command, cpus will get ioapic_lock for setting affinity. 6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you use VM). After injecting NMI, panic() is called in an nmi-handler context. Then, kexec will normally run in panic(), but the operation will be stopped by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()-> native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()-> clear_IO_APIC_pin()->ioapic_read_entry(). Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevel Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-20	bnx2x: set VF DMAE when first function has 0 supported VFs	Ariel Elior
	There are possible HW configurations in which PFs will have SR-IOV capability but will have Max VFs set to 0 - this happens when there are Multi-Function devices where the VFs are allocated to only some of the PFs. DMAE is configured to support VFs only if the configuring PF has supported VFs. In case the first PF to be loaded will be one without supported VFs, it will not configure DMAE to the VF-supporting mode. When VFs of other PFs will be loaded later on, they will not be able to communicate with their PF. This changes the requirement for configuring DMAE for VF-supporting mode; If the device has SR-IOV capabilities there must be some PF that has max supported VFs > 0, thus it will configure the DMAE for supporting VFs. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	bnx2x: Protect against VFs' ndos when SR-IOV is disabled	Ariel Elior
	Since SR-IOV can be activated dynamically and iproute2 can be called asynchronously, the various callbacks need a robust sanity check before attempting to access the SR-IOV database and members since there are numerous states in which it can find the driver (e.g., PF is down, sriov was not enabled yet, VF is down, etc.). In many of the states the callback result will be null pointer dereference. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20	bnx2x: prevent VF benign attentions	Yuval Mintz
	During probe, VFs might erroneously try to access the shared memory (which only PFs are capabale of accessing), causing benign attentions to appear. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>