summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-01-15iommu/vt-d: Allow IR works in XAPIC mode though CPU works in X2APIC modeJiang Liu
Currently if CPU supports X2APIC, IR hardware must work in X2APIC mode or disabled. Change the code to support IR working in XAPIC mode when CPU supports X2APIC. Then the CPU APIC driver will decide how to handle such as configuration by: 1) Disabling X2APIC mode 2) Forcing X2APIC physical mode This change also fixes a live locking when 1) BIOS enables CPU X2APIC 2) DMAR table disables X2APIC mode or IR hardware doesn't support X2APIC with following messages: [ 37.863463] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2 [ 37.863463] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry [ 37.879372] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2 [ 37.879372] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry [ 37.895282] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2 [ 37.895282] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry [ 37.911192] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2 Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1420615903-28253-11-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15iommu/vt-d: Allocate IRQ remapping data structures only for all IOMMUsJoerg Roedel
IRQ remapping is only supported when all IOMMUs in the system support it. So check if all IOMMUs in the system support IRQ remapping before doing the allocations. [Jiang] 1) Rebased to v3.19. 2) Remove redundant check of ecap_ir_support(iommu->ecap) in function intel_enable_irq_remapping(). Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1420615903-28253-10-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15iommu/vt-d: Prepare for killing function irq_remapping_supported()Jiang Liu
Prepare for killing function irq_remapping_supported() by moving code from intel_irq_remapping_supported() into intel_prepare_irq_remapping(). Combined with patch from Joerg at https://lkml.org/lkml/2014/12/15/487, so assume an signed-off from Joerg. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1420615903-28253-9-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Handle XAPIC remap mode proper.Jiang Liu
If remapping is in XAPIC mode, the setup code just skips X2APIC initialization without checking max CPU APIC ID in system, which may cause problem if system has a CPU with APIC ID bigger than 255. Handle IR in XAPIC mode the same way as if remapping is disabled. [ tglx: Split out from previous patch ] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-8-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Refine enable_IR_x2apic() and related functionsJiang Liu
Refine enable_IR_x2apic() and related functions for better readability. [ tglx: Removed the XAPIC mode change and split it out into a seperate patch. Added comments. ] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-8-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Correctly detect X2APIC status in function enable_IR()Jiang Liu
X2APIC will be disabled if user specifies "nox2apic" on kernel command line, even when x2apic_preenabled is true. So correctly detect X2APIC status by using x2apic_enabled() instead of x2apic_preenabled. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-7-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Kill useless variable x2apic_enabled in function enable_IR_x2apic()Jiang Liu
Local variable x2apic_enabled has been assigned to but never referred, so kill it. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-6-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Panic if kernel doesn't support x2apic but BIOS has enabled x2apicJiang Liu
When kernel doesn't support X2APIC but BIOS has enabled X2APIC, system may panic or hang without useful messages. On the other hand, it's hard to dynamically disable X2APIC when CONFIG_X86_X2APIC is disabled. So panic with a clear message in such a case. Now system panics as below when X2APIC is disabled and interrupt remapping is enabled: [ 0.316118] LAPIC pending interrupts after 512 EOI [ 0.322126] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.368655] Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC [ 0.378300] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0+ #340 [ 0.385300] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0051.L05.1406240953 06/24/2014 [ 0.396997] ffff88046dc03000 ffff88046c307dd8 ffffffff8179dada 00000000000043f2 [ 0.405629] ffffffff81a92158 ffff88046c307e58 ffffffff8179b757 0000000000000002 [ 0.414261] 0000000000000008 ffff88046c307e68 ffff88046c307e08 ffffffff813ad82b [ 0.422890] Call Trace: [ 0.425711] [<ffffffff8179dada>] dump_stack+0x45/0x57 [ 0.431533] [<ffffffff8179b757>] panic+0xc1/0x1f5 [ 0.436978] [<ffffffff813ad82b>] ? delay_tsc+0x3b/0x70 [ 0.442910] [<ffffffff8166fa2c>] panic_if_irq_remap+0x1c/0x20 [ 0.449524] [<ffffffff81d73645>] setup_IO_APIC+0x405/0x82e [ 0.464979] [<ffffffff81d6fcc2>] native_smp_prepare_cpus+0x2d9/0x31c [ 0.472274] [<ffffffff81d5d0ac>] kernel_init_freeable+0xd6/0x223 [ 0.479170] [<ffffffff81792ad0>] ? rest_init+0x80/0x80 [ 0.485099] [<ffffffff81792ade>] kernel_init+0xe/0xf0 [ 0.490932] [<ffffffff817a537c>] ret_from_fork+0x7c/0xb0 [ 0.497054] [<ffffffff81792ad0>] ? rest_init+0x80/0x80 [ 0.502983] ---[ end Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC System hangs as below when X2APIC and interrupt remapping are both disabled: [ 1.102782] pci 0000:00:02.0: System wakeup disabled by ACPI [ 1.109351] pci 0000:00:03.0: System wakeup disabled by ACPI [ 1.115915] pci 0000:00:03.2: System wakeup disabled by ACPI [ 1.122479] pci 0000:00:03.3: System wakeup disabled by ACPI [ 1.132274] pci 0000:00:1c.0: Enabling MPC IRBNCE [ 1.137620] pci 0000:00:1c.0: Intel PCH root port ACS workaround enabled [ 1.145239] pci 0000:00:1c.0: System wakeup disabled by ACPI [ 1.151790] pci 0000:00:1c.7: Enabling MPC IRBNCE [ 1.157128] pci 0000:00:1c.7: Intel PCH root port ACS workaround enabled [ 1.164748] pci 0000:00:1c.7: System wakeup disabled by ACPI [ 1.171447] pci 0000:00:1e.0: System wakeup disabled by ACPI [ 1.178612] acpiphp: Slot [8] registered [ 1.183095] pci 0000:00:02.0: PCI bridge to [bus 01] [ 1.188867] acpiphp: Slot [2] registered With this patch applied, the system panics in both cases with a proper panic message. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-5-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Clear stale x2apic modeThomas Gleixner
If x2apic got disabled on the kernel command line, then the following issue can happen: enable_IR_x2apic() .... x2apic_mode = 1; enable_x2apic(); if (x2apic_disabled) { __disable_x2apic(); return; } That leaves X2APIC disabled in hardware, but x2apic_mode stays 1. So all other code which checks x2apic_mode gets the wrong information. Set x2apic_mode to 0 after disabling it in hardware. This is just a hotfix. The proper solution is to rework this code so it has seperate functions for the initial setup on the boot processor and the secondary cpus, but that's beyond the scope of this fix. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com>
2015-01-15iommu/vt-d: Convert allocations to GFP_KERNELThomas Gleixner
No reason anymore to do GFP_ATOMIC allocations which are not harmful in the normal bootup case, but matter in the physical hotplug scenario. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Acked-and-tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20141205084147.472428339@linutronix.de Link: http://lkml.kernel.org/r/1420615903-28253-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15iommu/vt-d: Move iommu preparatory allocations to irq_remap_ops.prepareThomas Gleixner
The whole iommu setup for irq remapping is a convoluted mess. The iommu detect function gets called from mem_init() and the prepare callback gets called from enable_IR_x2apic() for unknown reasons. Of course AMD and Intel setup differs in nonsensical ways. Intels prepare callback is explicit while AMDs prepare callback is implicit in setup_irq_remapping_ops() just to be called in the prepare call again. Because all of this gets called from enable_IR_x2apic() and the dmar prepare function merily parses the ACPI tables, but does not allocate memory we end up with memory allocation from irq disabled context later on. AMDs iommu code at least allocates the required memory from the prepare function. That has issues as well, but thats not scope of this patch. The goal of this change is to distangle the allocation from the actual enablement. There is no point to allocate memory from irq disabled regions with GFP_ATOMIC just because it does not matter at that point in the boot stage. It matters with physical hotplug later on. There is another issue with the current setup. Due to the conversion to stacked irqdomains we end up with a call into the irqdomain allocation code from irq disabled context, but that code does GFP_KERNEL allocations rightfully as there is no reason to do preperatory allocations with GFP_ATOMIC. That change caused the allocator code to complain about GFP_KERNEL allocations invoked in atomic context. Boris provided a temporary hackaround which changed the GFP flags if irq_domain_add() got called from atomic context. Not pretty and we really dont want to get this into a mainline release for obvious reasons. Move the ACPI table parsing and the resulting memory allocations from the enable to the prepare function. That allows to get rid of the horrible hackaround in irq_domain_add() later. [Jiang] Rebased onto v3.19 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Acked-and-tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20141205084147.313026156@linutronix.de Link: http://lkml.kernel.org/r/1420615903-28253-3-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15iommu, x86: Restructure setup of the irq remapping featureThomas Gleixner
enable_IR_x2apic() calls setup_irq_remapping_ops() which by default installs the intel dmar remapping ops and then calls the amd iommu irq remapping prepare callback to figure out whether we are running on an AMD machine with irq remapping hardware. Right after that it calls irq_remapping_prepare() which pointlessly checks: if (!remap_ops || !remap_ops->prepare) return -ENODEV; and then calls remap_ops->prepare() which is silly in the AMD case as it got called from setup_irq_remapping_ops() already a few microseconds ago. Simplify this and just collapse everything into irq_remapping_prepare(). The irq_remapping_prepare() remains still silly as it assigns blindly the intel ops, but that's not scope of this patch. The scope here is to move the preperatory work, i.e. memory allocations out of the atomic section which is required to enable irq remapping. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Acked-and-tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20141205084147.232633738@linutronix.de Link: http://lkml.kernel.org/r/1420615903-28253-2-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15s390/bpf: Zero extend parameters before calling C functionMichael Holzheu
The s390x ABI requires to zero extend parameters before functions are called. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-15s390/bpf: Fix sk_load_byte_msh()Michael Holzheu
In sk_load_byte_msh() sk_load_byte_slow() is called instead of sk_load_byte_msh_slow(). Fix this and call the correct function. Besides of this load only one byte instead of two and fix the comment. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-15s390/bpf: Fix offset parameter for skb_copy_bits()Michael Holzheu
Currently the offset parameter for skb_copy_bits is changed in sk_load_word() and sk_load_half(). Therefore it is not correct when calling skb_copy_bits(). Fix this and use the original offset for the function call. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-15s390/bpf: Fix skb_copy_bits() parameter passingMichael Holzheu
The skb_copy_bits() function has the following signature: int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len) Currently in bpf_jit.S the "to" and "len" parameters have been exchanged. So fix this and call the function with the correct parameters. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-15s390/bpf: Fix JMP_JGE_K (A >= K) and JMP_JGT_K (A > K)Michael Holzheu
Currently the signed COMPARE HALFWORD IMMEDIATE (chi) and COMPARE (c) instructions are used to compare "A" with "K". This is not correct because "A" and "K" are both unsigned. To fix this remove the chi instruction (no unsigned analogon available) and use the unsigned COMPARE LOGICAL (cl) instruction instead of COMPARE (c). Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-15be2net: Allow GRE to work concurrently while a VxLAN tunnel is configuredSriharsha Basavapatna
Other tunnels like GRE break while VxLAN offloads are enabled in Skyhawk-R. To avoid this, we should restrict offload features on a per-packet basis in such conditions. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge branch 'thermal-soc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal fixes from Zhang Rui: "Specifics: - bogus type qualifier fix in OF thermal code. - Minor fixes on imx and rcar thermal drivers. - Update TI SoC thermal maintainer entry. - Updated documentation of OF cpufreq cooling register" * 'thermal-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: rcar: Spelling/grammar: s/drier use .../driver uses ...s/ thermal: rcar: change type of ctemp in rcar_thermal_update_temp() thermal: rcar: fix ENR register value Documentation: thermal: document of_cpufreq_cooling_register() Thermal: imx: add clk disable/enable for suspend/resume MAINTAINERS: update ti-soc-thermal status MAINTAINERS: Add linux-omap to list of reviewers for TI Thermal thermal: of: Remove bogus type qualifier for of_thermal_get_trip_points()
2015-01-14Merge tag 'fixes-for-v3.19-rc6' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: usb: fixes for v3.19-rc6 The final set of fixes for v3.19. Two of the fixes are related to dwc3 scatter/gather implementation when we have more requests queued than available TRBs, while the other is a build fix for mv-usb PHY. Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-01-14Merge tag 'usb-serial-3.19-rc5' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for v3.18-rc5 Here are a few fixes for reported problems including a possible null-deref on probe with keyspan, a misbehaving modem, and a couple of issues with the USB console. Some new device IDs are also added. Signed-off-by: Johan Hovold <johan@kernel.org>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Don't use uninitialized data in IPVS, from Dan Carpenter. 2) conntrack race fixes from Pablo Neira Ayuso. 3) Fix TX hangs with i40e, from Jesse Brandeburg. 4) Fix budget return from poll calls in dnet and alx, from Eric Dumazet. 5) Fix bugus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph Jaeger. 6) Fix bug introduced by conversion to list_head in TIPC retransmit code, from Jon Paul Maloy. 7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey Khoroshilov. 8) Fix bridge build with INET disabled, from Arnd Bergmann. 9) Fix netlink array overrun for PROBE attributes in openvswitch, from Thomas Graf. 10) Don't hold spinlock across synchronize_irq() in tg3 driver, from Prashant Sreedharan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) tg3: Release tp->lock before invoking synchronize_irq() tg3: tg3_reset_task() needs to use rtnl_lock to synchronize tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin openvswitch: packet messages need their own probe attribtue i40e: adds FCoE configure option cxgb4vf: Fix queue allocation for 40G adapter netdevice: Add missing parentheses in macro bridge: only provide proxy ARP when CONFIG_INET is enabled neighbour: fix base_reachable_time(_ms) not effective immediatly when changed net: fec: fix MDIO bus assignement for dual fec SoC's xen-netfront: use different locks for Rx and Tx stats drivers: net: cpsw: fix multicast flush in dual emac mode cxgb4vf: Initialize mdio_addr before using it net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb() MAINTAINERS: add me as ibmveth maintainer tipc: fix bug in broadcast retransmit code update ip-sysctl.txt documentation (v2) net/at91_ether: prepare and unprepare clock ...
2015-01-14Merge branch 'tg3-net'David S. Miller
Prashant Sreedharan says: ==================== tg3: synchronize_irq() should be called without taking locks v2: Added Reported-by, Tested-by fields and reference to the thread that reported the problem This series addresses the problem reported by Peter Hurley in mail thread https://lkml.org/lkml/2015/1/12/1082 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14tg3: Release tp->lock before invoking synchronize_irq()Prashant Sreedharan
synchronize_irq() can sleep waiting, for pending IRQ handlers so driver should release the tp->lock spin lock before invoking synchronize_irq() Reported-by: Peter Hurley <peter@hurleysoftware.com> Tested-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14tg3: tg3_reset_task() needs to use rtnl_lock to synchronizePrashant Sreedharan
Currently tg3_reset_task() uses only tp->lock for synchronizing with code paths like tg3_open() etc. But since tp->lock is released before doing synchronize_irq(), rtnl_lock should be taken in tg3_reset_task() to synchronize it with other code paths. Reported-by: Peter Hurley <peter@hurleysoftware.com> Tested-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14tg3: tg3_timer() should grab tp->lock before checking for tp->irq_syncPrashant Sreedharan
This is to avoid the race between tg3_timer() and the execution paths which does not invoke tg3_timer_stop() and releases tp->lock before calling synchronize_irq() Reported-by: Peter Hurley <peter@hurleysoftware.com> Tested-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "Two bugfixes for arm64. I will have another pull request next week, but otherwise things are calm" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: arm64: KVM: Fix HCR setting for 32bit guests arm64: KVM: Fix TLB invalidation by IPA/VMID
2015-01-14team: avoid possible underflow of count_pending value for notify_peers and ↵Jiri Pirko
mcast_rejoin This patch is fixing a race condition that may cause setting count_pending to -1, which results in unwanted big bulk of arp messages (in case of "notify peers"). Consider following scenario: count_pending == 2 CPU0 CPU1 team_notify_peers_work atomic_dec_and_test (dec count_pending to 1) schedule_delayed_work team_notify_peers atomic_add (adding 1 to count_pending) team_notify_peers_work atomic_dec_and_test (dec count_pending to 1) schedule_delayed_work team_notify_peers_work atomic_dec_and_test (dec count_pending to 0) schedule_delayed_work team_notify_peers_work atomic_dec_and_test (dec count_pending to -1) Fix this race by using atomic_dec_if_positive - that will prevent count_pending running under 0. Fixes: fc423ff00df3a1955441 ("team: add peer notification") Fixes: 492b200efdd20b8fcfd ("team: add support for sending multicast rejoins") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two small performance tweaks, the plumbing for the execveat system call and a couple of bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/uprobes: fix user space PER events s390/bpf: Fix JMP_JGE_X (A > X) and JMP_JGT_X (A >= X) s390/bpf: Fix ALU_NEG (A = -A) s390/mm: avoid using pmd_to_page for !USE_SPLIT_PMD_PTLOCKS s390/timex: fix get_tod_clock_ext() inline assembly s390: wire up execveat syscall s390/kernel: use stnsm 255 instead of stosm 0 s390/vtime: Get rid of redundant WARN_ON s390/zcrypt: kernel oops at insmod of the z90crypt device driver
2015-01-14openvswitch: packet messages need their own probe attribtueThomas Graf
User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow and packet messages. This leads to an out-of-bounds access in ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE > OVS_PACKET_ATTR_MAX. Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes while maintaining to be binary compatible with existing OVS binaries. Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tracked-down-by: Florian Westphal <fw@strlen.de> Signed-off-by: Thomas Graf <tgraf@suug.ch> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14i40e: adds FCoE configure optionVasu Dev
Adds FCoE config option I40E_FCOE, so that FCoE can be enabled as needed but otherwise have it disabled by default. This also eliminate multiple FCoE config checks, instead now just one config check for CONFIG_I40E_FCOE. The I40E FCoE was added with 3.17 kernel and therefore this patch shall be applied to stable 3.17 kernel also. CC: <stable@vger.kernel.org> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14cxgb4vf: Fix queue allocation for 40G adapterHariprasad Shenai
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge tag 'locks-v3.19-1' of git://git.samba.org/jlayton/linuxLinus Torvalds
Pull file locking fix from Jeff Layton: "Just a simple bugfix for a regression that I introduced into v3.18 with the internal lease API overhaul -- mea culpa. Kudos to Linda and Neil for tracking this down and fixing it" * tag 'locks-v3.19-1' of git://git.samba.org/jlayton/linux: locks: fix NULL-deref in generic_delete_lease
2015-01-14netdevice: Add missing parentheses in macroBenjamin Poirier
For example, one could conceivably call for_each_netdev_in_bond_rcu(condition ? bond1 : bond2, slave) and get an unexpected result. Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer fixes from Jens Axboe: "The major part is an update to the NVMe driver, fixing various issues around surprise removal and hung controllers. Most of that is from Keith, and parts are simple blk-mq fixes or exports/additions of minor functions to aid this effort, and parts are changes directly to the NVMe driver. Apart from the above, this contains: - Small blk-mq change from me, killing an unused member of the hardware queue structure. - Small fix from Ming Lei, fixing up a few drivers that didn't properly check for ERR_PTR() returns from blk_mq_init_queue()" * 'for-linus' of git://git.kernel.dk/linux-block: NVMe: Fix locking on abort handling NVMe: Start and stop h/w queues on reset NVMe: Command abort handling fixes NVMe: Admin queue removal handling NVMe: Reference count admin queue usage NVMe: Start all requests blk-mq: End unstarted requests on a dying queue blk-mq: Allow requests to never expire blk-mq: Add helper to abort requeued requests blk-mq: Let drivers cancel requeue_work blk-mq: Export if requests were started blk-mq: Wake tasks entering queue on dying blk-mq: get rid of ->cmd_size in the hardware queue block: fix checking return value of blk_mq_init_queue block: wake up waiters when a queue is marked dying NVMe: Fix double free irq blk-mq: Export freeze/unfreeze functions blk-mq: Exit queue on alloc failure
2015-01-14ksoftirqd: Use new cond_resched_rcu_qs() functionPaul E. McKenney
Simplify run_ksoftirqd() by using the new cond_resched_rcu_qs() function that conditionally reschedules, but unconditionally supplies an RCU quiescent state. This commit is separate from the previous commit by Calvin Owens because Calvin's approach can be backported, while this commit cannot be. The reason that this commit cannot be backported is that cond_resched_rcu_qs() does not always provide the needed quiescent state in earlier kernels. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14ksoftirqd: Enable IRQs and call cond_resched() before poking RCUCalvin Owens
While debugging an issue with excessive softirq usage, I encountered the following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread infrastructure"): [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ] ...but despite this note, the patch still calls RCU with IRQs disabled. This seemingly innocuous change caused a significant regression in softirq CPU usage on the sending side of a large TCP transfer (~1 GB/s): when introducing 0.01% packet loss, the softirq usage would jump to around 25%, spiking as high as 50%. Before the change, the usage would never exceed 5%. Moving the call to rcu_note_context_switch() after the cond_sched() call, as it was originally before the hotplug patch, completely eliminated this problem. Signed-off-by: Calvin Owens <calvinowens@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14bridge: only provide proxy ARP when CONFIG_INET is enabledArnd Bergmann
When IPV4 support is disabled, we cannot call arp_send from the bridge code, which would result in a kernel link error: net/built-in.o: In function `br_handle_frame_finish': :(.text+0x59914): undefined reference to `arp_send' :(.text+0x59a50): undefined reference to `arp_tbl' This makes the newly added proxy ARP support in the bridge code depend on the CONFIG_INET symbol and lets the compiler optimize the code out to avoid the link error. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") Cc: Kyeyoon Park <kyeyoonp@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14spi: st-ssc4: Remove duplicate code to test unsupported mode bitsAxel Lin
spi_setup() will test unsupported mode bits before calling spi->master->setup. Thus remove duplicate code to test unsupported mode bits in spi_st_setup(). Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-01-14regulator: Update documentation after renaming function argumentKrzysztof Kozlowski
Update documentation for regulator_register() function after renaming its argument. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-01-14ASoC: wm8904: fix runtime warningBo Shen
Correct the path name for mux to get rid of the following warning: --->8--- wm8904 1-001a: Control not supported for path ADCL -> [Left] -> AIFOUTL wm8904 1-001a: ASoC: no dapm match for ADCL --> Left --> AIFOUTL wm8904 1-001a: ASoC: Failed to add route ADCL -> Left -> AIFOUTL wm8904 1-001a: Control not supported for path ADCR -> [Right] -> AIFOUTL wm8904 1-001a: ASoC: no dapm match for ADCR --> Right --> AIFOUTL wm8904 1-001a: ASoC: Failed to add route ADCR -> Right -> AIFOUTL wm8904 1-001a: Control not supported for path ADCL -> [Left] -> AIFOUTR wm8904 1-001a: ASoC: no dapm match for ADCL --> Left --> AIFOUTR wm8904 1-001a: ASoC: Failed to add route ADCL -> Left -> AIFOUTR wm8904 1-001a: Control not supported for path ADCR -> [Right] -> AIFOUTR wm8904 1-001a: ASoC: no dapm match for ADCR --> Right --> AIFOUTR wm8904 1-001a: ASoC: Failed to add route ADCR -> Right -> AIFOUTR wm8904 1-001a: Control not supported for path AIFINR -> [Right] -> DACL wm8904 1-001a: ASoC: no dapm match for AIFINR --> Right --> DACL wm8904 1-001a: ASoC: Failed to add route AIFINR -> Right -> DACL wm8904 1-001a: Control not supported for path AIFINL -> [Left] -> DACL wm8904 1-001a: ASoC: no dapm match for AIFINL --> Left --> DACL wm8904 1-001a: ASoC: Failed to add route AIFINL -> Left -> DACL wm8904 1-001a: Control not supported for path AIFINR -> [Right] -> DACR wm8904 1-001a: ASoC: no dapm match for AIFINR --> Right --> DACR wm8904 1-001a: ASoC: Failed to add route AIFINR -> Right -> DACR wm8904 1-001a: Control not supported for path AIFINL -> [Left] -> DACR wm8904 1-001a: ASoC: no dapm match for AIFINL --> Left --> DACR wm8904 1-001a: ASoC: Failed to add route AIFINL -> Left -> DACR ---8<--- Signed-off-by: Bo Shen <voice.shen@atmel.com> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-01-14usb: dwc3: gadget: Stop TRB preparation after limit is reachedAmit Virdi
DWC3 gadget sets up a pool of 32 TRBs for each EP during initialization. This means, the max TRBs that can be submitted for an EP is fixed to 32. Since the request queue for an EP is a linked list, any number of requests can be queued to it by the gadget layer. However, the dwc3 driver must not submit TRBs more than the pool it has created for. This limit wasn't respected when SG was used resulting in submitting more than the max TRBs, eventually leading to non-transfer of the TRBs submitted over the max limit. Root cause: When SG is used, there are two loops iterating to prepare TRBs: - Outer loop over the request_list - Inner loop over the SG list The code was missing break to get out of the outer loop. Fixes: eeb720fb21d6 (usb: dwc3: gadget: add support for SG lists) Cc: <stable@vger.kernel.org> # v3.9+ Signed-off-by: Amit Virdi <amit.virdi@st.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-01-14usb: dwc3: gadget: Fix TRB preparation during SGAmit Virdi
When scatter gather (SG) is used, multiple TRBs are prepared from one DWC3 request (dwc3_request). So while preparing TRBs, the 'last' flag should be set only when it is the last TRB being prepared from the last dwc3_request entry. The current implementation uses list_is_last to check if the dwc3_request is the last entry from the request_list. However, list_is_last returns false for the last entry too. This is because, while preparing the first TRB from a request, the function dwc3_prepare_one_trb modifies the request's next and prev pointers while moving the URB to req_queued. Hence, list_is_last always returns false no matter what. The correct way is not to access the modified pointers of dwc3_request but to use list_empty macro instead. Fixes: e5ba5ec833aa (usb: dwc3: gadget: fix scatter gather implementation) Signed-off-by: Amit Virdi <amit.virdi@st.com> Cc: <stable@vger.kernel.org> # v3.9+ Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-01-14spi: orion: Change spi-orion to use transfer_one() semantics for SPI transfersKen Wilson
This commit changes spi-orion to provide setup, set_cs, and transfer_one functions instead of transfer_one_message. This allows chip select support for both native and GPIO chip selects to be added. Signed-off-by: Ken Wilson <ken.wilson@opengear.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-01-14ASoC: simple-card: Fix crash in asoc_simple_card_unref()Geert Uytterhoeven
If asoc_simple_card_probe() fails, asoc_simple_card_unref() may be called before dev_set_drvdata(), causing a NULL pointer dereference in asoc_simple_card_unref(): Unable to handle kernel NULL pointer dereference at virtual address 000000d4 ... PC is at asoc_simple_card_unref+0x14/0x48 LR is at asoc_simple_card_probe+0x3d4/0x40c This typically happens because asoc_simple_card_parse_of() returns -EPROBE_DEFER, but other failure modes are possible. devm_snd_soc_register_card()/snd_soc_register_card() may fail either before or after dev_set_drvdata(). Pass a snd_soc_card pointer instead of a platform_device pointer to asoc_simple_card_unref() to fix this. Note that if CONFIG_OF_DYNAMIC=n, of_node_put() is a dummy, and gcc may optimize away the loop over card->dai_link, never actually dereferencing card, and thus avoiding the crash... Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: e512e001dafa54e5 ("ASoC: simple-card: Fix the reference count of device nodes") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-01-14ASoC: fsl: imx-wm8962: Set the card owner fieldFabio Estevam
The following crash happens when trying to unload the snd_soc_imx_wm8962 module while playback is active: [ 208.666868] Unable to handle kernel paging request at virtc [ 208.674110] pgd = 80004000 [ 208.676867] [7f06541c] *pgd=4c334811, *pte=00000000, *ppte=00000000 [ 208.683211] Internal error: Oops: 80000007 [#1] SMP ARM [ 208.688445] Modules linked in: snd_soc_wm8962 snd_soc_fsl_ssi snd_soc_imx_audmux imx_pcm_fiq evbug] ... In order to avoid such problem, fill the card owner field as suggested by Lars-Peter Clausen: "But looking at the source it seems that this is a core feature of ALSA and at least for the card module itself it will do the ref-counting when a stream is started/stopped. And we even support setting the owner of a card in ASoC. It's just that pretty much no ASoC card driver bothers to set the owner field in the snd_soc_card struct. So this particular problem can be fixed by updating the imx-wm8962 driver to set the owner field." By doing as suggested, we no longer see the crash when attempting to unload the snd_soc_imx_wm8962 module while playback is active: $ modprobe -r snd_soc_imx_wm8962 modprobe: can't unload module snd_soc_imx_wm8962: Resource temporarily unavailable Reported-by: Jiada Wang <jiada_wang@mentor.com> Suggested-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-01-14softirq/preempt: Add missing current->preempt_disable_ip updateHeiko Carstens
While debugging some "sleeping function called from invalid context" bug I realized that the debugging message "Preemption disabled at:" pointed to an incorrect function. In particular if the last function/action that disabled preemption was spin_lock_bh() then current->preempt_disable_ip won't be updated. The reason for this is that __local_bh_disable_ip() will increase preempt_count manually instead of calling preempt_count_add(), which would handle the update correctly. It look like the manual handling was done to work around some lockdep issue. So add the missing update of current->preempt_disable_ip to __local_bh_disable_ip() as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150107090441.GC4365@osiris Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14locking/osq: No need for load/acquire when acquire-pollingDavidlohr Bueso
Both mutexes and rwsems took a performance hit when we switched over from the original mcs code to the cancelable variant (osq). The reason being the use of smp_load_acquire() when polling for node->locked. This is not needed as reordering is not an issue, as such, relax the barrier semantics. Paul describes the scenario nicely: https://lkml.org/lkml/2013/11/19/405 - If we start polling before the insertion is complete, all that happens is that the first few polls have no chance of seeing a lock grant. - Ordering the polling against the initialization -- the above xchg() is already doing that for us. The smp_load_acquire() when unqueuing make sense. In addition, we don't need to worry about leaking the critical region as osq is only used internally. This impacts both regular and large levels of concurrency, ie on a 40 core system with a disk intensive workload: disk-1 804.83 ( 0.00%) 828.16 ( 2.90%) disk-61 8063.45 ( 0.00%) 18181.82 (125.48%) disk-121 7187.41 ( 0.00%) 20119.17 (179.92%) disk-181 6933.32 ( 0.00%) 20509.91 (195.82%) disk-241 6850.81 ( 0.00%) 20397.80 (197.74%) disk-301 6815.22 ( 0.00%) 20287.58 (197.68%) disk-361 7080.40 ( 0.00%) 20205.22 (185.37%) disk-421 7076.13 ( 0.00%) 19957.33 (182.04%) disk-481 7083.25 ( 0.00%) 19784.06 (179.31%) disk-541 7038.39 ( 0.00%) 19610.92 (178.63%) disk-601 7072.04 ( 0.00%) 19464.53 (175.23%) disk-661 7010.97 ( 0.00%) 19348.23 (175.97%) disk-721 7069.44 ( 0.00%) 19255.33 (172.37%) disk-781 7007.58 ( 0.00%) 19103.14 (172.61%) disk-841 6981.18 ( 0.00%) 18964.22 (171.65%) disk-901 6968.47 ( 0.00%) 18826.72 (170.17%) disk-961 6964.61 ( 0.00%) 18708.02 (168.62%) Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1420573509-24774-7-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14perf: Avoid horrible stack usagePeter Zijlstra (Intel)
Both Linus (most recent) and Steve (a while ago) reported that perf related callbacks have massive stack bloat. The problem is that software events need a pt_regs in order to properly report the event location and unwind stack. And because we could not assume one was present we allocated one on stack and filled it with minimal bits required for operation. Now, pt_regs is quite large, so this is undesirable. Furthermore it turns out that most sites actually have a pt_regs pointer available, making this even more onerous, as the stack space is pointless waste. This patch addresses the problem by observing that software events have well defined nesting semantics, therefore we can use static per-cpu storage instead of on-stack. Linus made the further observation that all but the scheduler callers of perf_sw_event() have a pt_regs available, so we change the regular perf_sw_event() to require a valid pt_regs (where it used to be optional) and add perf_sw_event_sched() for the scheduler. We have a scheduler specific call instead of a more generic _noregs() like construct because we can assume non-recursion from the scheduler and thereby simplify the code further (_noregs would have to put the recursion context call inline in order to assertain which __perf_regs element to use). One last note on the implementation of perf_trace_buf_prepare(); we allow .regs = NULL for those cases where we already have a pt_regs pointer available and do not need another. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Javi Merino <javi.merino@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Link: http://lkml.kernel.org/r/20141216115041.GW3337@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14locking/mcs: Better differentiate between MCS variantsDavidlohr Bueso
We have two flavors of the MCS spinlock: standard and cancelable (OSQ). While each one is independent of the other, we currently mix and match them. This patch: - Moves the OSQ code out of mcs_spinlock.h (which only deals with the traditional version) into include/linux/osq_lock.h. No unnecessary code is added to the more global header file, anything locks that make use of OSQ must include it anyway. - Renames mcs_spinlock.c to osq_lock.c. This file only contains osq code. - Introduces a CONFIG_LOCK_SPIN_ON_OWNER in order to only build osq_lock if there is support for it. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/1420573509-24774-5-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>