summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2013-10-08x86: mkpiggy.c: Explicitly close the output fileGeyslan G. Bem
Even though the resource is released when the application is closed or when returned from main function, modify the code to make it obvious, and to keep static analysis tools from complaining. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Link: http://lkml.kernel.org/r/1381184219-10985-1-git-send-email-geyslan@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-08Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixlets: On the kernel side: - fix a race - fix a bug in the handling of the perf ring-buffer data page On the tooling side: - fix the handling of certain corrupted perf.data files - fix a bug in 'perf probe' - fix a bug in 'perf record + perf sched' - fix a bug in 'make install' - fix a bug in libaudit feature-detection on certain distros" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf session: Fix infinite loop on invalid perf.data file perf tools: Fix installation of libexec components perf probe: Fix to find line information for probe list perf tools: Fix libaudit test perf stat: Set child_pid after perf_evlist__prepare_workload() perf tools: Add default handler for mmap2 events perf/x86: Clean up cap_user_time* setting perf: Fix perf_pmu_migrate_context
2013-10-07net: fix unsafe set_memory_rw from softirqAlexei Starovoitov
on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] *** DEADLOCK *** [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-07crypto: sha256_ssse3 - also test for BMI2Oliver Neukum
The AVX2 implementation also uses BMI2 instructions, but doesn't test for their availability. The assumption that AVX2 and BMI2 always go together is false. Some Haswells have AVX2 but not BMI2. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-10-06x86/iommu: Clean up the CONFIG_GART_IOMMU config option a bitIngo Molnar
Improve the explanation of this config option. Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/n/tip-gUMmysvsbl3mccbyf6olmxqg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-06x86/reboot: Add reboot quirk for Dell Latitude E5410Ville Syrjälä
Dell Latitude E5410 needs reboot=pci to actually reboot. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://lkml.kernel.org/r/1380888964-14517-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-06x86/iommu: Don't make AMD_GART depend on EXPERT and default yAndi Kleen
The AMD_GART driver was made EXPERT/EMBEDDED a long time ago to avoid unbootable 64bit systems with 32bit only devices. This was before swiotlb was there, which does the job of this fallback today. SWIOTLB is always on, so systems should always boot. The drawback is that every system has to compile that driver in (it cannot be a module). Also: - Newer AMD CPUs (the APUs) don't seem to have AMD_GART support at all anymore. - Newer AMD platforms have a much better real IOMMU - The AMD GART driver was never very good (lots of overhead, e.g. in flushing due to some workarounds) and it's doubtful it's really better than SWIOTLB. - On older K8 systems it didn't even work with all chipsets. - The 32bit device bounce buffer case should be rare/ non performance critical these days anyways. - On non AMD systems it is not needed at all. So drop the EXPERT dependency on AMD_GART and remove the default y. The driver can be still compiled in, just it's an explicit decision now, and people who don't want it can unselect it. I also clarified the description a bit. This allows to save ~8K text on most modern x86-64 systems. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1380922676-23007-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04Merge tag 'pci-v3.12-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "We merged what was intended to be an MMCONFIG cleanup, but in fact, for systems without _CBA (which is almost everything), it broke extended config space for domain 0 and it broke all config space for other domains. This reverts the change" * tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
2013-10-04Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"Bjorn Helgaas
This reverts commit 07f9b61c3915e8eb156cb4461b3946736356ad02. 07f9b61c was intended to be a cleanup that didn't change anything, but in fact, for systems without _CBA (which is almost everything), it broke extended config space for domain 0 and all config space for other domains. Reference: http://lkml.kernel.org/r/20131004011806.GE20450@dangermouse.emea.sgi.com Reported-by: Hedi Berriche <hedi@sgi.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-04x86/efi: Fix config_table_type array terminationLeif Lindholm
Incorrect use of 0 in terminating entry of arch_tables[] causes the following sparse warning, arch/x86/platform/efi/efi.c:74:27: sparse: Using plain integer as NULL pointer Replace with NULL. Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org> [ Included sparse warning in commit message. ] Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-10-04x86, build, pci: Fix PCI_MSI build on !SMPThomas Petazzoni
Commit ebd97be635 ('PCI: remove ARCH_SUPPORTS_MSI kconfig option') removed the ARCH_SUPPORTS_MSI option which architectures could select to indicate that they support MSI. Now, all architectures are supposed to build fine when MSI support is enabled: instead of having the architecture tell *when* MSI support can be used, it's up to the architecture code to ensure that MSI support can be enabled. On x86, commit ebd97be635 removed the following line: select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC) Which meant that MSI support was only available when the local APIC and I/O APIC were enabled. While this is always true on SMP or x86-64, it is not necessarily the case on i386 !SMP. The below patch makes sure that the local APIC and I/O APIC support is always enabled when MSI support is enabled. To do so, it: * Ensures the X86_UP_APIC option is not visible when PCI_MSI is enabled. This is the option that allows, on UP machines, to enable or not the APIC support. It is already not visible on SMP systems, or x86-64 systems, for example. We're simply also making it invisible on i386 MSI systems. * Ensures that the X86_LOCAL_APIC and X86_IO_APIC options are 'y' when PCI_MSI is enabled. Notice that this change requires a change in drivers/iommu/Kconfig to avoid a recursive Kconfig dependencey. The AMD_IOMMU option selects PCI_MSI, but was depending on X86_IO_APIC. This dependency is no longer needed: as soon as PCI_MSI is selected, the presence of X86_IO_APIC is guaranteed. Moreover, the AMD_IOMMU already depended on X86_64, which already guaranteed that X86_IO_APIC was enabled, so this dependency was anyway redundant. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Link: http://lkml.kernel.org/r/1380794354-9079-1-git-send-email-thomas.petazzoni@free-electrons.com Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-10-04Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two simplefb fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/simplefb: Mark framebuffer mem-resources as IORESOURCE_BUSY to avoid bootup warning x86/simplefb: Fix overflow causing bogus fall-back
2013-10-04perf/x86: Suppress duplicated abort LBR recordsAndi Kleen
Haswell always give an extra LBR record after every TSX abort. Suppress the extra record. This only works when the abort is visible in the LBR If the original abort has already left the 16 LBR entries the extra entry will will stay. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-7-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04perf/x86: Add Haswell specific transaction flag reportingAndi Kleen
In the PEBS handler report the transaction flags using the new generic transaction flags facility. Most of them come from the "tsx_tuning" field in PEBSv2, but the abort code is derived from the RAX register reported in the PEBS record. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04Merge branch 'perf/urgent' into perf/coreIngo Molnar
Pick up the latest fixes before applying new patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04perf/x86: Clean up cap_user_time* settingPeter Zijlstra
Currently the cap_user_time_zero capability has different tests than cap_user_time; even though they expose the exact same data. Switch from CONSTANT && NONSTOP to sched_clock_stable to also deal with multi cabinet machines and drop the tsc_disabled() check.. non of this will work sanely without tsc anyway. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-nmgn0j0muo1r4c94vlfh23xy@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-03x86/UV: Add call to KGDB/KDB from NMI handlerMike Travis
This patch restores the capability to enter KDB (and KGDB) from the UV NMI handler. This is needed because the UV system console is not capable of sending the 'break' signal to the serial console port. It is also useful when the kernel is hung in such a way that it isn't responding to normal external I/O, so sending 'g' to sysreq-trigger does not work either. Another benefit of the external NMI command is that all the cpus receive the NMI signal at roughly the same time so they are more closely aligned timewise. It utilizes the newly added kgdb_nmicallin function to gain entry to KGDB/KDB by the master. The slaves still enter via the standard kgdb_nmicallback function. It also uses the new 'send_ready' pointer to tell KGDB/KDB to signal the slaves when to proceed into the KGDB slave loop. It is enabled when the nmi action is set to "kdb" and the kernel is built with CONFIG_KDB enabled. Note that if kgdb is connected that interface will be used instead. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20131002151418.089692683@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-03KVM: mmu: change useless int return types to voidPaolo Bonzini
kvm_mmu initialization is mostly filling in function pointers, there is no way for it to fail. Clean up unused return values. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: mmu: unify destroy_kvm_mmu with kvm_mmu_unloadPaolo Bonzini
They do the same thing, and destroy_kvm_mmu can be confused with kvm_mmu_destroy. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: mmu: remove uninteresting MMU "new_cr3" callbacksPaolo Bonzini
The new_cr3 MMU callback has been a wrapper for mmu_free_roots since commit e676505 (KVM: MMU: Force cr3 reload with two dimensional paging on mov cr3 emulation, 2012-07-08). The commit message mentioned that "mmu_free_roots() is somewhat of an overkill, but fixing that is more complicated and will be done after this minimal fix". One year has passed, and no one really felt the need to do a different fix. Wrap the call with a kvm_mmu_new_cr3 function for clarity, but remove the callback. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: mmu: remove uninteresting MMU "free" callbacksPaolo Bonzini
The free MMU callback has been a wrapper for mmu_free_roots since mmu_free_roots itself was introduced (commit 17ac10a, [PATCH] KVM: MU: Special treatment for shadow pae root pages, 2007-01-05), and has always been the same for all MMU cases. Remove the indirection as it is useless. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: x86: only copy XSAVE state for the supported featuresPaolo Bonzini
This makes the interface more deterministic for userspace, which can expect (after configuring only the features it supports) to get exactly the same state from the kernel, independent of the host CPU and kernel version. Suggested-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: x86: prevent setting unsupported XSAVE statesPaolo Bonzini
A guest can still attempt to save and restore XSAVE states even if they have been masked in CPUID leaf 0Dh. This usually is not visible to the guest, but is still wrong: "Any attempt to set a reserved bit (as determined by the contents of EAX and EDX after executing CPUID with EAX=0DH, ECX= 0H) in XCR0 for a given processor will result in a #GP exception". The patch also performs the same checks as __kvm_set_xcr in KVM_SET_XSAVE. This catches migration from newer to older kernel/processor before the guest starts running. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03KVM: x86: mask unsupported XSAVE entries from leaf 0Dh index 0Paolo Bonzini
XSAVE entries that KVM does not support are reported by KVM_GET_SUPPORTED_CPUID for leaf 0Dh index 0 if the host supports them; they should be left out unless there is also hypervisor support for them. Sub-leafs are correctly handled in supported_xcr0_bit, fix index 0 to match. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03x86/simplefb: Mark framebuffer mem-resources as IORESOURCE_BUSY to avoid ↵David Herrmann
bootup warning IORESOURCE_BUSY is used to mark temporary driver mem-resources instead of global regions. This suppresses warnings if regions overlap with a region marked as BUSY. This was always the case for VESA/VGA/EFI framebuffer regions so do the same for simplefb regions. The reason we do this is to allow device handover to real GPU drivers like i915/radeon/nouveau which get the same regions via PCI BARs. Maybe at some point we will be able to unregister platform devices properly during the handover. In this case the simplefb region would get removed before the new region is created. However, this is currently not the case and would require rather huge changes in remove_conflicting_framebuffers(). Add the BUSY marker now and try to eventually rewrite the handover for a next release. Also see kernel/resource.c for more information: /* * if a resource is "BUSY", it's not a hardware resource * but a driver mapping of such a resource; we don't want * to warn for those; some drivers legitimately map only * partial hardware resources. (example: vesafb) */ This suppresses warnings like: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 199 at arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2e3/0x390() Info: mapping multiple BARs. Your kernel is fine. Call Trace: dump_stack+0x54/0x8d warn_slowpath_common+0x7d/0xa0 warn_slowpath_fmt+0x4c/0x50 iomem_map_sanity_check+0xac/0xe0 __ioremap_caller+0x2e3/0x390 ioremap_wc+0x32/0x40 i915_driver_load+0x670/0xf50 [i915] ... Reported-by: Tom Gundersen <teg@jklm.no> Tested-by: Tom Gundersen <teg@jklm.no> Tested-by: Pavel Roskin <proski@gnu.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Link: http://lkml.kernel.org/r/1380724864-1757-1-git-send-email-dh.herrmann@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-02x86/reboot: Sort reboot DMI quirks by vendorDave Jones
Grouping them by vendor should make it easier to spot duplicates. Signed-off-by: Dave Jones <davej@fedoraproject.org> Link: http://lkml.kernel.org/r/20131001203655.GA10719@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-02Merge branch 'irq/core-v6' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into irq/core Pull hardirq and softirq nesting updates from Frederic Weisbecker, which fix nesting related stack overruns such as: http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop Beyond being a fix, this series also optimizes and reorganizes arch hardirq/softirq stack processing to be faster and more robust. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-02Merge tag 'v3.12-rc3' into irq/coreIngo Molnar
Merge Linux v3.12-rc3, to refresh the tree from a v3.11 base to a v3.12 base. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-02x86/simplefb: Fix overflow causing bogus fall-backTom Gundersen
On my MacBook Air lfb_size is 4M, which makes the bitshit overflow (to 256GB - larger than 32 bits), meaning we fall back to efifb unnecessarily. Cast to u64 to avoid the overflow. Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Stephen Warren <swarren@nvidia.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Link: http://lkml.kernel.org/r/1380644320-1026-1-git-send-email-teg@jklm.no Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-01Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull two KVM fixes from Gleb Natapov. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: do not check bit 12 of EPT violation exit qualification when undefined ARM: kvm: rename cpu_reset to avoid name clash
2013-10-01x86/geode: Fix incorrect placement of __initdata tagBartlomiej Zolnierkiewicz
__initdata tag should be placed between the variable name and equal sign for the variable to be placed in the intended .init.data section. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Link: http://lkml.kernel.org/r/18427893.G5JGWn465D@amdc1032 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-01x86: Tell about irq stack coverageFrederic Weisbecker
x86-64 runs irq_exit() under the irq stack. So it can afford to run softirqs in hardirq exit without the need to switch the stacks. The hardirq stack is good enough for that. Now x86-64 runs softirqs in the hardirq stack anyway, so what we mostly skip is some needless per cpu refcounting updates there. x86-32 is not concerned because it only runs the irq handler on the irq stack. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>
2013-10-01irq: Consolidate do_softirq() arch overriden implementationsFrederic Weisbecker
All arch overriden implementations of do_softirq() share the following common code: disable irqs (to avoid races with the pending check), check if there are softirqs pending, then execute __do_softirq() on a specific stack. Consolidate the common parts such that archs only worry about the stack switch. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>
2013-10-01x86/boot: Further compress CPUs bootup messageBorislav Petkov
Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 [ 0.603005] .... node #1, CPUs: #8 #9 #10 #11 #12 #13 #14 #15 [ 1.200005] .... node #2, CPUs: #16 #17 #18 #19 #20 #21 #22 #23 [ 1.796005] .... node #3, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 [ 2.393005] .... node #4, CPUs: #32 #33 #34 #35 #36 #37 #38 #39 [ 2.996005] .... node #5, CPUs: #40 #41 #42 #43 #44 #45 #46 #47 [ 3.600005] .... node #6, CPUs: #48 #49 #50 #51 #52 #53 #54 #55 [ 4.202005] .... node #7, CPUs: #56 #57 #58 #59 #60 #61 #62 #63 [ 4.811005] .... node #8, CPUs: #64 #65 #66 #67 #68 #69 #70 #71 [ 5.421006] .... node #9, CPUs: #72 #73 #74 #75 #76 #77 #78 #79 [ 6.032005] .... node #10, CPUs: #80 #81 #82 #83 #84 #85 #86 #87 [ 6.648006] .... node #11, CPUs: #88 #89 #90 #91 #92 #93 #94 #95 [ 7.262005] .... node #12, CPUs: #96 #97 #98 #99 #100 #101 #102 #103 [ 7.865005] .... node #13, CPUs: #104 #105 #106 #107 #108 #109 #110 #111 [ 8.466005] .... node #14, CPUs: #112 #113 #114 #115 #116 #117 #118 #119 [ 9.073006] .... node #15, CPUs: #120 #121 #122 #123 #124 #125 #126 #127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-30x86 / ACPI: fix incorrect placement of __initdata tagBartlomiej Zolnierkiewicz
__initdata tag should not be placed between "struct" and "resource" because it prevents the variable from being placed in the intended .init.data section. Fix it. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30hotplug, powerpc, x86: Remove cpu_hotplug_driver_lock()Toshi Kani
cpu_hotplug_driver_lock() serializes CPU online/offline operations when ARCH_CPU_PROBE_RELEASE is set. This lock interface is no longer necessary with the following reason: - lock_device_hotplug() now protects CPU online/offline operations, including the probe & release interfaces enabled by ARCH_CPU_PROBE_RELEASE. The use of cpu_hotplug_driver_lock() is redundant. - cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE is defined, which is misleading and is only enabled on powerpc. This patch removes the cpu_hotplug_driver_lock() interface. As a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu probe & release interface as intended. There is no functional change in this patch. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30x86 efi: bugfix interrupt disabling sequenceBart Kuivenhoven
The problem in efi_main was that the idt was cleared before the interrupts were disabled. The UEFI spec states that interrupts aren't used so this shouldn't be too much of a problem. Peripherals however don't necessarily know about this and thus might cause interrupts to happen anyway. Even if ExitBootServices() has been called. This means there is a risk of an interrupt being triggered while the IDT register is nullified and the interrupt bit hasn't been cleared, allowing for a triple fault. This patch disables the interrupt flag, while leaving the existing IDT in place. The CPU won't care about the IDT at all as long as the interrupt bit is off, so it's safe to leave it in place as nothing will ever happen to it. [ Removed the now unused 'idt' variable - Matt ] Signed-off-by: Bart Kuivenhoven <bemk@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-09-30x86: EFI stub support for large memory mapsLinn Crosetto
This patch fixes a problem with EFI memory maps larger than 128 entries when booting using the EFI stub, which results in overflowing e820_map in boot_params and an eventual halt when checking the map size in sanitize_e820_map(). If the number of map entries is greater than what can fit in e820_map, add the extra entries to the setup_data list using type SETUP_E820_EXT. These extra entries are then picked up when the setup_data list is parsed in parse_e820_ext(). Signed-off-by: Linn Crosetto <linn@hp.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-09-30KVM: Convert kvm_lock back to non-raw spinlockPaolo Bonzini
In commit e935b8372cf8 ("KVM: Convert kvm_lock to raw_spinlock"), the kvm_lock was made a raw lock. However, the kvm mmu_shrink() function tries to grab the (non-raw) mmu_lock within the scope of the raw locked kvm_lock being held. This leads to the following: BUG: sleeping function called from invalid context at kernel/rtmutex.c:659 in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0 Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm] Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt Call Trace: [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160 [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50 [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm] [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0 [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260 [<ffffffff8111824a>] balance_pgdat+0x54a/0x730 [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0 [<ffffffff811185bf>] kswapd+0x18f/0x490 [<ffffffff81070961>] ? get_parent_ip+0x11/0x50 [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50 [<ffffffff81118430>] ? balance_pgdat+0x730/0x730 [<ffffffff81060d2b>] kthread+0xdb/0xe0 [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100 [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10 [<ffffffff81060c50>] ? __init_kthread_worker+0x After the previous patch, kvm_lock need not be a raw spinlock anymore, so change it back. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: kvm@vger.kernel.org Cc: gleb@redhat.com Cc: jan.kiszka@siemens.com Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2Gleb Natapov
If #PF happens during delivery of an exception into L2 and L1 also do not have the page mapped in its shadow page table then L0 needs to generate vmexit to L2 with original event in IDT_VECTORING_INFO, but current code combines both exception and generates #DF instead. Fix that by providing nVMX specific function to handle page faults during page table walk that handles this case correctly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30KVM: nVMX: Check all exceptions for intercept during delivery to L2Gleb Natapov
All exceptions should be checked for intercept during delivery to L2, but we check only #PF currently. Drop nested_run_pending while we are at it since exception cannot be injected during vmentry anyway. Signed-off-by: Gleb Natapov <gleb@redhat.com> [Renamed the nested_vmx_check_exception function. - Paolo] Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30KVM: nVMX: Do not put exception that caused vmexit to IDT_VECTORING_INFOGleb Natapov
If an exception causes vmexit directly it should not be reported in IDT_VECTORING_INFO during the exit. For that we need to be able to distinguish between exception that is injected into nested VM and one that is reinjected because its delivery failed. Fortunately we already have mechanism to do so for nested SVM, so here we just use correct function to requeue exceptions and make sure that reinjected exception is not moved to IDT_VECTORING_INFO during vmexit emulation and not re-checked for interception during delivery. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30KVM: nVMX: Amend nested_run_pending logicGleb Natapov
EXIT_REASON_VMLAUNCH/EXIT_REASON_VMRESUME exit does not mean that nested VM will actually run during next entry. Move setting nested_run_pending closer to vmentry emulation code and move its clearing close to vmexit to minimize amount of code that will erroneously run with nested_run_pending set. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-28Merge branches 'sched-urgent-for-linus', 'timers-urgent-for-linus' and ↵Linus Torvalds
'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler, timer and x86 fixes from Ingo Molnar: - A context tracking ARM build and functional fix - A handful of ARM clocksource/clockevent driver fixes - An AMD microcode patch level sysfs reporting fixlet * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm: Fix build error with context tracking calls * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast clocksource: of: Respect device tree node status clocksource: exynos_mct: Set IRQ affinity when the CPU goes online arm: clocksource: mvebu: Use the main timer as clock source from DT * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Fix patch level reporting for family 15h
2013-09-28Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A couple of tooling fixlets and a PMU detection printout fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix PMU detection printout when no PMU is detected perf symbols: Demangle cloned functions perf machine: Fix path unpopulated in machine__create_modules() perf tools: Explicitly add libdl dependency perf probe: Fix probing symbols with optimization suffix perf trace: Add mmap2 handler perf kmem: Make it work again on non NUMA machines
2013-09-28perf/x86: Fix PMU detection printout when no PMU is detectedIngo Molnar
Ran into this cryptic PMU bootup log recently: [ 0.124047] Performance Events: [ 0.125000] smpboot: ... Turns out we print this if no PMU is detected. Fall back to the right condition so that the following is printed: [ 0.122381] Performance Events: no PMU driver, software events only. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/n/tip-u2fwaUffakjp0qkpRfqljgsn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-28x86: Improve the printout of the SMP bootup CPU tableBorislav Petkov
As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.644008] smpboot: Booting Node 1, Processors: #8 #9 #10 #11 #12 #13 #14 #15 OK [ 1.245006] smpboot: Booting Node 2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK [ 1.864005] smpboot: Booting Node 3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK [ 2.489005] smpboot: Booting Node 4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK [ 3.093005] smpboot: Booting Node 5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK [ 3.698005] smpboot: Booting Node 6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK [ 4.304005] smpboot: Booting Node 7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Libin <huawei.libin@huawei.com> Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-28lockdep, x86/alternatives: Drop ancient lockdep fixup messageBorislav Petkov
It messes up the output of the nodes/cores bootup table and it is obsolete anyway, see 17abecfe651c x86: fix up alternatives with lockdep enabled Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/20130927143442.GE4422@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-28sched: Revert need_resched() to look at TIF_NEED_RESCHEDPeter Zijlstra
Yuanhan reported a serious throughput regression in his pigz benchmark. Using the ftrace patch I found that several idle paths need more TLC before we can switch the generic need_resched() over to preempt_need_resched. The preemption paths benefit most from preempt_need_resched and do indeed use it; all other need_resched() users don't really care that much so reverting need_resched() back to tif_need_resched() is the simple and safe solution. Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: lkp@linux.intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130927153003.GF15690@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-27xen/mmu: Correct PAT MST setting.Konrad Rzeszutek Wilk
Jan Beulich spotted that the PAT MSR settings in the Xen public document that "the first (PAT6) column was wrong across the board, and the column for PAT7 was missing altogether." This updates it to be in sync. CC: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <jbeulich@suse.com>