Age | Commit message (Collapse) | Author |
|
The patch raises the hard limit of VCPU count to 254.
This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.
To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.
Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Using the read/write operation to remove the same code
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The operations of read emulation and write emulation are very similar, so we
can abstract the operation of them, in larter patch, it is used to cleanup the
same code
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
If the range spans a page boundary, the mmio access can be broke, fix it as
write emulation.
And we already get the guest physical address, so use it to read guest data
directly to avoid walking guest page table again
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Src2CL decode (used for double width shifts) erronously decodes only bit 3
of %rcx, instead of bits 7:0.
Fix by decoding %cl in its entirety.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
__update_clear_spte_slow should return original spte while the
current code returns low half of original spte combined with high
half of new spte.
Signed-off-by: Zhao Jin <cronozhj@gmail.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
We use the page->private field and hence should use the proper
macros and set proper bits. Also WARN_ON in case somebody
tries to overwrite our data.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
We dropped a lot of the MMU debugfs in favour of using
tracing API - but there is one which just provides
mostly static information that was made invisible by this change.
Bring it back.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Now that the hypercall interface changes are in -unstable, make the
kernel side code not ignore the segment (aka domain) number anymore
(which results in pretty odd behavior on such systems). Rather, if
only the old interfaces are available, don't call them for devices on
non-zero segments at all.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
[v1: Edited git description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Include <asm/aes.h> to pick up the declarations for crypto_aes_encrypt_x86
and crypto_aes_decrypt_x86 to quiet the sparse noise:
warning: symbol 'crypto_aes_encrypt_x86' was not declared. Should it be static?
warning: symbol 'crypto_aes_decrypt_x86' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Patch adds x86_64 assembly implementation of blowfish. Two set of assembler
functions are provided. First set is regular 'one-block at time'
encrypt/decrypt functions. Second is 'four-block at time' functions that
gain performance increase on out-of-order CPUs. Performance of 4-way
functions should be equal to 1-way functions with in-order CPUs.
Summary of the tcrypt benchmarks:
Blowfish assembler vs blowfish C (256bit 8kb block ECB)
encrypt: 2.2x speed
decrypt: 2.3x speed
Blowfish assembler vs blowfish C (256bit 8kb block CBC)
encrypt: 1.12x speed
decrypt: 2.5x speed
Blowfish assembler vs blowfish C (256bit 8kb block CTR)
encrypt: 2.5x speed
Full output:
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt
http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt
Tests were run on:
vendor_id : AuthenticAMD
cpu family : 16
model : 10
model name : AMD Phenom(tm) II X6 1055T Processor
stepping : 0
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A deadlock was introduced on x86 in commit ef68c8f87ed1 ("x86:
Serialize EFI time accesses on rtc_lock") because efi_get_time()
and friends can be called with rtc_lock already held by
read_persistent_time(), e.g.:
timekeeping_init()
read_persistent_clock() <-- acquire rtc_lock
efi_get_time()
phys_efi_get_time() <-- acquire rtc_lock <DEADLOCK>
To fix this let's push the locking down into the get_wallclock()
and set_wallclock() implementations. Only the clock
implementations that access the x86 RTC directly need to acquire
rtc_lock, so it makes sense to push the locking down into the
rtc, vrtc and efi code.
The virtualization implementations don't require rtc_lock to be
held because they provide their own serialization.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Avi Kivity <avi@redhat.com> [for the virtualization aspect]
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This is a workaround for a UV2 hub bug that affects the format of system
global addresses.
The GRU API for UV2 was inadvertently broken by a hardware change. The
format of the physical address used for TLB dropins and for addresses used
with instructions running in unmapped mode has changed. This change was
not documented and became apparent only when diags failed running on
system simulators.
For UV1, TLB and GRU instruction physical addresses are identical to
socket physical addresses (although high NASID bits must be OR'ed into the
address).
For UV2, socket physical addresses need to be converted. The NODE portion
of the physical address needs to be shifted so that the low bit is in bit
39 or bit 40, depending on an MMR value.
It is not yet clear if this bug will be fixed in a silicon respin. If it
is fixed, the hub revision will be incremented & the workaround disabled.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This new driver replaces the old PCEngines Alix 2/3 LED driver with a
new driver that controls the LEDs through the leds-gpio driver. The
old driver accessed GPIOs directly, which created a conflict and
prevented also loading the cs5535-gpio driver to read other GPIOs on
the Alix board. With this new driver, we hook into leds-gpio which in
turn uses GPIO to control the LEDs and therefore it's possible to
control both the LEDs and access onboard GPIOs
Driver is moved to platform/geode as requested by Grant and any other
geode initialisation modules should move here also
This driver is inspired by leds-net5501.c by Alessandro Zummo.
Ideally, leds-net5501.c should also be moved to platform/geode.
Additionally the driver relies on parts of the patch: 7f131cf3ed ("leds:
leds-alix2c - take port address from MSR) by Daniel Mack to perform
detection of the Alix board.
[akpm@linux-foundation.org: include module.h]
Signed-off-by: Ed Wildgoose <kernel@wildgooses.com>
Cc: git@wildgooses.com
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Daniel Mack <daniel@caiaq.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Consolidate the io-apic EOI code in clear_IO_APIC_pin() and
eoi_ioapic_irq().
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Rafael Wysocki <rjw@novell.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: lchiquitto@novell.com
Cc: jbeulich@novell.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110825190657.259696697@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
For older IO-APIC's, we were clearing the remote-IRR by changing
the RTE trigger mode to edge and then back to level. We wanted
to mask the RTE during this process, so we were essentially
doing mask+edge and then to unmask+level.
As part of the commit ca64c47cecd0321b2e0dcbd7aaff44b68ce20654,
we moved this EOI process earlier where the IO-APIC RTE is
masked. So we were wrongly unmasking it in the eoi_ioapic_irq().
So change the remote-IRR clear sequence in eoi_ioapic_irq() to
mask + edge and then restore the previous RTE entry which will
restore the mask status as well as the level trigger.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Rafael Wysocki <rjw@novell.com>
Cc: lchiquitto@novell.com
Cc: jbeulich@novell.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110825190657.210286410@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
In the kdump scenario mentioned below, we can have a case where
the device using level triggered interrupt will not generate any
interrupts in the kdump kernel.
1. IO-APIC sends a level triggered interrupt to the CPU's local APIC.
2. Kernel crashed before the CPU services this interrupt, leaving
the remote-IRR in the IO-APIC set.
3. kdump kernel boot sequence does clear_IO_APIC() as part of IO-APIC
initialization. But this fails to reset remote-IRR bit of the
IO-APIC RTE as the remote-IRR bit is read-only.
4. Device using that level triggered entry can't generate any
more interrupts because of the remote-IRR bit.
In clear_IO_APIC_pin(), check if the remote-IRR bit is set and if
so do an explicit attempt to clear it (by doing EOI write on
modern io-apic's and changing trigger mode to edge/level on
older io-apic's). Also before doing the explicit EOI to the
io-apic, ensure that the trigger mode is indeed set to level.
This will enable the explicit EOI to the io-apic to reset the
remote-IRR bit.
Tested-by: Leonardo Chiquitto <lchiquitto@novell.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Fixes: https://bugzilla.novell.com/show_bug.cgi?id=701686
Cc: Rafael Wysocki <rjw@novell.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Thomas Renninger <trenn@suse.de>
Cc: jbeulich@novell.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110825190657.157502602@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
with the other IOMMU options.
Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
irq subsystem name.
And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Define irq_remap_modify_chip_defaults() and remove the duplicate
code, cleanup the unnecessary ifdefs.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.499225692@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
IRQ set affinity routine is same for the IO-APIC IRQ's aswell as
the MSI IRQ's in the presence of interrupt-remapping. This is
because we modify the interrupt-remapping table entry and
doesn't touch the IO-APIC RTE or the MSI entry.
So remove the ir_msi_set_affinity() and re-use the
ir_ioapic_set_affinity()
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.452760446@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
On the platforms which are x2apic and interrupt-remapping
capable, Linux kernel is enabling x2apic even if the BIOS
doesn't. This is to take advantage of the features that x2apic
brings in.
Some of the OEM platforms are running into issues because of
this, as their bios is not x2apic aware. For example, this was
resulting in interrupt migration issues on one of the platforms.
Also if the BIOS SMI handling uses APIC interface to send SMI's,
then the BIOS need to be aware of x2apic mode that OS has
enabled.
On some of these platforms, BIOS doesn't have a HW mechanism to
turnoff the x2apic feature to prevent OS from enabling it.
To resolve this mess, recent changes to the VT-d2 specification:
http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf
includes a mechanism that provides BIOS a way to request system
software to opt out of enabling x2apic mode.
Look at the x2apic optout flag in the DMAR tables before
enabling the x2apic mode in the platform. Also print a warning
that we have disabled x2apic based on the BIOS request.
Kernel boot parameter "intremap=no_x2apic_optout" can be used to
override the BIOS x2apic optout request.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.171766616@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen:
xen/i386: follow-up to "replace order-based range checking of M2P table by linear one"
xen/irq: Alter the locking to use a mutex instead of a spinlock.
xen/e820: if there is no dom0_mem=, don't tweak extra_pages.
xen: disable PV spinlocks on HVM
|
|
On x86-64, they were just wasteful: with the explicitly added (now
unnecessary) padding, the size of the alternatives structure was 16
bytes, and an alignment of 8 bytes didn't hurt much.
However, it was still silly, since the natural size and alignment for
the structure is actually just 12 bytes, 4-byte aligned since commit
59e97e4d6fbc ("x86: Make alternative instruction pointers relative").
So removing the padding, and removing the extra alignment is just a good
idea.
On x86-32, the alignment of 4 bytes was correct, but was incorrectly
hardcoded as 8 bytes in <asm/alternative-asm.h>. That header file had
used to be an x86-64 only header file, but various unification efforts
have made it be used for x86-32 too (ie the unification of rwlock and
rwsem).
That in turn caused x86-32 boot failures, because the extra alignment
would result in random zero-filled words in the altinstructions section,
causing oopses early at boot when doing alternative instruction
replacement.
So just remove all the alignment noise entirely. It's wrong, and it's
unnecessary. The section itself is already properly aligned by the
linker scripts, and all additions to the section had better be of the
proper 12-byte format, keeping it aligned. So if the align directive
were to ever make a difference, that would be an indication of a serious
bug to begin with.
Reported-by: Werner Landgraf <w.landgraf@ru.r>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fast-forward merge with Linus to be able to merge patches
based on more recent version of the tree.
|
|
It was pointed out by 'make versioncheck' that the include of
linux/version.h is not needed in arch/x86/mm/mmio-mod.c .
This patch removes it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
This patch fixes the typo in parameters passed to
x86_32 switch_to() description.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
linear one"
The numbers obtained from the hypervisor really can't ever lead to an
overflow here, only the original calculation going through the order
of the range could have. This avoids the (as Jeremy points outs)
somewhat ugly NULL-based calculation here.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
del_timer_sync() can cause a deadlock when called in interrupt context.
It is used with on_each_cpu() in some parts for sysfs files like bank*,
check_interval, cmci_disabled and ignore_ce.
However, use of on_each_cpu() results in calling the function passed
as the argument in interrupt context. This causes a flood of nested
warnings from del_timer_sync() (it runs on each CPU) caused even by a
simple file access like:
$ echo 300 > /sys/devices/system/machinecheck/machinecheck0/check_interval
Fortunately, these MCE-specific files are rarely used and AFAIK only few
MCE geeks experience this warning.
To remove the warning, move timer deletion outside of the interrupt
context.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
|
|
The patch "xen: use maximum reservation to limit amount of usable RAM"
(d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11) breaks machines that
do not use 'dom0_mem=' argument with:
reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
(XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...
The reason being that the last E820 entry is created using the
'extra_pages' (which is based on how many pages have been freed).
The mentioned git commit sets the initial value of 'extra_pages'
using a hypercall which returns the number of pages (if dom0_mem
has been used) or -1 otherwise. If the later we return with
MAX_DOMAIN_PAGES as basis for calculation:
return min(max_pages, MAX_DOMAIN_PAGES);
and use it:
extra_limit = xen_get_max_pages();
if (extra_limit >= max_pfn)
extra_pages = extra_limit - max_pfn;
else
extra_pages = 0;
which means we end up with extra_pages = 128GB in PFNs (33554432)
- 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
and then we add that value to the E820 making it:
Xen: 00000000ff000000 - 0000000100000000 (reserved)
Xen: 0000000100000000 - 000000133f2e2000 (usable)
which is clearly wrong. It should look as so:
Xen: 00000000ff000000 - 0000000100000000 (reserved)
Xen: 0000000100000000 - 000000027fbda000 (usable)
Naturally this problem does not present itself if dom0_mem=max:X
is used.
CC: stable@kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The cmci_discover_lock can be taken in atomic context (cpu bring
up sequence) and therefore cannot be preempted on -rt.
In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
The oprofilefs_lock can be taken in atomic context (in profiling
interrupts) and therefore cannot cannot be preempted on -rt -
annotate it.
In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* 'upstream/bugfix' of git://github.com/jsgf/linux-xen:
xen: use non-tracing preempt in xen_clocksource_read()
|
|
L3 subcaches 0 and 1 of AMD Family 15h CPUs can have a size of 2MB.
Update the calculation routine for the number of L3 indices to
reflect that.
Signed-off-by: Frank Arnold <frank.arnold@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Rosenfeld Hans <Hans.Rosenfeld@amd.com>
Cc: Herrmann3 Andreas <Andreas.Herrmann3@amd.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Frank Arnold <Frank.Arnold@amd.com>
Link: http://lkml.kernel.org/r/20110726170449.GB32536@aftab
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
It's not a good reason to allocate memory in the smp function call
just because someone thought it's the most conveniant place.
The AMD L3 data is coupled to the northbridge info by a pointer to the
corresponding north bridge data. So allocating it with the northbridge
data and referencing the northbridge in the cache_info code instead
uses less memory and gets rid of that atomic allocation hack in the
smp function call.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20110723212626.688229918@linutronix.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit f9b90566c ("x86: reduce stack usage in init_intel_cacheinfo")
introduced a shadow structure to reduce the stack usage on large
machines instead of making the smaller structure embedded into the
large one. That's definitely a candidate for the bad taste award.
Move the small struct into the large one and get rid of the ugly type
casts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20110723212626.625651773@linutronix.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
free_cache_attributes() kfree's:
per_cpu(ici_cpuid4_info, cpu)->l3
which is a pointer to memory which was allocated as a block in
amd_init_l3_cache(). l3 of a particular cpu points to a part of this
memory blob. The part and the rest of the blob are still referenced by
other cpus.
As far as I can tell from the git history this is a leftover from the
conversion from per cpu to node data with commit ba06edb63(x86,
cacheinfo: Make L3 cache info per node) and the following commit
f658bcfb2(x86, cacheinfo: Cleanup L3 cache index disable support)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20110723212626.550539989@linutronix.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit b03e7495a862 ("PCI: Set PCI-E Max Payload Size on fabric")
introduced a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.
To correct this, verify that the self pointer in pci_bus is non-NULL
before dereferencing it.
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
PV spinlocks cannot possibly work with the current code because they are
enabled after pvops patching has already been done, and because PV
spinlocks use a different data structure than native spinlocks so we
cannot switch between them dynamically. A spinlock that has been taken
once by the native code (__ticket_spin_lock) cannot be taken by
__xen_spin_lock even after it has been released.
Reported-and-Tested-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The automatic increase of the min_delta_ns of a clockevents device
should be done in the clockevents code as the minimum delay is an
attribute of the clockevents device.
In addition not all architectures want the automatic adjustment, on a
massively virtualized system it can happen that the programming of a
clock event fails several times in a row because the virtual cpu has
been rescheduled quickly enough. In that case the minimum delay will
erroneously be increased with no way back. The new config symbol
GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
adjustment. The config option is selected only for x86.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The Hyper-V clocksource driver is best integrated with Hyper-V
detection code since:
(a) Linux guests running on Hyper-V require it
(b) Integration into that code significanly reduces code size
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: gregkh@suse.de
Cc: devel@linuxdriverproject.org
Cc: virtualization@lists.osdl.org
Link: http://lkml.kernel.org/r/1315434310-4827-1-git-send-email-kys@microsoft.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
* 'perf-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
x86, perf: Check that current->mm is alive before getting user callchain
perf_event: Fix broken calc_timer_values()
perf events: Fix slow and broken cgroup context switch code
|
|
* 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen:
xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.
xen: x86_32: do not enable iterrupts when returning from exception in interrupt context
xen: use maximum reservation to limit amount of usable RAM
|
|
* 'kvm-updates/3.1' of git://github.com/avikivity/kvm:
KVM: Fix instruction size issue in pvclock scaling
|
|
We have hit a couple of customer bugs where they would like to
use those parameters to run an UP kernel - but both of those
options turn of important sources of interrupt information so
we end up not being able to boot. The correct way is to
pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and
the kernel will patch itself to be a UP kernel.
Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308
CC: stable@kernel.org
Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
interrupt context
If vmalloc page_fault happens inside of interrupt handler with interrupts
disabled then on exit path from exception handler when there is no pending
interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):
cmpw $0x0001, XEN_vcpu_info_pending(%eax)
sete XEN_vcpu_info_mask(%eax)
will enable interrupts even if they has been previously disabled according to
eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99)
testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
setz XEN_vcpu_info_mask(%eax)
Solution is in setting XEN_vcpu_info_mask only when it should be set
according to
cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not clearing it if there isn't any pending events.
Reproducer for bug is attached to RHBZ 707552
CC: stable@kernel.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the pages tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).
On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:
Before:
Memory: 627792k/4472000k available
After:
Memory: 549740k/11132224k available
A increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
low memory is also reduced from 253 MiB to 32 MiB. The total
additional usable RAM is 329 MiB.
For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0') (c/s 23790)
CC: stable@kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
An event may occur when an mm is already released.
I added an event in dequeue_entity() and caught a panic with
the following backtrace:
[ 434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[ 434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120
...
[ 434.421258] Call Trace:
[ 434.421258] [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0
[ 434.421258] [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90
[ 434.421258] [<ffffffff8101b048>] perf_callchain_user+0x128/0x170
[ 434.421258] [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100
[ 434.421258] [<ffffffff81116690>] perf_prepare_sample+0x200/0x280
[ 434.421258] [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290
[ 434.421258] [<ffffffff81065240>] ? tg_shares_up+0x0/0x670
[ 434.421258] [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0
[ 434.421258] [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0
[ 434.421258] [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250
[ 434.421258] [<ffffffff81119204>] perf_tp_event+0x44/0x70
[ 434.421258] [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110
[ 434.421258] [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0
[ 434.421258] [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60
[ 434.421258] [<ffffffff8105818a>] dequeue_task+0x9a/0xb0
[ 434.421258] [<ffffffff810581e2>] deactivate_task+0x42/0xe0
[ 434.421258] [<ffffffff814bc019>] thread_return+0x191/0x808
[ 434.421258] [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60
[ 434.421258] [<ffffffff8106f4c4>] do_exit+0x464/0x910
[ 434.421258] [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0
[ 434.421258] [<ffffffff8106fa57>] sys_exit_group+0x17/0x20
[ 434.421258] [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit de2d1a524e94 ("KVM: Fix register corruption in pvclock_scale_delta")
introduced a mul instruction that may have only a memory operand; the
assembler therefore cannot select the correct size:
pvclock.s:229: Error: no instruction mnemonic suffix given and no register
operands; can't size instruction
In this example the assembler is:
#APP
mul -48(%rbp) ; shrd $32, %rdx, %rax
#NO_APP
A simple solution is to use mulq.
Signed-off-by: Duncan Sands <baldrick@free.fr>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Use __compiletime_error() to produce a compile-time error rather than
link-time, where available.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
Make trylock code common regardless of ticket size.
(Also, rename arch_spinlock.slock to head_tail.)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|