Age | Commit message (Collapse) | Author |
|
opal_nvram_write currently just assumes success if it encounters an
error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO
on other errors instead.
Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The H_CPU_BEHAV_* flags should be checked for in the 'behaviour' field
of 'struct h_cpu_char_result' -- 'character' is for H_CPU_CHAR_*
flags.
Found by playing around with QEMU's implementation of the hypercall:
H_CPU_CHAR=0xf000000000000000
H_CPU_BEHAV=0x0000000000000000
This clears H_CPU_BEHAV_FAVOUR_SECURITY and H_CPU_BEHAV_L1D_FLUSH_PR
so pseries_setup_rfi_flush() disables 'rfi_flush'; and it also
clears H_CPU_CHAR_L1D_THREAD_PRIV flag. So there is no RFI flush
mitigation at all for cpu_show_meltdown() to report; but currently
it does:
Original kernel:
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Mitigation: RFI Flush
Patched kernel:
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Not affected
H_CPU_CHAR=0x0000000000000000
H_CPU_BEHAV=0xf000000000000000
This sets H_CPU_BEHAV_BNDS_CHK_SPEC_BAR so cpu_show_spectre_v1() should
report vulnerable; but currently it doesn't:
Original kernel:
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Not affected
Patched kernel:
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Vulnerable
Brown-paper-bag-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: f636c14790ea ("powerpc/pseries: Set or clear security feature flags")
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move __map_kernel_page_nid() inside #ifdef SPARSEMEM_VMEMMAP]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Try to allocate kernel page tables for direct mapping and vmemmap
according to the node of the memory they will map. The node is not
available for the linear map in early boot, so use range allocation
to allocate the page tables from the region they map, which is
effectively node-local.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix build error in radix__create_section_mapping()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Per-node allocations are possible on 64s with radix that does
not have the bolted SLB limitation.
Hash would be able to do the same if all CPUs had the bottom of
their node-local memory bolted as well. This is left as an
exercise for the reader.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add dummy definition of boot_cpuid for !SMP]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Rename the dummy allocate_pacas() to fix 32-bit build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Build an array that finds hardware CPU number from logical CPU
number in firmware CPU discovery. Use that rather than setting
paca of other CPUs directly, to begin with. Subsequent patch will
not have pacas allocated at this point.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix SMP=n build by adding #ifdef in arch_match_cpu_phys_id()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Move this into the early setup code, and don't iterate over CPU masks.
We don't want to call into sysfs so early from setup, and a future patch
won't initialize CPU masks by the time this is called.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in incremental fix from Nick for DSCR handling]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Split sparsemem initialisation from basic numa topology discovery.
Move the parsing earlier in boot, before pacas are allocated.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
slb_shadow structures are avoided for radix environment.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We no longer allocate lppacas in an array, so this patch removes the
1kB static alignment for the structure, and enforces the PAPR
alignment requirements at allocation time. We can not reduce the 1kB
allocation size however, due to existing KVM hypervisors.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Change the paca array into an array of pointers to pacas. Allocate
pacas individually.
This allows flexibility in where the PACAs are allocated. Future work
will allocate them node-local. Platforms that don't have address limits
on PACAs would be able to defer PACA allocations until later in boot
rather than allocate all possible ones up-front then freeing unused.
This is slightly more overhead (one additional indirection) for cross
CPU paca references, but those aren't too common.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The "lppaca" is a structure registered with the hypervisor. This is
unnecessary when running on non-virtualised platforms. One field from
the lppaca (pmcregs_in_use) is also used by the host, so move the host
part out into the paca (lppaca field is still updated in
guest mode).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix non-pseries build with some #ifdefs]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In mpic_physmask() we loop over all CPUs up to 32, then get the hard
SMP processor id of that CPU.
Currently that's possibly walking off the end of the paca array, but
in a future patch we will change the paca array to be an array of
pointers, and in that case we will get a NULL for missing CPUs and
oops. eg:
Unable to handle kernel paging request for data at address 0x88888888888888b8
Faulting instruction address: 0xc00000000004e380
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP .mpic_set_affinity+0x60/0x1a0
LR .irq_do_set_affinity+0x48/0x100
Fix it by checking the CPU is possible, this also fixes the code if
there are gaps in the CPU numbering which probably never happens on
mpic systems but who knows.
Debugged-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Vector RAW in UML needs to BPF filter its own MAC only
if QDISC_BYPASS has failed. If QDISC_BYPASS is successful, the
frames originated locally are not visible to readers on the
raw socket.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The patches for the UML vector drivers were in-flight when
the timer changes happened and were not covered by them.
This change migrates vector_kern.c to use the new timer API.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Recent libcs have gotten a bit more strict, so we actually need to
include the right headers and use the right types. This enables UML to
compile again.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
KVM PPC update for 4.17
- Improvements for the radix page fault handler for HV KVM on POWER9.
|
|
This is a cosmetic patch that deals with the address filter structure's
ambiguous fields 'filter' and 'range'. The former stands to mean that the
filter's *action* should be to filter the traces to its address range if
it's set or stop tracing if it's unset. This is confusing and hard on the
eyes, so this patch replaces it with 'action' enum. The 'range' field is
completely redundant (meaning that the filter is an address range as
opposed to a single address trigger), as we can use zero size to mean the
same thing.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180329120648.11902-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Conflicts:
kernel/events/hw_breakpoint.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates for 4.17 from Marc Zyngier:
- New Qualcomm PDC irqchip
- New Microsemi Ocelot irqchip
- Suspend/resume support for some oddball GICv3 irqchip
- Better GIC/GICv3 support for kexec
- Various cleanups and fixes
|
|
Since commit 8253bb3f8255 ("regmap: potentially duplicate the name
string stored in regmap"), the name field of struct regmap_config
is copied in __regmap_init(), so we no longer need to worry about
keeping our own copy of the name.
Just use a string literal in the initialization of da8xx_cfgchip_config
instead of creating a separate variable for the name.
Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
|
|
platform_data
This fixes a possible kernel oops due to using stack allocated platform
data for the USB PHY driver on DA8XX devices. If the platform device
probe is deferred, then we get a corrupt pointer for the platform data.
We now use a global static struct for the platform data so that the
platform data pointer does not get written over.
Fixes: bdec5a6b5789 ("ARM: da8xx: use platform data for CFGCHIP syscon regmap")
Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.16. Apologies if this is a bit big at
rc7, but they're all reasonably important fixes. None are actually for
new code, so they aren't indicative of 4.16 being in bad shape from
our point of view.
- Fix missing AT_BASE_PLATFORM (in auxv) when we're using a new
firmware interface for describing CPU features.
- Fix lost pending interrupts due to a race in our interrupt
soft-masking code.
- A workaround for a nest MMU bug with TLB invalidations on Power9.
- A workaround for broadcast TLB invalidations on Power9.
- Fix a bug in our instruction SLB miss handler, when handling bad
addresses (eg. >= TASK_SIZE), which could corrupt non-volatile user
GPRs.
Thanks to: Aneesh Kumar K.V, Balbir Singh, Benjamin Herrenschmidt,
Nicholas Piggin"
* tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
powerpc/mm: Fixup tlbie vs store ordering issue on POWER9
powerpc/mm/radix: Move the functions that does the actual tlbie closer
powerpc/mm/radix: Remove unused code
powerpc/mm: Workaround Nest MMU bug with TLB invalidations
powerpc/mm: Add tracking of the number of coprocessors using a context
powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Here are are a couple of last-minute fixes for 4.16, mostly for
regressions. As usual, the majory are device tree changes:
- USB 3 support on rk3399 didn't work and is being reverted for now
- One fix for an old suspend/resume bug on rk3399
- A few regulator related fixes on Banana Pi M2, and on imx7d-sdb
- A boot regression fix for all Aspeed SoCs failing to find their
memory
- One more dtc warning fix
The other changes are:
- A few updates to the MAINTAINERS file
- A revert for an incorrect orion5x cleanup
- Two power management fixes for OMAP"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP: Fix SRAM W+X mapping
ARM: dts: aspeed: Add default memory node
mailmap: Update email address for Gregory CLEMENT
ARM: davinci: fix the GPIO lookup for omapl138-hawk
MAINTAINERS: Update Tegra IOMMU maintainer
ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name
ARM: ux500: Fix PMU IRQ regression
ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399"
arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
ARM: OMAP: Fix dmtimer init for omap1
MAINTAINERS: update email address for Maxime Ripard
ARM: dts: sun6i: a31s: bpi-m2: add missing regulators
ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
|
|
nested_run_pending
When vCPU runs L2 and there is a pending event that requires to exit
from L2 to L1 and nested_run_pending=1, vcpu_enter_guest() will request
an immediate-exit from L2 (See req_immediate_exit).
Since now handling of req_immediate_exit also makes sure to set
KVM_REQ_EVENT, there is no need to also set it on vmx_vcpu_run() when
nested_run_pending=1.
This optimizes cases where VMRESUME was executed by L1 to enter L2 and
there is no pending events that require exit from L2 to L1. Previously,
this would have set KVM_REQ_EVENT unnecessarly.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
pending
In case L2 VMExit to L0 during event-delivery, VMCS02 is filled with
IDT-vectoring-info which vmx_complete_interrupts() makes sure to
reinject before next resume of L2.
While handling the VMExit in L0, an IPI could be sent by another L1 vCPU
to the L1 vCPU which currently runs L2 and exited to L0.
When L0 will reach vcpu_enter_guest() and call inject_pending_event(),
it will note that a previous event was re-injected to L2 (by
IDT-vectoring-info) and therefore won't check if there are pending L1
events which require exit from L2 to L1. Thus, L0 enters L2 without
immediate VMExit even though there are pending L1 events!
This commit fixes the issue by making sure to check for L1 pending
events even if a previous event was reinjected to L2 and bailing out
from inject_pending_event() before evaluating a new pending event in
case an event was already reinjected.
The bug was observed by the following setup:
* L0 is a 64CPU machine which runs KVM.
* L1 is a 16CPU machine which runs KVM.
* L0 & L1 runs with APICv disabled.
(Also reproduced with APICv enabled but easier to analyze below info
with APICv disabled)
* L1 runs a 16CPU L2 Windows Server 2012 R2 guest.
During L2 boot, L1 hangs completely and analyzing the hang reveals that
one L1 vCPU is holding KVM's mmu_lock and is waiting forever on an IPI
that he has sent for another L1 vCPU. And all other L1 vCPUs are
currently attempting to grab mmu_lock. Therefore, all L1 vCPUs are stuck
forever (as L1 runs with kernel-preemption disabled).
Observing /sys/kernel/debug/tracing/trace_pipe reveals the following
series of events:
(1) qemu-system-x86-19066 [030] kvm_nested_vmexit: rip:
0xfffff802c5dca82f reason: EPT_VIOLATION ext_inf1: 0x0000000000000182
ext_inf2: 0x00000000800000d2 ext_int: 0x00000000 ext_int_err: 0x00000000
(2) qemu-system-x86-19054 [028] kvm_apic_accept_irq: apicid f
vec 252 (Fixed|edge)
(3) qemu-system-x86-19066 [030] kvm_inj_virq: irq 210
(4) qemu-system-x86-19066 [030] kvm_entry: vcpu 15
(5) qemu-system-x86-19066 [030] kvm_exit: reason EPT_VIOLATION
rip 0xffffe00069202690 info 83 0
(6) qemu-system-x86-19066 [030] kvm_nested_vmexit: rip:
0xffffe00069202690 reason: EPT_VIOLATION ext_inf1: 0x0000000000000083
ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000
(7) qemu-system-x86-19066 [030] kvm_nested_vmexit_inject: reason:
EPT_VIOLATION ext_inf1: 0x0000000000000083 ext_inf2: 0x0000000000000000
ext_int: 0x00000000 ext_int_err: 0x00000000
(8) qemu-system-x86-19066 [030] kvm_entry: vcpu 15
Which can be analyzed as follows:
(1) L2 VMExit to L0 on EPT_VIOLATION during delivery of vector 0xd2.
Therefore, vmx_complete_interrupts() will set KVM_REQ_EVENT and reinject
a pending-interrupt of 0xd2.
(2) L1 sends an IPI of vector 0xfc (CALL_FUNCTION_VECTOR) to destination
vCPU 15. This will set relevant bit in LAPIC's IRR and set KVM_REQ_EVENT.
(3) L0 reach vcpu_enter_guest() which calls inject_pending_event() which
notes that interrupt 0xd2 was reinjected and therefore calls
vmx_inject_irq() and returns. Without checking for pending L1 events!
Note that at this point, KVM_REQ_EVENT was cleared by vcpu_enter_guest()
before calling inject_pending_event().
(4) L0 resumes L2 without immediate-exit even though there is a pending
L1 event (The IPI pending in LAPIC's IRR).
We have already reached the buggy scenario but events could be
furthered analyzed:
(5+6) L2 VMExit to L0 on EPT_VIOLATION. This time not during
event-delivery.
(7) L0 decides to forward the VMExit to L1 for further handling.
(8) L0 resumes into L1. Note that because KVM_REQ_EVENT is cleared, the
LAPIC's IRR is not examined and therefore the IPI is still not delivered
into L1!
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The reason that exception.pending should block re-injection of
NMI/interrupt is not described correctly in comment in code.
Instead, it describes why a pending exception should be injected
before a pending NMI/interrupt.
Therefore, move currently present comment to code-block evaluating
a new pending event which explains why exception.pending is evaluated
first.
In addition, create a new comment describing that exception.pending
blocks re-injection of NMI/interrupt because the exception was
queued by handling vmexit which was due to NMI/interrupt delivery.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@orcle.com>
[Used a comment from Sean J <sean.j.christopherson@intel.com>. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
For exceptions & NMIs events, KVM code use the following
coding convention:
*) "pending" represents an event that should be injected to guest at
some point but it's side-effects have not yet occurred.
*) "injected" represents an event that it's side-effects have already
occurred.
However, interrupts don't conform to this coding convention.
All current code flows mark interrupt.pending when it's side-effects
have already taken place (For example, bit moved from LAPIC IRR to
ISR). Therefore, it makes sense to just rename
interrupt.pending to interrupt.injected.
This change follows logic of previous commit 664f8e26b00c ("KVM: X86:
Fix loss of exception which has not yet been injected") which changed
exception to follow this coding convention as well.
It is important to note that in case !lapic_in_kernel(vcpu),
interrupt.pending usage was and still incorrect.
In this case, interrrupt.pending can only be set using one of the
following ioctls: KVM_INTERRUPT, KVM_SET_VCPU_EVENTS and
KVM_SET_SREGS. Looking at how QEMU uses these ioctls, one can see that
QEMU uses them either to re-set an "interrupt.pending" state it has
received from KVM (via KVM_GET_VCPU_EVENTS interrupt.pending or
via KVM_GET_SREGS interrupt_bitmap) or by dispatching a new interrupt
from QEMU's emulated LAPIC which reset bit in IRR and set bit in ISR
before sending ioctl to KVM. So it seems that indeed "interrupt.pending"
in this case is also suppose to represent "interrupt.injected".
However, kvm_cpu_has_interrupt() & kvm_cpu_has_injectable_intr()
is misusing (now named) interrupt.injected in order to return if
there is a pending interrupt.
This leads to nVMX/nSVM not be able to distinguish if it should exit
from L2 to L1 on EXTERNAL_INTERRUPT on pending interrupt or should
re-inject an injected interrupt.
Therefore, add a FIXME at these functions for handling this issue.
This patch introduce no semantics change.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
kvm_inject_realmode_interrupt() is called from one of the injection
functions which writes event-injection to VMCS: vmx_queue_exception(),
vmx_inject_irq() and vmx_inject_nmi().
All these functions are called just to cause an event-injection to
guest. They are not responsible of manipulating the event-pending
flag. The only purpose of kvm_inject_realmode_interrupt() should be
to emulate real-mode interrupt-injection.
This was also incorrect when called from vmx_queue_exception().
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Enlightened VMCS is just a structure in memory, the main benefit
besides avoiding somewhat slower VMREAD/VMWRITE is using clean field
mask: we tell the underlying hypervisor which fields were modified
since VMEXIT so there's no need to inspect them all.
Tight CPUID loop test shows significant speedup:
Before: 18890 cycles
After: 8304 cycles
Static key is being used to avoid performance penalty for non-Hyper-V
deployments.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
TLFS 5.0 says: "Support for an enlightened VMCS interface is reported with
CPUID leaf 0x40000004. If an enlightened VMCS interface is supported,
additional nested enlightenments may be discovered by reading the CPUID
leaf 0x4000000A (see 2.4.11)."
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The definitions are according to the Hyper-V TLFS v5.0. KVM on Hyper-V will
use these.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Virtual Processor Assist Pages usage allows us to do optimized EOI
processing for APIC, enable Enlightened VMCS support in KVM and more.
struct hv_vp_assist_page is defined according to the Hyper-V TLFS v5.0b.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The assist page has been used only for the paravirtual EOI so far, hence
the "APIC" in the MSR name. Renaming to match the Hyper-V TLFS where it's
called "Virtual VP Assist MSR".
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
mshyperv.h now only contains fucntions/variables we define in kernel, all
definitions from TLFS should go to hyperv-tlfs.h.
'enum hv_cpuid_function' is removed as we already have this info in
hyperv-tlfs.h, code in mshyperv.c is adjusted accordingly.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
hyperv.h is not part of uapi, there are no (known) users outside of kernel.
We are making changes to this file to match current Hyper-V Hypervisor
Top-Level Functional Specification (TLFS, see:
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs)
and we don't want to maintain backwards compatibility.
Move the file renaming to hyperv-tlfs.h to avoid confusing it with
mshyperv.h. In future, all definitions from TLFS should go to it and
all kernel objects should go to mshyperv.h or include/linux/hyperv.h.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
static_key_disable_cpuslocked(): static key 'virt_spin_lock_key+0x0/0x20' used before call to jump_label_init()
WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:161 static_key_disable_cpuslocked+0x61/0x80
RIP: 0010:static_key_disable_cpuslocked+0x61/0x80
Call Trace:
static_key_disable+0x16/0x20
start_kernel+0x192/0x4b3
secondary_startup_64+0xa5/0xb0
Qspinlock will be choosed when dedicated pCPUs are available, however, the
static virt_spin_lock_key is set in kvm_spinlock_init() before jump_label_init()
has been called, which will result in a WARN(). This patch fixes it by delaying
the virt_spin_lock_key setup to .smp_prepare_cpus().
Reported-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Fixes: b2798ba0b876 ("KVM: X86: Choose qspinlock when dedicated physical CPUs are available")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Bring the PLE(pause loop exit) logic to AMD svm driver.
While testing, we found this helping in situations where numerous
pauses are generated. Without these patches we could see continuos
VMEXITS due to pause interceptions. Tested it on AMD EPYC server with
boot parameter idle=poll on a VM with 32 vcpus to simulate extensive
pause behaviour. Here are VMEXITS in 10 seconds interval.
Pauses 810199 504
Total 882184 325415
Signed-off-by: Babu Moger <babu.moger@amd.com>
[Prevented the window from dropping below the initial value. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This patch adds the support for pause filtering threshold. This feature
support is indicated by CPUID Fn8000_000A_EDX. See AMD APM Vol 2 Section
15.14.4 Pause Intercept Filtering for more details.
In this mode, a 16-bit pause filter threshold field is added in VMCB.
The threshold value is a cycle count that is used to reset the pause
counter. As with simple pause filtering, VMRUN loads the pause count
value from VMCB into an internal counter. Then, on each pause instruction
the hardware checks the elapsed number of cycles since the most recent
pause instruction against the pause Filter Threshold. If the elapsed cycle
count is greater than the pause filter threshold, then the internal pause
count is reloaded from VMCB and execution continues. If the elapsed cycle
count is less than the pause filter threshold, then the internal pause
count is decremented. If the count value is less than zero and pause
intercept is enabled, a #VMEXIT is triggered. If advanced pause filtering
is supported and pause filter threshold field is set to zero, the filter
will operate in the simpler, count only mode.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This patch brings some of the code from vmx to x86.h header file. Now, we
can share this code between vmx and svm. Modified couple functions to make
it common.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Get rid of ple_window_actual_max, because its benefits are really
minuscule and the logic is complicated.
The overflows(and underflow) are controlled in __ple_window_grow
and _ple_window_shrink respectively.
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
[Fixed potential wraparound and change the max to UINT_MAX. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The vmx module parameters are supposed to be unsigned variants.
Also fixed the checkpatch errors like the one below.
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
+module_param(ple_gap, uint, S_IRUGO);
Signed-off-by: Babu Moger <babu.moger@amd.com>
[Expanded uint to unsigned int in code. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Cortex-R8 has identical initialisation requirements to Cortex-R7, so
hook it up in proc-v7.S in the same way.
Signed-off-by: Luca Scalabrino <luca.scalabrino@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
CONFIG_FORTIFY_SOURCE detects various overflows at compile-time.
(6974f0c4555e ("include/linux/string.h:
add the option of fortified string.h functions)
ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and
run with CONFIG_FORTIFY_SOURCE.
Since ARM can be built and run with that flag like other architectures,
select ARCH_HAS_FORTIFY_SOURCE as default.
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
This reverts commit 4b1e84276a6172980c5bf39aa091ba13e90d6dad.
Software uses the valid bits to decide if the values can be used for
further processing or other actions. So setting the valid bits will have
software act on values that it shouldn't be acting on.
The recommendation to save all the register values does not mean that
the values are always valid.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: bp@suse.de
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180326191526.64314-1-Yazen.Ghannam@amd.com
|
|
A critical error was found testing the fixed UV4 HUB in that an MMR address
was found to be incorrect. This causes the virtual address space for
accessing the MMIOH1 region to be allocated with the incorrect size.
Fixes: 673aa20c55a1 ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes")
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/dt
Pull "i.MX device tree updates for 4.17, round 2" Shawn Guo:
- Add missing property '#sound-dai-cells' for sgtl5000 codec node
in imx6ul-isiot board to fix warning seen with DTC 1.4.6.
- Use stdout-path instead of linux,stdout-path to fix DTC warning
reported by DTC 1.4.6.
* tag 'imx-dt-4.17-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6ul-isiot: Pass the required '#sound-dai-cells'
ARM: dts: imx6-phytec: Use the standard 'stdout-path' property
|