Age | Commit message (Collapse) | Author |
|
Add support for adding and parsing Hostwide elements to the
Guest-state-buffer data structure used in apiv2. These elements are used to
share meta-information pertaining to entire L1-Lpar and this
meta-information is maintained by L0-PowerVM hypervisor. Example of this
include the amount of the page-table memory currently used by L0-PowerVM
for hosting the Shadow-Pagetable of all active L2-Guests. More of the are
documented in kernel-documentation at [1]. The Hostwide GSB elements are
currently only support with H_GUEST_SET_STATE hcall with a special flag
namely 'KVMPPC_GS_FLAGS_HOST_WIDE'.
The patch introduces new defs for the 5 new Hostwide GSB elements including
their GSIDs as well as introduces a new class of GSB elements namely
'KVMPPC_GS_CLASS_HOSTWIDE' to indicate to GSB construction/parsing
infrastructure in 'kvm/guest-state-buffer.c'. Also
gs_msg_ops_vcpu_get_size(), kvmppc_gsid_type() and
kvmppc_gse_{flatten,unflatten}_iden() are updated to appropriately indicate
the needed size for these Hostwide GSB elements as well as how to
flatten/unflatten their GSIDs so that they can be marked as available in
GSB bitmap.
[1] Documention/arch/powerpc/kvm-nested.rst
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-3-vaibhav@linux.ibm.com
|
|
The Power11 architected and raw mode support in Linux was merged in commit
c2ed087ed35c ("powerpc: Add Power11 architected and raw mode"), and the
corresponding support in QEMU is pending in [1], which is currently in
its V6.
Currently, booting a KVM guest inside a pseries LPAR (Logical Partition)
on a kernel without P11 support results the guest boot in a Power10
compatibility mode (i.e., with logical PVR of Power10). However, booting
a KVM guest on a kernel with P11 support causes the following boot crash.
On a Power11 LPAR, the Power Hypervisor (L0) returns a support for both
Power10 and Power11 capabilities through H_GUEST_GET_CAPABILITIES hcall.
However, KVM currently supports only Power10 capabilities, resulting in
only Power10 capabilities being set as "nested capabilities" via an
H_GUEST_SET_CAPABILITIES hcall.
In the guest entry path, gs_msg_ops_kvmhv_nestedv2_config_fill_info() is
called by kvmhv_nestedv2_flush_vcpu() to fill the GSB (Guest State
Buffer) elements. The arch_compat is set to the logical PVR of Power11,
followed by an H_GUEST_SET_STATE hcall. This hcall returns
H_INVALID_ELEMENT_VALUE as a return code when setting a Power11 logical
PVR, as only Power10 capabilities were communicated as supported between
PHYP and KVM, utimately resulting in the KVM guest boot crash.
KVM: unknown exit, hardware reason ffffffffffffffea
NIP 000000007daf97e0 LR 000000007daf1aec CTR 000000007daf1ab4 XER 0000000020040000 CPU#0
MSR 8000000000103000 HID0 0000000000000000 HF 6c002000 iidx 3 didx 3
TB 00000000 00000000 DECR 0
GPR00 8000000000003000 000000007e580e20 000000007db26700 0000000000000000
GPR04 00000000041a0c80 000000007df7f000 0000000000200000 000000007df7f000
GPR08 000000007db6d5d8 000000007e65fa90 000000007db6d5d0 0000000000003000
GPR12 8000000000000001 0000000000000000 0000000000000000 0000000000000000
GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20 0000000000000000 0000000000000000 0000000000000000 000000007db21a30
GPR24 000000007db65000 0000000000000000 0000000000000000 0000000000000003
GPR28 000000007db6d5e0 000000007db22220 000000007daf27ac 000000007db75000
CR 20000404 [ E - - - - G - G ] RES 000@ffffffffffffffff
SRR0 000000007daf97e0 SRR1 8000000000102000 PVR 0000000000820200 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 000000000000ff20 SPRG2 0000000000000000 SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000
CFAR 0000000000000000
LPCR 0000000000020400
PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000
Fix this by adding the Power11 capability support and the required
plumbing in place.
Note:
* Booting a Power11 KVM nested PAPR guest requires [1] in QEMU.
[1] https://lore.kernel.org/all/20240731055022.696051-1-adityag@linux.ibm.com/
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241028101622.741573-1-amachhiw@linux.ibm.com
|
|
The nestedv2 APIs has the guest state element defined for HASHPKEYR
for the save-restore with L0. However, its ignored in the code.
The patch takes care of this for the HASHPKEYR GSID.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/171759286679.1480.17383725118762651985.stgit@linux.ibm.com
|
|
The nestedv2 APIs has the guest state element defined for HASHKEYR for
the save-restore with L0. However, its ignored in the code.
The patch takes care of this for the HASHKEYR GSID.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/171759284380.1480.15665015792935543933.stgit@linux.ibm.com
|
|
The nestedv2 APIs has the guest state element defined for DEXCR
for the save-restore with L0. However, its ignored in the code.
The patch takes care of this for the DEXCR GSID.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/171759281060.1480.654592298305141881.stgit@linux.ibm.com
|
|
state buffer
Add support for using DPDES in the library for using guest state
buffers. DPDES support is needed for enabling usage of doorbells in a L2
KVM on PAPR guest.
Fixes: 6ccbbc33f06a ("KVM: PPC: Add helper library for Guest State Buffers")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240605113913.83715-2-gautam@linux.ibm.com
|
|
gs_msg_ops_kvmhv_nestedv2_config_fill_info()
The return value of kvmppc_gse_put_buff_info() is not assigned to 'rc' and
'rc' is uninitialized at this point.
So the error handling can not work.
Assign the expected value to 'rc' to fix the issue.
Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/a7ed4cc12e0a0bbd97fac44fe6c222d1c393ec95.1706441651.git.christophe.jaillet@wanadoo.fr
|
|
Currently, rebooting a pseries nested qemu-kvm guest (L2) results in
below error as L1 qemu sends PVR value 'arch_compat' == 0 via
ppc_set_compat ioctl. This triggers a condition failure in
kvmppc_set_arch_compat() resulting in an EINVAL.
qemu-system-ppc64: Unable to set CPU compatibility mode in KVM: Invalid
argument
Also, a value of 0 for arch_compat generally refers the default
compatibility of the host. But, arch_compat, being a Guest Wide Element
in nested API v2, cannot be set to 0 in GSB as PowerVM (L0) expects a
non-zero value. A value of 0 triggers a kernel trap during a reboot and
consequently causes it to fail:
[ 22.106360] reboot: Restarting system
KVM: unknown exit, hardware reason ffffffffffffffea
NIP 0000000000000100 LR 000000000000fe44 CTR 0000000000000000 XER 0000000020040092 CPU#0
MSR 0000000000001000 HID0 0000000000000000 HF 6c000000 iidx 3 didx 3
TB 00000000 00000000 DECR 0
GPR00 0000000000000000 0000000000000000 c000000002a8c300 000000007fe00000
GPR04 0000000000000000 0000000000000000 0000000000001002 8000000002803033
GPR08 000000000a000000 0000000000000000 0000000000000004 000000002fff0000
GPR12 0000000000000000 c000000002e10000 0000000105639200 0000000000000004
GPR16 0000000000000000 000000010563a090 0000000000000000 0000000000000000
GPR20 0000000105639e20 00000001056399c8 00007fffe54abab0 0000000105639288
GPR24 0000000000000000 0000000000000001 0000000000000001 0000000000000000
GPR28 0000000000000000 0000000000000000 c000000002b30840 0000000000000000
CR 00000000 [ - - - - - - - - ] RES 000@ffffffffffffffff
SRR0 0000000000000000 SRR1 0000000000000000 PVR 0000000000800200 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 0000000000000000 SPRG2 0000000000000000 SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000
HSRR0 0000000000000000 HSRR1 0000000000000000
CFAR 0000000000000000
LPCR 0000000000020400
PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000
kernel:trap=0xffffffea | pc=0x100 | msr=0x1000
This patch updates kvmppc_set_arch_compat() to use the host PVR value if
'compat_pvr' == 0 indicating that qemu doesn't want to enforce any
specific PVR compat mode.
The relevant part of the code might need a rework if PowerVM implements
a support for `arch_compat == 0` in nestedv2 API.
Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests")
Reviewed-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240207054526.3720087-1-amachhiw@linux.ibm.com
|
|
In the nestedv2 case, the L1 may register the L2's VPA with the L0. This
allows the L0 to manage the L2's dispatch count, as well as enable
possible performance optimisations by seeing if certain resources are
not being used by the L2 (such as the PMCs).
Use the H_GUEST_SET_STATE call to inform the L0 of the L2's VPA
address. This can not be done in the H_GUEST_VCPU_RUN input buffer.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231201132618.555031-11-vaibhav@linux.ibm.com
|
|
A series of hcalls have been added to the PAPR which allow a regular
guest partition to create and manage guest partitions of its own. KVM
already had an interface that allowed this on powernv platforms. This
existing interface will now be called "nestedv1". The newly added PAPR
interface will be called "nestedv2". PHYP will support the nestedv2
interface. At this time the host side of the nestedv2 interface has not
been implemented on powernv but there is no technical reason why it
could not be added.
The nestedv1 interface is still supported.
Add support to KVM to utilize these hcalls to enable running nested
guests as a pseries guest on PHYP.
Overview of the new hcall usage:
- L1 and L0 negotiate capabilities with
H_GUEST_{G,S}ET_CAPABILITIES()
- L1 requests the L0 create a L2 with
H_GUEST_CREATE() and receives a handle to use in future hcalls
- L1 requests the L0 create a L2 vCPU with
H_GUEST_CREATE_VCPU()
- L1 sets up the L2 using H_GUEST_SET and the
H_GUEST_VCPU_RUN input buffer
- L1 requests the L0 runs the L2 vCPU using H_GUEST_VCPU_RUN()
- L2 returns to L1 with an exit reason and L1 reads the
H_GUEST_VCPU_RUN output buffer populated by the L0
- L1 handles the exit using H_GET_STATE if necessary
- L1 reruns L2 vCPU with H_GUEST_VCPU_RUN
- L1 frees the L2 in the L0 with H_GUEST_DELETE()
Support for the new API is determined by trying
H_GUEST_GET_CAPABILITIES. On a successful return, use the nestedv2
interface.
Use the vcpu register state setters for tracking modified guest state
elements and copy the thread wide values into the H_GUEST_VCPU_RUN input
buffer immediately before running a L2. The guest wide
elements can not be added to the input buffer so send them with a
separate H_GUEST_SET call if necessary.
Make the vcpu register getter load the corresponding value from the real
host with H_GUEST_GET. To avoid unnecessarily calling H_GUEST_GET, track
which values have already been loaded between H_GUEST_VCPU_RUN calls. If
an element is present in the H_GUEST_VCPU_RUN output buffer it also does
not need to be loaded again.
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-11-jniethe5@gmail.com
|