Age | Commit message (Collapse) | Author |
|
Android needs to move large memory regions for garbage collection. The GC
requires moving physical pages of multi-gigabyte heap using mremap.
During this move, the application threads have to be paused for
correctness. It is critical to keep this pause as short as possible to
avoid jitters during user interaction.
Optimize mremap for >= 1GB-sized regions by moving at the PUD/PGD level if
the source and destination addresses are PUD-aligned. For
CONFIG_PGTABLE_LEVELS == 3, moving at the PUD level in effect moves PGD
entries, since the PUD entry is “folded back” onto the PGD entry. Add
HAVE_MOVE_PUD so that architectures where moving at the PUD level isn't
supported/tested can turn this off by not selecting the config.
Link: https://lkml.kernel.org/r/20201014005320.2233162-4-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Hassan Naveed <hnaveed@wavecomp.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jia He <justin.he@arm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: SeongJae Park <sjpark@amazon.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
For many workloads, pagetable consumption is significant and it makes
sense to expose it in the memory.stat for the memory cgroups. However at
the moment, the pagetables are accounted per-zone. Converting them to
per-node and using the right interface will correctly account for the
memory cgroups as well.
[akpm@linux-foundation.org: export __mod_lruvec_page_state to modules for arch/mips/kvm/]
Link: https://lkml.kernel.org/r/20201130212541.2781790-3-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
fork() for ptes") pages under a FOLL_PIN will not be write protected
during COW for fork. This means that pages returned from
pin_user_pages(FOLL_WRITE) should not become write protected while the pin
is active.
However, there is a small race where get_user_pages_fast(FOLL_PIN) can
establish a FOLL_PIN at the same time copy_present_page() is write
protecting it:
CPU 0 CPU 1
get_user_pages_fast()
internal_get_user_pages_fast()
copy_page_range()
pte_alloc_map_lock()
copy_present_page()
atomic_read(has_pinned) == 0
page_maybe_dma_pinned() == false
atomic_set(has_pinned, 1);
gup_pgd_range()
gup_pte_range()
pte_t pte = gup_get_pte(ptep)
pte_access_permitted(pte)
try_grab_compound_head()
pte = pte_wrprotect(pte)
set_pte_at();
pte_unmap_unlock()
// GUP now returns with a write protected page
The first attempt to resolve this by using the write protect caused
problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid
early COW write protect games during fork()")
Instead wrap copy_p4d_range() with the write side of a seqcount and check
the read side around gup_pgd_range(). If there is a collision then
get_user_pages_fast() fails and falls back to slow GUP.
Slow GUP is safe against this race because copy_page_range() is only
called while holding the exclusive side of the mmap_lock on the src
mm_struct.
[akpm@linux-foundation.org: coding style fixes]
Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com
Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com
Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de> [seqcount_t parts]
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "selftests/vm: gup_test, hmm-tests, assorted improvements", v3.
Summary: This series provides two main things, and a number of smaller
supporting goodies. The two main points are:
1) Add a new sub-test to gup_test, which in turn is a renamed version
of gup_benchmark. This sub-test allows nicer testing of dump_pages(),
at least on user-space pages.
For quite a while, I was doing a quick hack to gup_test.c whenever I
wanted to try out changes to dump_page(). Then Matthew Wilcox asked me
what I meant when I said "I used my dump_page() unit test", and I
realized that it might be nice to check in a polished up version of
that.
Details about how it works and how to use it are in the commit
description for patch #6 ("selftests/vm: gup_test: introduce the
dump_pages() sub-test").
2) Fixes a limitation of hmm-tests: these tests are incredibly useful,
but only if people actually build and run them. And it turns out that
libhugetlbfs is a little too effective at throwing a wrench in the
works, there. So I've added a little configuration check that removes
just two of the 21 hmm-tests, if libhugetlbfs is not available.
Further details in the commit description of patch #8
("selftests/vm: hmm-tests: remove the libhugetlbfs dependency").
Other smaller things that this series does:
a) Remove code duplication by creating gup_test.h.
b) Clear up the sub-test organization, and their invocation within
run_vmtests.sh.
c) Other minor assorted improvements.
[1] v2 is here:
https://lore.kernel.org/linux-doc/20200929212747.251804-1-jhubbard@nvidia.com/
[2] https://lore.kernel.org/r/CAHk-=wgh-TMPHLY3jueHX7Y2fWh3D+nMBqVS__AZm6-oorquWA@mail.gmail.com
This patch (of 9):
Rename nearly every "gup_benchmark" reference and file name to "gup_test".
The one exception is for the actual gup benchmark test itself.
The current code already does a *little* bit more than benchmarking, and
definitely covers more than get_user_pages_fast(). More importantly,
however, subsequent patches are about to add some functionality that is
non-benchmark related.
Closely related changes:
* Kconfig: in addition to renaming the options from GUP_BENCHMARK to
GUP_TEST, update the help text to reflect that it's no longer a
benchmark-only test.
Link: https://lkml.kernel.org/r/20201026064021.3545418-1-jhubbard@nvidia.com
Link: https://lkml.kernel.org/r/20201026064021.3545418-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There are a few spelling mistakes in the Kconfig comments and help text.
Fix these.
Link: https://lkml.kernel.org/r/20201207155004.171962-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 5.11
- PSCI relay at EL2 when "protected KVM" is enabled
- New exception injection code
- Simplification of AArch32 system register handling
- Fix PMU accesses when no PMU is enabled
- Expose CSV3 on non-Meltdown hosts
- Cache hierarchy discovery fixes
- PV steal-time cleanups
- Allow function pointers at EL2
- Various host EL2 entry cleanups
- Simplification of the EL2 vector allocation
|
|
The GHCB specification requires the hypervisor to save the address of an
AP Jump Table so that, for example, vCPUs that have been parked by UEFI
can be started by the OS. Provide support for the AP Jump Table set/get
exit code.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The memblock code ignores any memory that doesn't fit in the
linear mapping. In order to preserve the distance between two physical
memory locations and their mappings in the linear map, any hole between
two memory regions occupies the same space in the linear map.
On most systems, this is hardly a problem (the memory banks are close
together, and VA_BITS represents a large space compared to the available
memory *and* the potential gaps).
On NUMA systems, things are quite different: the gaps between the
memory nodes can be pretty large compared to the memory size itself,
and the range from memblock_start_of_DRAM() to memblock_end_of_DRAM()
can exceed the space described by VA_BITS.
Unfortunately, we're not very good at making this obvious to the user,
and on a D05 system (two sockets and 4 nodes with 64GB each)
accidentally configured with 39bit VA, we display something like this:
[ 0.000000] NUMA: NODE_DATA [mem 0x1ffbffe100-0x1ffbffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x2febfc1100-0x2febfc2fff]
[ 0.000000] NUMA: Initmem setup node 2 [<memory-less node>]
[ 0.000000] NUMA: NODE_DATA [mem 0x2febfbf200-0x2febfc10ff]
[ 0.000000] NUMA: NODE_DATA(2) on node 1
[ 0.000000] NUMA: Initmem setup node 3 [<memory-less node>]
[ 0.000000] NUMA: NODE_DATA [mem 0x2febfbd300-0x2febfbf1ff]
[ 0.000000] NUMA: NODE_DATA(3) on node 1
which isn't very explicit, and doesn't tell the user why 128GB
have suddently disappeared.
Let's add a warning message telling the user that memory has been
truncated, and offer a potential solution (bumping VA_BITS up).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20201215152918.1511108-1-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The irq descriptor is already there, no need to look it up again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20201210194043.769108348@linutronix.de
|
|
The irq descriptor is already there, no need to look it up again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-parisc@vger.kernel.org
Link: https://lore.kernel.org/r/20201210194043.659522455@linutronix.de
|
|
The irq descriptor is already there, no need to look it up again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201210194043.546326568@linutronix.de
|
|
The irq descriptor is already there, no need to look it up again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201210194043.454288890@linutronix.de
|
|
The SMP variant works perfectly fine on UP as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-parisc@vger.kernel.org
Link: https://lore.kernel.org/r/20201210194043.172893840@linutronix.de
|
|
This function uses irq_to_desc() and is going to be used by modules to
replace the open coded irq_to_desc() (ab)usage. The final goal is to remove
the export of irq_to_desc() so driver cannot fiddle with it anymore.
Move it into the core code and fixup the usage sites to include the proper
header.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201210194042.548936472@linutronix.de
|
|
* pm-sleep:
PM: sleep: Add dev_wakeup_path() helper
PM / suspend: fix kernel-doc markup
PM: sleep: Print driver flags for all devices during suspend/resume
* pm-acpi:
PM: ACPI: Refresh wakeup device power configuration every time
PM: ACPI: PCI: Drop acpi_pm_set_bridge_wakeup()
PM: ACPI: reboot: Use S5 for reboot
* pm-domains:
PM: domains: create debugfs nodes when adding power domains
PM: domains: replace -ENOTSUPP with -EOPNOTSUPP
* powercap:
powercap: Adjust printing the constraint name with new line
powercap: RAPL: Add AMD Fam19h RAPL support
powercap: Add AMD Fam17h RAPL support
powercap/intel_rapl_msr: Convert rapl_msr_priv into pointer
x86/msr-index: sort AMD RAPL MSRs by address
|
|
When building with W=1, GCC complains that we haven't defined prototypes
for a number of non-static functions in entry-common.c:
| arch/arm64/kernel/entry-common.c:203:25: warning: no previous prototype for 'el1_sync_handler' [-Wmissing-prototypes]
| 203 | asmlinkage void noinstr el1_sync_handler(struct pt_regs *regs)
| | ^~~~~~~~~~~~~~~~
| arch/arm64/kernel/entry-common.c:377:25: warning: no previous prototype for 'el0_sync_handler' [-Wmissing-prototypes]
| 377 | asmlinkage void noinstr el0_sync_handler(struct pt_regs *regs)
| | ^~~~~~~~~~~~~~~~
| arch/arm64/kernel/entry-common.c:447:25: warning: no previous prototype for 'el0_sync_compat_handler' [-Wmissing-prototypes]
| 447 | asmlinkage void noinstr el0_sync_compat_handler(struct pt_regs *regs)
| | ^~~~~~~~~~~~~~~~~~~~~~~
... and so automated build systems using W=1 end up sending a number of
emails, despite this not being a real problem as the only callers are in
entry.S where prototypes cannot matter.
For similar cases in entry-common.c we added prototypes to
asm/exception.h, so let's do the same thing here for consistency.
Note that there are a number of other warnings printed with W=1, both
under arch/arm64 and in core code, and this patch only addresses the
cases in entry-common.c. Automated build systems typically filter these
warnings such that they're only reported when changes are made nearby,
so we don't need to solve them all at once.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201214113353.44417-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The previous call to update_freq_counters_refs() has already updated the
per-cpu variables, don't overwrite them with the same value again.
Fixes: 4b9cf23c179a ("arm64: wrap and generalise counter read functions")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/7a171f710cdc0f808a2bfbd7db839c0d265527e7.1607579234.git.viresh.kumar@linaro.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This makes it easy to disable building with -Werror:
$ make defconfig
$ grep WERROR .config
# CONFIG_PPC_DISABLE_WERROR is not set
CONFIG_PPC_WERROR=y
$ make disable-werror.config
GEN Makefile
Using .config as base
Merging arch/powerpc/configs/disable-werror.config
Value of CONFIG_PPC_DISABLE_WERROR is redefined by fragment arch/powerpc/configs/disable-werror.config:
Previous value: # CONFIG_PPC_DISABLE_WERROR is not set
New value: CONFIG_PPC_DISABLE_WERROR=y
...
$ grep WERROR .config
CONFIG_PPC_DISABLE_WERROR=y
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201023040002.3313371-1-mpe@ellerman.id.au
|
|
Add a phony target for ppc64le_allnoconfig, which tests some
combinations of CONFIG symbols that aren't covered by any of our
defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125031551.2112715-1-mpe@ellerman.id.au
|
|
Sometimes we can't read an error log from OPAL, and we print an error
message accordingly. But the OPAL userspace tools seem to like retrying a
lot, in which case we flood the kernel log with a lot of messages.
Change pr_err() to pr_err_ratelimited() to help with this.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211021140.28402-1-ajd@linux.ibm.com
|
|
When attempting to remove by index a set of LMBs a lot of messages are
displayed on the console, even when everything goes fine:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000002d
Offlined Pages 4096
pseries-hotplug-mem: Memory at 2d0000000 was hot-removed
The 2 messages prefixed by "pseries-hotplug-mem" are not really
helpful for the end user, they should be debug outputs.
In case of error, because some of the LMB's pages couldn't be
offlined, the following is displayed on the console:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000003e
pseries-hotplug-mem: Failed to hot-remove memory at 3e0000000
dlpar: Could not handle DLPAR request "memory remove index 0x8000003e"
Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless,
and the generic DLPAR prefixed message should be enough.
These 2 first changes are mainly triggered by the changes introduced
in drmgr:
https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ
Also, when adding a bunch of LMBs, a message is displayed in the console per LMB
like these ones:
pseries-hotplug-mem: Memory at 7e0000000 (drc index 8000007e) was hot-added
pseries-hotplug-mem: Memory at 7f0000000 (drc index 8000007f) was hot-added
pseries-hotplug-mem: Memory at 800000000 (drc index 80000080) was hot-added
pseries-hotplug-mem: Memory at 810000000 (drc index 80000081) was hot-added
When adding 1TB of memory and LMB size is 256MB, this leads to 4096
messages to be displayed on the console. These messages are not really
helpful for the end user, so moving them to the DEBUG level.
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
[mpe: Tweak change log wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211145954.90143-1-ldufour@linux.ibm.com
|
|
with GCC10
ppc-linux-objdump -d vmlinux | grep -e "<csum_partial>" -e "<__csum_partial>"
With gcc9 I get:
c0017ef8 <__csum_partial>:
c00182fc: 4b ff fb fd bl c0017ef8 <__csum_partial>
c0018478: 4b ff fa 80 b c0017ef8 <__csum_partial>
c03e8458: 4b c2 fa a0 b c0017ef8 <__csum_partial>
c03e8518: 4b c2 f9 e1 bl c0017ef8 <__csum_partial>
c03ef410: 4b c2 8a e9 bl c0017ef8 <__csum_partial>
c03f0b24: 4b c2 73 d5 bl c0017ef8 <__csum_partial>
c04279a4: 4b bf 05 55 bl c0017ef8 <__csum_partial>
c0429820: 4b be e6 d9 bl c0017ef8 <__csum_partial>
c0429944: 4b be e5 b5 bl c0017ef8 <__csum_partial>
c042b478: 4b be ca 81 bl c0017ef8 <__csum_partial>
c042b554: 4b be c9 a5 bl c0017ef8 <__csum_partial>
c045f15c: 4b bb 8d 9d bl c0017ef8 <__csum_partial>
c0492190: 4b b8 5d 69 bl c0017ef8 <__csum_partial>
c0492310: 4b b8 5b e9 bl c0017ef8 <__csum_partial>
c0495594: 4b b8 29 65 bl c0017ef8 <__csum_partial>
c049c420: 4b b7 ba d9 bl c0017ef8 <__csum_partial>
c049c870: 4b b7 b6 89 bl c0017ef8 <__csum_partial>
c049c930: 4b b7 b5 c9 bl c0017ef8 <__csum_partial>
c04a9ca0: 4b b6 e2 59 bl c0017ef8 <__csum_partial>
c04bdde4: 4b b5 a1 15 bl c0017ef8 <__csum_partial>
c04be480: 4b b5 9a 79 bl c0017ef8 <__csum_partial>
c04be710: 4b b5 97 e9 bl c0017ef8 <__csum_partial>
c04c969c: 4b b4 e8 5d bl c0017ef8 <__csum_partial>
c04ca2fc: 4b b4 db fd bl c0017ef8 <__csum_partial>
c04cf5bc: 4b b4 89 3d bl c0017ef8 <__csum_partial>
c04d0440: 4b b4 7a b9 bl c0017ef8 <__csum_partial>
With gcc10 I get:
c0018d08 <__csum_partial>:
c0019020 <csum_partial>:
c0019020: 4b ff fc e8 b c0018d08 <__csum_partial>
c001914c: 4b ff fe d4 b c0019020 <csum_partial>
c0019250: 4b ff fd d1 bl c0019020 <csum_partial>
c03e404c <csum_partial>:
c03e404c: 4b c3 4c bc b c0018d08 <__csum_partial>
c03e4050: 4b ff ff fc b c03e404c <csum_partial>
c03e40fc: 4b ff ff 51 bl c03e404c <csum_partial>
c03e6680: 4b ff d9 cd bl c03e404c <csum_partial>
c03e68c4: 4b ff d7 89 bl c03e404c <csum_partial>
c03e7934: 4b ff c7 19 bl c03e404c <csum_partial>
c03e7bf8: 4b ff c4 55 bl c03e404c <csum_partial>
c03eb148: 4b ff 8f 05 bl c03e404c <csum_partial>
c03ecf68: 4b c2 bd a1 bl c0018d08 <__csum_partial>
c04275b8 <csum_partial>:
c04275b8: 4b bf 17 50 b c0018d08 <__csum_partial>
c0427884: 4b ff fd 35 bl c04275b8 <csum_partial>
c0427b18: 4b ff fa a1 bl c04275b8 <csum_partial>
c0427bd8: 4b ff f9 e1 bl c04275b8 <csum_partial>
c0427cd4: 4b ff f8 e5 bl c04275b8 <csum_partial>
c0427e34: 4b ff f7 85 bl c04275b8 <csum_partial>
c045a1c0: 4b bb eb 49 bl c0018d08 <__csum_partial>
c0489464 <csum_partial>:
c0489464: 4b b8 f8 a4 b c0018d08 <__csum_partial>
c04896b0: 4b ff fd b5 bl c0489464 <csum_partial>
c048982c: 4b ff fc 39 bl c0489464 <csum_partial>
c0490158: 4b b8 8b b1 bl c0018d08 <__csum_partial>
c0492f0c <csum_partial>:
c0492f0c: 4b b8 5d fc b c0018d08 <__csum_partial>
c049326c: 4b ff fc a1 bl c0492f0c <csum_partial>
c049333c: 4b ff fb d1 bl c0492f0c <csum_partial>
c0493b18: 4b ff f3 f5 bl c0492f0c <csum_partial>
c0493f50: 4b ff ef bd bl c0492f0c <csum_partial>
c0493ffc: 4b ff ef 11 bl c0492f0c <csum_partial>
c04a0f78: 4b b7 7d 91 bl c0018d08 <__csum_partial>
c04b3e3c: 4b b6 4e cd bl c0018d08 <__csum_partial>
c04b40d0 <csum_partial>:
c04b40d0: 4b b6 4c 38 b c0018d08 <__csum_partial>
c04b4448: 4b ff fc 89 bl c04b40d0 <csum_partial>
c04b46f4: 4b ff f9 dd bl c04b40d0 <csum_partial>
c04bf448: 4b b5 98 c0 b c0018d08 <__csum_partial>
c04c5264: 4b b5 3a a5 bl c0018d08 <__csum_partial>
c04c61e4: 4b b5 2b 25 bl c0018d08 <__csum_partial>
gcc10 defines multiple versions of csum_partial() which are just
an unconditionnal branch to __csum_partial().
To enforce inlining of that branch to __csum_partial(),
mark csum_partial() as __always_inline.
With this patch with gcc10:
c0018d08 <__csum_partial>:
c0019148: 4b ff fb c0 b c0018d08 <__csum_partial>
c001924c: 4b ff fa bd bl c0018d08 <__csum_partial>
c03e40ec: 4b c3 4c 1d bl c0018d08 <__csum_partial>
c03e4120: 4b c3 4b e8 b c0018d08 <__csum_partial>
c03eb004: 4b c2 dd 05 bl c0018d08 <__csum_partial>
c03ecef4: 4b c2 be 15 bl c0018d08 <__csum_partial>
c0427558: 4b bf 17 b1 bl c0018d08 <__csum_partial>
c04286e4: 4b bf 06 25 bl c0018d08 <__csum_partial>
c0428cd8: 4b bf 00 31 bl c0018d08 <__csum_partial>
c0428d84: 4b be ff 85 bl c0018d08 <__csum_partial>
c045a17c: 4b bb eb 8d bl c0018d08 <__csum_partial>
c0489450: 4b b8 f8 b9 bl c0018d08 <__csum_partial>
c0491860: 4b b8 74 a9 bl c0018d08 <__csum_partial>
c0492eec: 4b b8 5e 1d bl c0018d08 <__csum_partial>
c04a0eac: 4b b7 7e 5d bl c0018d08 <__csum_partial>
c04b3e34: 4b b6 4e d5 bl c0018d08 <__csum_partial>
c04b426c: 4b b6 4a 9d bl c0018d08 <__csum_partial>
c04b463c: 4b b6 46 cd bl c0018d08 <__csum_partial>
c04c004c: 4b b5 8c bd bl c0018d08 <__csum_partial>
c04c0368: 4b b5 89 a1 bl c0018d08 <__csum_partial>
c04c5254: 4b b5 3a b5 bl c0018d08 <__csum_partial>
c04c60d4: 4b b5 2c 35 bl c0018d08 <__csum_partial>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a1d31f84ddb0926813b17fcd5cc7f3fa7b4deac2.1602759123.git.christophe.leroy@csgroup.eu
|
|
Threshold Event Counter Multiplier (TECM) is part of Monitor Mode
Control Register A (MMCRA). This field along with Threshold Event
Counter Exponent (TECE) is used to get threshould counter value.
In Power10, this is a 8bit field, so patch fixes the
current code to modify the MMCRA[TECM] extraction macro to
handle this change. ISA v3.1 says this is a 7 bit field but
POWER10 it's actually 8 bits which will hopefully be fixed
in ISA v3.1 update.
Fixes: 170a315f41c6 ("powerpc/perf: Support to export MMCRA[TEC*] field to userspace")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1608022578-1532-1-git-send-email-atrajeev@linux.vnet.ibm.com
|
|
Commit 7bfe54b5f165 ("powerpc/mm: Refactor the floor/ceiling check in
hugetlb range freeing functions") inadvertely removed the mask
applied to start parameter in those two functions, leading to the
following crash on power9.
LTP: starting hugemmap05_1 (hugemmap05 -m)
------------[ cut here ]------------
kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:387!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 NUMA PowerNV
...
CPU: 99 PID: 308 Comm: ksoftirqd/99 Tainted: G O 5.10.0-rc7-next-20201211 #1
NIP: c00000000005dbec LR: c0000000003352f4 CTR: 0000000000000000
REGS: c00020000bb6f830 TRAP: 0700 Tainted: G O (5.10.0-rc7-next-20201211)
MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002284 XER: 20040000
GPR00: c0000000003352f4 c00020000bb6fad0 c000000007f70b00 c0002000385b3ff0
GPR04: 0000000000000000 0000000000000003 c00020000bb6f8b4 0000000000000001
GPR08: 0000000000000001 0000000000000009 0000000000000008 0000000000000002
GPR12: 0000000024002488 c000201fff649c00 c000000007f2a20c 0000000000000000
GPR16: 0000000000000007 0000000000000000 c000000000194d10 c000000000194d10
GPR24: 0000000000000014 0000000000000015 c000201cc6e72398 c000000007fac4b4
GPR28: c000000007f2bf80 c000000007fac2f8 0000000000000008 c000200033870000
NIP [c00000000005dbec] __tlb_remove_table+0x1dc/0x1e0
pgtable_free at arch/powerpc/mm/book3s64/pgtable.c:387
(inlined by) __tlb_remove_table at arch/powerpc/mm/book3s64/pgtable.c:405
LR [c0000000003352f4] tlb_remove_table_rcu+0x54/0xa0
Call Trace:
__tlb_remove_table+0x13c/0x1e0 (unreliable)
tlb_remove_table_rcu+0x54/0xa0
__tlb_remove_table_free at mm/mmu_gather.c:101
(inlined by) tlb_remove_table_rcu at mm/mmu_gather.c:156
rcu_core+0x35c/0xbb0
rcu_do_batch at kernel/rcu/tree.c:2502
(inlined by) rcu_core at kernel/rcu/tree.c:2737
__do_softirq+0x480/0x704
run_ksoftirqd+0x74/0xd0
run_ksoftirqd at kernel/softirq.c:651
(inlined by) run_ksoftirqd at kernel/softirq.c:642
smpboot_thread_fn+0x278/0x320
kthread+0x1c4/0x1d0
ret_from_kernel_thread+0x5c/0x80
Properly apply the masks before calling pmd_free_tlb() and
pud_free_tlb() respectively.
Fixes: 7bfe54b5f165 ("powerpc/mm: Refactor the floor/ceiling check in hugetlb range freeing functions")
Reported-by: Qian Cai <qcai@redhat.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/56feccd7b6fcd98e353361a233fa7bb8e67c3164.1607780469.git.christophe.leroy@csgroup.eu
|
|
According to ISAv3.1 and ISAv3.0b, the msgsndp is described to split
RB in:
msgtype <- (RB) 32:36
payload <- (RB) 37:63
t <- (RB) 57:63
The current way of getting 'msgtype', and 't' is missing their MSB:
msgtype: ((arg >> 27) & 0xf) : Gets (RB) 33:36, missing bit 32
t: (arg &= 0x3f) : Gets (RB) 58:63, missing bit 57
Fixes this by applying the correct mask.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201208215707.31149-1-leobras.c@gmail.com
|
|
Fix the following coccicheck warning:
./arch/powerpc/kvm/booke.c:503:6-16: WARNING: Comparison to bool
./arch/powerpc/kvm/booke.c:505:6-17: WARNING: Comparison to bool
./arch/powerpc/kvm/booke.c:507:6-16: WARNING: Comparison to bool
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1604764178-8087-1-git-send-email-kaixuxia@tencent.com
|
|
Fix the following coccinelle warnings:
./arch/powerpc/kvm/book3s_xics.c:476:3-15: WARNING: Assignment of 0/1 to bool variable
./arch/powerpc/kvm/book3s_xics.c:504:3-15: WARNING: Assignment of 0/1 to bool variable
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1604730382-5810-1-git-send-email-kaixuxia@tencent.com
|
|
An SEV-ES guest is started by invoking a new SEV initialization ioctl,
KVM_SEV_ES_INIT. This identifies the guest as an SEV-ES guest, which is
used to drive the appropriate ASID allocation, VMSA encryption, etc.
Before being able to run an SEV-ES vCPU, the vCPU VMSA must be encrypted
and measured. This is done using the LAUNCH_UPDATE_VMSA command after all
calls to LAUNCH_UPDATE_DATA have been performed, but before LAUNCH_MEASURE
has been performed. In order to establish the encrypted VMSA, the current
(traditional) VMSA and the GPRs are synced to the page that will hold the
encrypted VMSA and then LAUNCH_UPDATE_VMSA is invoked. The vCPU is then
marked as having protected guest state.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e9643245adb809caf3a87c09997926d2f3d6ff41.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The run sequence is different for an SEV-ES guest compared to a legacy or
even an SEV guest. The guest vCPU register state of an SEV-ES guest will
be restored on VMRUN and saved on VMEXIT. There is no need to restore the
guest registers directly and through VMLOAD before VMRUN and no need to
save the guest registers directly and through VMSAVE on VMEXIT.
Update the svm_vcpu_run() function to skip register state saving and
restoring and provide an alternative function for running an SEV-ES guest
in vmenter.S
Additionally, certain host state is restored across an SEV-ES VMRUN. As
a result certain register states are not required to be restored upon
VMEXIT (e.g. FS, GS, etc.), so only do that if the guest is not an SEV-ES
guest.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <fb1c66d32f2194e171b95fc1a8affd6d326e10c1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
An SEV-ES vCPU requires additional VMCB vCPU load/put requirements. SEV-ES
hardware will restore certain registers on VMEXIT, but not save them on
VMRUN (see Table B-3 and Table B-4 of the AMD64 APM Volume 2), so make the
following changes:
General vCPU load changes:
- During vCPU loading, perform a VMSAVE to the per-CPU SVM save area and
save the current values of XCR0, XSS and PKRU to the per-CPU SVM save
area as these registers will be restored on VMEXIT.
General vCPU put changes:
- Do not attempt to restore registers that SEV-ES hardware has already
restored on VMEXIT.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <019390e9cb5e93cd73014fa5a040c17d42588733.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
An SEV-ES vCPU requires additional VMCB initialization requirements for
vCPU creation and vCPU load/put requirements. This includes:
General VMCB initialization changes:
- Set a VMCB control bit to enable SEV-ES support on the vCPU.
- Set the VMCB encrypted VM save area address.
- CRx registers are part of the encrypted register state and cannot be
updated. Remove the CRx register read and write intercepts and replace
them with CRx register write traps to track the CRx register values.
- Certain MSR values are part of the encrypted register state and cannot
be updated. Remove certain MSR intercepts (EFER, CR_PAT, etc.).
- Remove the #GP intercept (no support for "enable_vmware_backdoor").
- Remove the XSETBV intercept since the hypervisor cannot modify XCR0.
General vCPU creation changes:
- Set the initial GHCB gpa value as per the GHCB specification.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <3a8aef366416eddd5556dfa3fdc212aafa1ad0a2.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
SEV and SEV-ES guests each have dedicated ASID ranges. Update the ASID
allocation routine to return an ASID in the respective range.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <d7aed505e31e3954268b2015bb60a1486269c780.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The SVM host save area is used to restore some host state on VMEXIT of an
SEV-ES guest. After allocating the save area, clear it and add the
encryption mask to the SVM host save area physical address that is
programmed into the VM_HSAVE_PA MSR.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <b77aa28af6d7f1a0cb545959e08d6dc75e0c3cba.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The GHCB specification defines how NMIs are to be handled for an SEV-ES
guest. To detect the completion of an NMI the hypervisor must not
intercept the IRET instruction (because a #VC while running the NMI will
issue an IRET) and, instead, must receive an NMI Complete exit event from
the guest.
Update the KVM support for detecting the completion of NMIs in the guest
to follow the GHCB specification. When an SEV-ES guest is active, the
IRET instruction will no longer be intercepted. Now, when the NMI Complete
exit event is received, the iret_interception() function will be called
to simulate the completion of the NMI.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5ea3dd69b8d4396cefdc9048ebc1ab7caa70a847.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The guest FPU state is automatically restored on VMRUN and saved on VMEXIT
by the hardware, so there is no reason to do this in KVM. Eliminate the
allocation of the guest_fpu save area and key off that to skip operations
related to the guest FPU state.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <173e429b4d0d962c6a443c4553ffdaf31b7665a4.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
SEV-ES guests do not currently support SMM. Update the has_emulated_msr()
kvm_x86_ops function to take a struct kvm parameter so that the capability
can be reported at a VM level.
Since this op is also called during KVM initialization and before a struct
kvm instance is available, comments will be added to each implementation
of has_emulated_msr() to indicate the kvm parameter can be null.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <75de5138e33b945d2fb17f81ae507bda381808e3.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Since many of the registers used by the SEV-ES are encrypted and cannot
be read or written, adjust the __get_sregs() / __set_sregs() to take into
account whether the VMSA/guest state is encrypted.
For __get_sregs(), return the actual value that is in use by the guest
for all registers being tracked using the write trap support.
For __set_sregs(), skip setting of all guest registers values.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <23051868db76400a9b07a2020525483a1e62dbcf.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.
SEV-ES guests introduce new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.
Add support to track the value of the guest CR8 register using the control
register write trap so that the hypervisor understands the guest operating
mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5a01033f4c8b3106ca9374b7cadf8e33da852df1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.
SEV-ES guests introduce new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.
Add support to track the value of the guest CR4 register using the control
register write trap so that the hypervisor understands the guest operating
mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c3880bf2db8693aa26f648528fbc6e967ab46e25.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.
SEV-ES support introduces new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.
Add support to track the value of the guest CR0 register using the control
register write trap so that the hypervisor understands the guest operating
mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <182c9baf99df7e40ad9617ff90b84542705ef0d7.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For SEV-ES guests, the interception of EFER write access is not
recommended. EFER interception occurs prior to EFER being modified and
the hypervisor is unable to modify EFER itself because the register is
located in the encrypted register state.
SEV-ES support introduces a new EFER write trap. This trap provides
intercept support of an EFER write after it has been modified. The new
EFER value is provided in the VMCB EXITINFO1 field, allowing the
hypervisor to track the setting of the guest EFER.
Add support to track the value of the guest EFER value using the EFER
write trap so that the hypervisor understands the guest operating mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <8993149352a3a87cd0625b3b61bfd31ab28977e1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For an SEV-ES guest, string-based port IO is performed to a shared
(un-encrypted) page so that both the hypervisor and guest can read or
write to it and each see the contents.
For string-based port IO operations, invoke SEV-ES specific routines that
can complete the operation using common KVM port IO support.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <9d61daf0ffda496703717218f415cdc8fd487100.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For an SEV-ES guest, MMIO is performed to a shared (un-encrypted) page
so that both the hypervisor and guest can read or write to it and each
see the contents.
The GHCB specification provides software-defined VMGEXIT exit codes to
indicate a request for an MMIO read or an MMIO write. Add support to
recognize the MMIO requests and invoke SEV-ES specific routines that
can complete the MMIO operation. These routines use common KVM support
to complete the MMIO operation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <af8de55127d5bcc3253d9b6084a0144c12307d4d.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add trace events for entry to and exit from VMGEXIT MSR protocol
processing. The vCPU will be common for the trace events. The MSR
protocol processing is guided by the GHCB GPA in the VMCB, so the GHCB
GPA will represent the input and output values for the entry and exit
events, respectively. Additionally, the exit event will contain the
return code for the event.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c5b3b440c3e0db43ff2fc02813faa94fa54896b0.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add trace events for entry to and exit from VMGEXIT processing. The vCPU
id and the exit reason will be common for the trace events. The exit info
fields will represent the input and output values for the entry and exit
events, respectively.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <25357dca49a38372e8f483753fb0c1c2a70a6898.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The GHCB specification defines a GHCB MSR protocol using the lower
12-bits of the GHCB MSR (in the hypervisor this corresponds to the
GHCB GPA field in the VMCB).
Function 0x100 is a request for termination of the guest. The guest has
encountered some situation for which it has requested to be terminated.
The GHCB MSR value contains the reason for the request.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <f3a1f7850c75b6ea4101e15bbb4a3af1a203f1dc.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The GHCB specification defines a GHCB MSR protocol using the lower
12-bits of the GHCB MSR (in the hypervisor this corresponds to the
GHCB GPA field in the VMCB).
Function 0x004 is a request for CPUID information. Only a single CPUID
result register can be sent per invocation, so the protocol defines the
register that is requested. The GHCB MSR value is set to the CPUID
register value as per the specification via the VMCB GHCB GPA field.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <fd7ee347d3936e484c06e9001e340bf6387092cd.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The GHCB specification defines a GHCB MSR protocol using the lower
12-bits of the GHCB MSR (in the hypervisor this corresponds to the
GHCB GPA field in the VMCB).
Function 0x002 is a request to set the GHCB MSR value to the SEV INFO as
per the specification via the VMCB GHCB GPA field.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c23c163a505290a0d1b9efc4659b838c8c902cbc.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
SEV-ES adds a new VMEXIT reason code, VMGEXIT. Initial support for a
VMGEXIT includes mapping the GHCB based on the guest GPA, which is
obtained from a new VMCB field, and then validating the required inputs
for the VMGEXIT exit reason.
Since many of the VMGEXIT exit reasons correspond to existing VMEXIT
reasons, the information from the GHCB is copied into the VMCB control
exit code areas and KVM register areas. The standard exit handlers are
invoked, similar to standard VMEXIT processing. Before restarting the
vCPU, the GHCB is updated with any registers that have been updated by
the hypervisor.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c6a4ed4294a369bd75c44d03bd7ce0f0c3840e50.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is a pre-patch to consolidate some exit handling code into callable
functions. Follow-on patches for SEV-ES exit handling will then be able
to use them from the sev.c file.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5b8b0ffca8137f3e1e257f83df9f5c881c8a96a3.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|