Age | Commit message (Collapse) | Author |
|
KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
- Setup empty IRQ routing when creating a VM to avoid having to synchronize
SRCU when creating a split IRQCHIP on x86.
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
that arch code can use for hooking both sched_in() and sched_out().
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace detect bugs.
- Mark a vCPU as preempted if and only if it's scheduled out while in the
KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
memory when retrieving guest state during live migration blackout.
- A few minor cleanups
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
|
|
Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code. It iterates on the
memory range and calls the arch-specific function. The implementation is
optional and enabled by a Kconfig symbol.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a module description for kvm.ko to fix a 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/kvm/kvm.o
Opportunistically update kvm_main.c's comically stale file comment to
match the module description.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240622-md-kvm-v2-1-29a60f7c48b1@quicinc.com
[sean: split x86 changes to a separate commit, remove stale VT-x comment]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
KVM fixes for 6.10
- Fix a "shift too big" goof in the KVM_SEV_INIT2 selftest.
- Compute the max mappable gfn for KVM selftests on x86 using GuestMaxPhyAddr
from KVM's supported CPUID (if it's available).
- Fix a race in kvm_vcpu_on_spin() by ensuring loads and stores are atomic.
- Fix technically benign bug in __kvm_handle_hva_range() where KVM consumes
the return from a void-returning function as if it were a boolean.
|
|
Some allocations done by KVM are temporary, they are created as result
of program actions, but can't exists for arbitrary long times.
They should have been GFP_TEMPORARY (rip!).
OTOH, kvm-nx-lpage-recovery and kvm-pit kernel threads exist for as long
as VM exists but their task_struct memory is not accounted.
This is story for another day.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Message-ID: <c0122f66-f428-417e-a360-b25fc0f154a0@p183>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Mark a vCPU as preempted/ready if-and-only-if it's scheduled out while
running. i.e. Do not mark a vCPU preempted/ready if it's scheduled out
during a non-KVM_RUN ioctl() or when userspace is doing KVM_RUN with
immediate_exit.
Commit 54aa83c90198 ("KVM: x86: do not set st->preempted when going back
to user space") stopped marking a vCPU as preempted when returning to
userspace, but if userspace then invokes a KVM vCPU ioctl() that gets
preempted, the vCPU will be marked preempted/ready. This is arguably
incorrect behavior since the vCPU was not actually preempted while the
guest was running, it was preempted while doing something on behalf of
userspace.
Marking a vCPU preempted iff its running also avoids KVM dirtying guest
memory after userspace has paused vCPUs, e.g. for live migration, which
allows userspace to collect the final dirty bitmap before or in parallel
with saving vCPU state, without having to worry about saving vCPU state
triggering writes to guest memory.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-4-dmatlack@google.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Ensure that any new KVM code that references immediate_exit gets extra
scrutiny by renaming it to immediate_exit__unsafe in kernel code.
All fields in struct kvm_run are subject to TOCTOU races since they are
mapped into userspace, which may be malicious or buggy. To protect KVM,
introduces a new macro that appends __unsafe to select field names in
struct kvm_run, hinting to developers and reviewers that accessing such
fields must be done carefully.
Apply the new macro to immediate_exit, since userspace can make
immediate_exit inconsistent with vcpu->wants_to_run, i.e. accessing
immediate_exit directly could lead to unexpected bugs in the future.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-3-dmatlack@google.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run
loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit
was not set.
Replace all references to vcpu->run->immediate_exit with
!vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a
malicious userspace could invoked KVM_RUN with immediate_exit=true and
then after KVM reads it to set wants_to_run=false, flip it to false.
This would result in the vCPU running in KVM_RUN with
wants_to_run=false. This wouldn't cause any real bugs today but is a
dangerous landmine.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-2-dmatlack@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
If, on a 64 bit system, a vCPU ID is provided that has the upper 32 bits
set to a non-zero value, it may get accepted if the truncated to 32 bits
integer value is below KVM_MAX_VCPU_IDS and 'max_vcpus'. This feels very
wrong and triggered the reporting logic of PaX's SIZE_OVERFLOW plugin.
Instead of silently truncating and accepting such values, pass the full
value to kvm_vm_ioctl_create_vcpu() and make the existing limit checks
return an error.
Even if this is a userland ABI breaking change, no sane userland could
have ever relied on that behaviour.
Reported-by: PaX's SIZE_OVERFLOW plugin running on grsecurity's syzkaller
Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface")
Cc: Emese Revfy <re.emese@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Link: https://lore.kernel.org/r/20240614202859.3597745-2-minipli@grsecurity.net
[sean: tweak comment about INT_MAX assertion]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Bail from outer address space loop, not just the inner memslot loop, when
a "null" handler is encountered by __kvm_handle_hva_range(), which is the
intended behavior. On x86, which has multiple address spaces thanks to
SMM emulation, breaking from just the memslot loop results in undefined
behavior due to assigning the non-existent return value from kvm_null_fn()
to a bool.
In practice, the bug is benign as kvm_mmu_notifier_invalidate_range_end()
is the only caller that passes handler=kvm_null_fn, and it doesn't set
flush_on_ret, i.e. assigning garbage to r.ret is ultimately ignored. And
for most configuration the compiler elides the entire sequence, i.e. there
is no undefined behavior at runtime.
------------[ cut here ]------------
UBSAN: invalid-load in arch/x86/kvm/../../../virt/kvm/kvm_main.c:655:10
load of value 160 is not a valid value for type '_Bool'
CPU: 370 PID: 8246 Comm: CPU 0/KVM Not tainted 6.8.2-amdsos-build58-ubuntu-22.04+ #1
Hardware name: AMD Corporation Sh54p/Sh54p, BIOS WPC4429N 04/25/2024
Call Trace:
<TASK>
dump_stack_lvl+0x48/0x60
ubsan_epilogue+0x5/0x30
__ubsan_handle_load_invalid_value+0x79/0x80
kvm_mmu_notifier_invalidate_range_end.cold+0x18/0x4f [kvm]
__mmu_notifier_invalidate_range_end+0x63/0xe0
__split_huge_pmd+0x367/0xfc0
do_huge_pmd_wp_page+0x1cc/0x380
__handle_mm_fault+0x8ee/0xe50
handle_mm_fault+0xe4/0x4a0
__get_user_pages+0x190/0x840
get_user_pages_unlocked+0xe0/0x590
hva_to_pfn+0x114/0x550 [kvm]
kvm_faultin_pfn+0xed/0x5b0 [kvm]
kvm_tdp_page_fault+0x123/0x170 [kvm]
kvm_mmu_page_fault+0x244/0xaa0 [kvm]
vcpu_enter_guest+0x592/0x1070 [kvm]
kvm_arch_vcpu_ioctl_run+0x145/0x8a0 [kvm]
kvm_vcpu_ioctl+0x288/0x6d0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x77/0x120
entry_SYSCALL_64_after_hwframe+0x6e/0x76
</TASK>
---[ end trace ]---
Fixes: 071064f14d87 ("KVM: Don't take mmu_lock for range invalidation unless necessary")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/b8723d39903b64c241c50f5513f804390c7b5eec.1718203311.git.babu.moger@amd.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The error path for OOM when allocating buses used to return -ENOMEM using
the local variable 'r', where 'r' was initialized at the top of the
function. But a new "r = kvm_init_irq_routing(kvm);" was introduced in
the middle of the function, so now the error code is not set and it
eventually leads to a NULL dereference due to kvm_dev_ioctl_create_vm()
thinking kvm_create_vm() succeeded. Set the error code back to -ENOMEM.
Opportunistically tweak the logic to pre-set "r = -ENOMEM" immediately
before the flows that can fail due to memory allocation failure to make
it less likely that the bug recurs in the future.
Fixes: fbe4a7e881d4 ("KVM: Setup empty IRQ routing when creating a VM")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/02051e0a-09d8-49a2-917f-7c2f278a1ba1@moroto.mountain
[sean: tweak all of the "r = -ENOMEM" sites, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Delete kvm_arch_sched_in() now that all implementations are nops.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a kvm_vcpu.scheduled_out flag to track if a vCPU is in the process of
being scheduled out (vCPU put path), or if the vCPU is being reloaded
after being scheduled out (vCPU load path). In the short term, this will
allow dropping kvm_arch_sched_in(), as arch code can query scheduled_out
during kvm_arch_vcpu_load().
Longer term, scheduled_out opens up other potential optimizations, without
creating subtle/brittle dependencies. E.g. it allows KVM to keep guest
state (that is managed via kvm_arch_vcpu_{load,put}()) loaded across
kvm_sched_{out,in}(), if KVM knows the state isn't accessed by the host
kernel. Forcing arch code to coordinate between kvm_arch_sched_{in,out}()
and kvm_arch_vcpu_{load,put}() is awkward, not reusable, and relies on the
exact ordering of calls into arch code.
Adding scheduled_out also obviates the need for a kvm_arch_sched_out()
hook, e.g. if arch code needs to do something novel when putting vCPU
state.
And even if KVM never uses scheduled_out for anything beyond dropping
kvm_arch_sched_in(), just being able to remove all of the arch stubs makes
it worth adding the flag.
Link: https://lore.kernel.org/all/20240430224431.490139-1-seanjc@google.com
Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Setup empty IRQ routing during VM creation so that x86 and s390 don't need
to set empty/dummy IRQ routing during KVM_CREATE_IRQCHIP (in future
patches). Initializing IRQ routing before there are any potential readers
allows KVM to avoid the synchronize_srcu() in kvm_set_irq_routing(), which
can introduces 20+ milliseconds of latency in the VM creation path.
Ensuring that all VMs have non-NULL IRQ routing also hardens KVM against
misbehaving userspace VMMs, e.g. RISC-V dynamically instantiates its
interrupt controller, but doesn't override kvm_arch_intc_initialized() or
kvm_arch_irqfd_allowed(), and so can likely reach kvm_irq_map_gsi()
without fully initialized IRQ routing.
Signed-off-by: Yi Wang <foxywang@tencent.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20240506101751.3145407-2-foxywang@tencent.com
[sean: init refcount after IRQ routing, fix stub, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use {READ,WRITE}_ONCE() to access kvm->last_boosted_vcpu to ensure the
loads and stores are atomic. In the extremely unlikely scenario the
compiler tears the stores, it's theoretically possible for KVM to attempt
to get a vCPU using an out-of-bounds index, e.g. if the write is split
into multiple 8-bit stores, and is paired with a 32-bit load on a VM with
257 vCPUs:
CPU0 CPU1
last_boosted_vcpu = 0xff;
(last_boosted_vcpu = 0x100)
last_boosted_vcpu[15:8] = 0x01;
i = (last_boosted_vcpu = 0x1ff)
last_boosted_vcpu[7:0] = 0x00;
vcpu = kvm->vcpu_array[0x1ff];
As detected by KCSAN:
BUG: KCSAN: data-race in kvm_vcpu_on_spin [kvm] / kvm_vcpu_on_spin [kvm]
write to 0xffffc90025a92344 of 4 bytes by task 4340 on cpu 16:
kvm_vcpu_on_spin (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112) kvm
handle_pause (arch/x86/kvm/vmx/vmx.c:5929) kvm_intel
vmx_handle_exit (arch/x86/kvm/vmx/vmx.c:?
arch/x86/kvm/vmx/vmx.c:6606) kvm_intel
vcpu_run (arch/x86/kvm/x86.c:11107 arch/x86/kvm/x86.c:11211) kvm
kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:?) kvm
kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:?) kvm
__se_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:904 fs/ioctl.c:890)
__x64_sys_ioctl (fs/ioctl.c:890)
x64_sys_call (arch/x86/entry/syscall_64.c:33)
do_syscall_64 (arch/x86/entry/common.c:?)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
read to 0xffffc90025a92344 of 4 bytes by task 4342 on cpu 4:
kvm_vcpu_on_spin (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4069) kvm
handle_pause (arch/x86/kvm/vmx/vmx.c:5929) kvm_intel
vmx_handle_exit (arch/x86/kvm/vmx/vmx.c:?
arch/x86/kvm/vmx/vmx.c:6606) kvm_intel
vcpu_run (arch/x86/kvm/x86.c:11107 arch/x86/kvm/x86.c:11211) kvm
kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:?) kvm
kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:?) kvm
__se_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:904 fs/ioctl.c:890)
__x64_sys_ioctl (fs/ioctl.c:890)
x64_sys_call (arch/x86/entry/syscall_64.c:33)
do_syscall_64 (arch/x86/entry/common.c:?)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
value changed: 0x00000012 -> 0x00000000
Fixes: 217ece6129f2 ("KVM: use yield_to instead of sleep in kvm_vcpu_on_spin")
Cc: stable@vger.kernel.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240510092353.2261824-1-leitao@debian.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Default halt_poll_ns_shrink value of 0 always resets polling interval
to 0 on an un-successful poll where vcpu wakeup is not received. This is
mostly to avoid pointless polling for more number of shorter intervals. But
disabled shrink assumes vcpu wakeup is less likely to be received in
subsequent shorter polling intervals. Another side effect of 0 shrink value
is that, even on a successful poll if total block time was greater than
current polling interval, the polling interval starts over from 0 instead
of shrinking by a factor.
Enabling shrink with value of 2 allows the polling interval to gradually
decrement in case of un-successful poll events as well. This gives a fair
chance for successful polling events in subsequent polling intervals rather
than resetting it to 0 and starting over from grow_start.
Below kvm stat log snippet shows interleaved growth and shrinking of
polling interval:
87162647182125: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0)
87162647637763: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000)
87162649627943: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000)
87162650892407: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000)
87162651540378: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000)
87162652276768: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000)
87162652515037: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000)
87162653383787: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000)
87162653627670: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (shrink 20000)
87162653796321: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000)
87162656171645: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (shrink 20000)
87162661607487: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000)
Having both grow and shrink enabled creates a balance in polling interval
growth and shrink behavior. Tests show improved successful polling attempt
ratio which contribute to VM performance. Power penalty is quite negligible
as shrunk polling intervals create bursts of very short durations.
Performance assessment results show 3-6% improvements in CPU+GPU, Memory
and Storage Android VM workloads whereas 5-9% improvement in average FPS of
gaming VM workloads.
Power penalty is below 1% where host OS is either idle or running a
native workload having 2 VMs enabled. CPU/GPU intensive gaming workloads
as well do not show any increased power overhead with shrink enabled.
Co-developed-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com>
Signed-off-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com>
Signed-off-by: Parshuram Sangle <parshuram.sangle@intel.com>
Link: https://lore.kernel.org/r/20231102154628.2120-2-parshuram.sangle@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
After
faf01aef0570 ("KVM: PPC: Merge powerpc's debugfs entry content into generic entry")
kvm_debugfs_dir is not used anywhere else outside of kvm_main.c
Unexport it and make it static.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240515150804.9354-1-bp@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
|
|
KVM x86 misc changes for 6.10:
- Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which
is unused by hardware, so that KVM can communicate its inability to map GPAs
that set bits 51:48 due to lack of 5-level paging. Guest firmware is
expected to use the information to safely remap BARs in the uppermost GPA
space, i.e to avoid placing a BAR at a legal, but unmappable, GPA.
- Use vfree() instead of kvfree() for allocations that always use vcalloc()
or __vcalloc().
- Don't completely ignore same-value writes to immutable feature MSRs, as
doing so results in KVM failing to reject accesses to MSR that aren't
supposed to exist given the vCPU model and/or KVM configuration.
- Don't mark APICv as being inhibited due to ABSENT if APICv is disabled
KVM-wide to avoid confusing debuggers (KVM will never bother clearing the
ABSENT inhibit, even if userspace enables in-kernel local APIC).
|
|
KVM cleanups for 6.10:
- Misc cleanups extracted from the "exit on missing userspace mapping" series,
which has been put on hold in anticipation of a "KVM Userfault" approach,
which should provide a superset of functionality.
- Remove kvm_make_all_cpus_request_except(), which got added to hack around an
AVIC bug, and then became dead code when a more robust fix came along.
- Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 6.10
- Move a lot of state that was previously stored on a per vcpu
basis into a per-CPU area, because it is only pertinent to the
host while the vcpu is loaded. This results in better state
tracking, and a smaller vcpu structure.
- Add full handling of the ERET/ERETAA/ERETAB instructions in
nested virtualisation. The last two instructions also require
emulating part of the pointer authentication extension.
As a result, the trap handling of pointer authentication has
been greattly simplified.
- Turn the global (and not very scalable) LPI translation cache
into a per-ITS, scalable cache, making non directly injected
LPIs much cheaper to make visible to the vcpu.
- A batch of pKVM patches, mostly fixes and cleanups, as the
upstreaming process seems to be resuming. Fingers crossed!
- Allocate PPIs and SGIs outside of the vcpu structure, allowing
for smaller EL2 mapping and some flexibility in implementing
more or less than 32 private IRQs.
- Purge stale mpidr_data if a vcpu is created after the MPIDR
map has been created.
- Preserve vcpu-specific ID registers across a vcpu reset.
- Various minor cleanups and improvements.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.10
1. Add ParaVirt IPI support.
2. Add software breakpoint support.
3. Add mmio trace events support.
|
|
... and centralize the VM_IO/VM_PFNMAP sanity check in there. We'll
now also perform these sanity checks for direct follow_pte()
invocations.
For generic_access_phys(), we might now check multiple times: nothing to
worry about, really.
Link: https://lkml.kernel.org/r/20240410155527.474777-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Sean Christopherson <seanjc@google.com> [KVM]
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Fei Li <fei1.li@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Yonghua Huang <yonghua.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove kvm_make_all_cpus_request_except() as it effectively has no users,
and arguably should never have been added in the first place.
Commit 54163a346d4a ("KVM: Introduce kvm_make_all_cpus_request_except()")
added the "except" variation for use in SVM's AVIC update path, which used
it to skip sending a request to the current vCPU (commit 7d611233b016
("KVM: SVM: Disable AVIC before setting V_IRQ")).
But the AVIC usage of kvm_make_all_cpus_request_except() was essentially a
hack-a-fix that simply squashed the most likely scenario of a racy WARN
without addressing the underlying problem(s). Commit f1577ab21442 ("KVM:
SVM: svm_set_vintr don't warn if AVIC is active but is about to be
deactivated") eventually fixed the WARN itself, and the "except" usage was
subsequently dropped by df63202fe52b ("KVM: x86: APICv: drop immediate
APICv disablement on current vCPU").
That kvm_make_all_cpus_request_except() hasn't gained any users in the
last ~3 years isn't a coincidence. If a VM-wide broadcast *needs* to skip
the current vCPU, then odds are very good that there is underlying bug
that could be better fixed elsewhere.
Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>
Link: https://lore.kernel.org/r/20240404232651.1645176-1-venkateshs@chromium.org
[sean: rewrite changelog with --verbose]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
A subsequent change to KVM/arm64 will necessitate walking the device
list outside of the kvm->lock. Prepare by converting to an rculist. This
has zero effect on the VM destruction path, as it is expected every
reader is backed by a reference on the kvm struct.
On the other hand, ensure a given device is completely destroyed before
dropping the kvm->lock in the release() path, as certain devices expect
to be a singleton (e.g. the vfio-kvm device).
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20240422200158.2606761-2-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add support to MMU caches for initializing a page with a custom 64-bit
value, e.g. to pre-fill an entire page table with non-zero PTE values.
The functionality will be used by x86 to support Intel's TDX, which needs
to set bit 63 in all non-present PTEs in order to prevent !PRESENT page
faults from getting reflected into the guest (Intel's EPT Violation #VE
architecture made the less than brilliant decision of having the per-PTE
behavior be opt-out instead of opt-in).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <5919f685f109a1b0ebc6bd8fc4536ee94bcc172d.1705965635.git.isaku.yamahata@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove gfn_to_pfn_cache_invalidate_start()'s unused @may_block parameter,
which was leftover from KVM's abandoned (for now) attempt to support guest
usage of gfn_to_pfn caches.
Fixes: a4bff3df5147 ("KVM: pfncache: remove KVM_GUEST_USES_PFN usage")
Reported-by: Like Xu <like.xu.linux@gmail.com>
Cc: Paul Durrant <paul@xen.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20240305003742.245767-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The only user was kvm_mmu_notifier_change_pte(), which is now gone.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240405115815.3226315-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The .change_pte() MMU notifier callback was intended as an
optimization. The original point of it was that KSM could tell KVM to flip
its secondary PTE to a new location without having to first zap it. At
the time there was also an .invalidate_page() callback; both of them were
*not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(),
and .invalidate_page() also doubled as a fallback implementation of
.change_pte().
Later on, however, both callbacks were changed to occur within an
invalidate_range_start/end() block.
In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to
set_pte_at_notify with invalidate_range_start and invalidate_range_end",
2012-10-09) did so to remove the fallback from .invalidate_page() to
.change_pte() and allow sleepable .invalidate_page() hooks.
This however made KVM's usage of the .change_pte() callback completely
moot, because KVM unmaps the sPTEs during .invalidate_range_start()
and therefore .change_pte() has no hope of finding a sPTE to change.
Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as
well as all the architecture specific implementations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240405115815.3226315-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM_HVA_ERR_RO_BAD satisfies kvm_is_error_hva(), so there's no need to
duplicate the "if (writable)" block. Fix this by bringing all
kvm_is_error_hva() cases under one conditional.
Signed-off-by: Anish Moorthy <amoorthy@google.com>
Link: https://lore.kernel.org/r/20240215235405.368539-5-amoorthy@google.com
[sean: use ternary operator]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The (gfn, data, offset, len) order of parameters is a little strange
since "offset" applies to "gfn" rather than to "data". Add function
comments to make things perfectly clear.
Signed-off-by: Anish Moorthy <amoorthy@google.com>
Link: https://lore.kernel.org/r/20240215235405.368539-3-amoorthy@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The current description can be read as "atomic -> allowed to sleep,"
when in fact the intended statement is "atomic -> NOT allowed to sleep."
Make that clearer in the docstring.
Signed-off-by: Anish Moorthy <amoorthy@google.com>
Link: https://lore.kernel.org/r/20240215235405.368539-2-amoorthy@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
commit 37b2a6510a48("KVM: use __vcalloc for very large allocations")
replaced kvzalloc()/kvcalloc() with vcalloc(), but didn't replace kvfree()
with vfree().
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/r/20240131012357.53563-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
KVM x86 misc changes for 6.9:
- Explicitly initialize a variety of on-stack variables in the emulator that
triggered KMSAN false positives (though in fairness in KMSAN, it's comically
difficult to see that the uninitialized memory is never truly consumed).
- Fix the deubgregs ABI for 32-bit KVM, and clean up code related to reading
DR6 and DR7.
- Rework the "force immediate exit" code so that vendor code ultimately
decides how and when to force the exit. This allows VMX to further optimize
handling preemption timer exits, and allows SVM to avoid sending a duplicate
IPI (SVM also has a need to force an exit).
- Fix a long-standing bug where kvm_has_noapic_vcpu could be left elevated if
vCPU creation ultimately failed, and add WARN to guard against similar bugs.
- Provide a dedicated arch hook for checking if a different vCPU was in-kernel
(for directed yield), and simplify the logic for checking if the currently
loaded vCPU is in-kernel.
- Misc cleanups and fixes.
|
|
KVM common MMU changes for 6.9:
- Harden KVM against underflowing the active mmu_notifier invalidation
count, so that "bad" invalidations (usually due to bugs elsehwere in the
kernel) are detected earlier and are less likely to hang the kernel.
- Fix a benign bug in __kvm_mmu_topup_memory_cache() where the object size
and number of objects parameters to kvmalloc_array() were swapped.
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.9
- Infrastructure for building KVM's trap configuration based on the
architectural features (or lack thereof) advertised in the VM's ID
registers
- Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
x86's WC) at stage-2, improving the performance of interacting with
assigned devices that can tolerate it
- Conversion of KVM's representation of LPIs to an xarray, utilized to
address serialization some of the serialization on the LPI injection
path
- Support for _architectural_ VHE-only systems, advertised through the
absence of FEAT_E2H0 in the CPU's ID register
- Miscellaneous cleanups, fixes, and spelling corrections to KVM and
selftests
|
|
https://github.com/kvm-x86/linux into HEAD
KVM GUEST_MEMFD fixes for 6.8:
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to
avoid creating ABI that KVM can't sanely support.
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly
clear that such VMs are purely a development and testing vehicle, and
come with zero guarantees.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan
is to support confidential VMs with deterministic private memory (SNP
and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD negative test that resulted in false passes
when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
|
|
The general expectation with debugfs is that any initialization failure
is nonfatal. Nevertheless, kvm_arch_create_vm_debugfs() allows
implementations to return an error and kvm_create_vm_debugfs() allows
that to fail VM creation.
Change to a void return to discourage architectures from making debugfs
failures fatal for the VM. Seems like everyone already had the right
idea, as all implementations already return 0 unconditionally.
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20240216155941.2029458-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Disallow creating read-only memslots that support GUEST_MEMFD, as
GUEST_MEMFD is fundamentally incompatible with KVM's semantics for
read-only memslots. Read-only memslots allow the userspace VMM to emulate
option ROMs by filling the backing memory with readable, executable code
and data, while triggering emulated MMIO on writes. GUEST_MEMFD doesn't
currently support writes from userspace and KVM doesn't support emulated
MMIO on private accesses, i.e. the guest can only ever read zeros, and
writes will always be treated as errors.
Cc: Fuad Tabba <tabba@google.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Isaku Yamahata <isaku.yamahata@gmail.com>
Cc: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
Link: https://lore.kernel.org/r/20240222190612.2942589-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
gcc-14 notices that the arguments to kvmalloc_array() are mixed up:
arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function '__kvm_mmu_topup_memory_cache':
arch/x86/kvm/../../../virt/kvm/kvm_main.c:424:53: error: 'kvmalloc_array' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
424 | mc->objects = kvmalloc_array(sizeof(void *), capacity, gfp);
| ^~~~
arch/x86/kvm/../../../virt/kvm/kvm_main.c:424:53: note: earlier argument should specify number of elements, later size of each element
The code still works correctly, but the incorrect order prevents the compiler
from properly tracking the object sizes.
Fixes: 837f66c71207 ("KVM: Allow for different capacities in kvm_mmu_memory_cache structs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240212112419.1186065-1-arnd@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a comment to explain why KVM treats vCPUs with pending interrupts as
in-kernel when a vCPU wants to yield to a vCPU that was preempted while
running in kernel mode.
Link: https://lore.kernel.org/r/20240110003938.490206-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Plumb in a dedicated hook for querying whether or not a vCPU was preempted
in-kernel. Unlike literally every other architecture, x86's VMX can check
if a vCPU is in kernel context if and only if the vCPU is loaded on the
current pCPU.
x86's kvm_arch_vcpu_in_kernel() works around the limitation by querying
kvm_get_running_vcpu() and redirecting to vcpu->arch.preempted_in_kernel
as needed. But that's unnecessary, confusing, and fragile, e.g. x86 has
had at least one bug where KVM incorrectly used a stale
preempted_in_kernel.
No functional change intended.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240110003938.490206-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
KVM uses __KVM_HAVE_* symbols in the architecture-dependent uapi/asm/kvm.h to mask
unused definitions in include/uapi/linux/kvm.h. __KVM_HAVE_READONLY_MEM however
was nothing but a misguided attempt to define KVM_CAP_READONLY_MEM only on
architectures where KVM_CHECK_EXTENSION(KVM_CAP_READONLY_MEM) could possibly
return nonzero. This however does not make sense, and it prevented userspace
from supporting this architecture-independent feature without recompilation.
Therefore, these days __KVM_HAVE_READONLY_MEM does not mask anything and
is only used in virt/kvm/kvm_main.c. Userspace does not need to test it
and there should be no need for it to exist. Remove it and replace it
with a Kconfig symbol within Linux source code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When handling the end of an mmu_notifier invalidation, WARN if
mn_active_invalidate_count is already 0 do not decrement it further, i.e.
avoid causing mn_active_invalidate_count to underflow/wrap. In the worst
case scenario, effectively corrupting mn_active_invalidate_count could
cause kvm_swap_active_memslots() to hang indefinitely.
end() calls are *supposed* to be paired with start(), i.e. underflow can
only happen if there is a bug elsewhere in the kernel, but due to lack of
lockdep assertions in the mmu_notifier helpers, it's all too easy for a
bug to go unnoticed for some time, e.g. see the recently introduced
PAGEMAP_SCAN ioctl().
Ideally, mmu_notifiers would incorporate lockdep assertions, but users of
mmu_notifiers aren't required to hold any one specific lock, i.e. adding
the necessary annotations to make lockdep aware of all locks that are
mutally exclusive with mm_take_all_locks() isn't trivial.
Link: https://lore.kernel.org/all/000000000000f6d051060c6785bc@google.com
Link: https://lore.kernel.org/r/20240110004239.491290-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Common KVM changes for 6.8:
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.
|
|
KVM/riscv changes for 6.8 part #1
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Steal time account support along with selftest
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.8
1. Optimization for memslot hugepage checking.
2. Cleanup and fix some HW/SW timer issues.
3. Add LSX/LASX (128bit/256bit SIMD) support.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/arm64 fixes for 6.7, part #2
- Ensure a vCPU's redistributor is unregistered from the MMIO bus
if vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
|
|
Instead of having a comment indicating the need to hold slots_lock
when calling kvm_io_bus_register_dev(), make it explicit with
a lockdep assertion.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231207151201.3028710-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|