summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2021-03-16x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTARTOleg Nesterov
Save the current_thread_info()->status of X86 in the new restart_block->arch_data field so TS_COMPAT_RESTART can be removed again. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210201174716.GA17898@redhat.com
2021-03-16x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()Oleg Nesterov
The comment in get_nr_restart_syscall() says: * The problem is that we can get here when ptrace pokes * syscall-like values into regs even if we're not in a syscall * at all. Yes, but if not in a syscall then the status & (TS_COMPAT|TS_I386_REGS_POKED) check below can't really help: - TS_COMPAT can't be set - TS_I386_REGS_POKED is only set if regs->orig_ax was changed by 32bit debugger; and even in this case get_nr_restart_syscall() is only correct if the tracee is 32bit too. Suppose that a 64bit debugger plays with a 32bit tracee and * Tracee calls sleep(2) // TS_COMPAT is set * User interrupts the tracee by CTRL-C after 1 sec and does "(gdb) call func()" * gdb saves the regs by PTRACE_GETREGS * does PTRACE_SETREGS to set %rip='func' and %orig_rax=-1 * PTRACE_CONT // TS_COMPAT is cleared * func() hits int3. * Debugger catches SIGTRAP. * Restore original regs by PTRACE_SETREGS. * PTRACE_CONT get_nr_restart_syscall() wrongly returns __NR_restart_syscall==219, the tracee calls ia32_sys_call_table[219] == sys_madvise. Add the sticky TS_COMPAT_RESTART flag which survives after return to user mode. It's going to be removed in the next step again by storing the information in the restart block. As a further cleanup it might be possible to remove also TS_I386_REGS_POKED with that. Test-case: $ cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests $ gcc -o erestartsys-trap-debuggee ptrace-tests/tests/erestartsys-trap-debuggee.c --m32 $ gcc -o erestartsys-trap-debugger ptrace-tests/tests/erestartsys-trap-debugger.c -lutil $ ./erestartsys-trap-debugger Unexpected: retval 1, errno 22 erestartsys-trap-debugger: ptrace-tests/tests/erestartsys-trap-debugger.c:421 Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174709.GA17895@redhat.com
2021-03-16x86: Move TS_COMPAT back to asm/thread_info.hOleg Nesterov
Move TS_COMPAT back to asm/thread_info.h, close to TS_I386_REGS_POKED. It was moved to asm/processor.h by b9d989c7218a ("x86/asm: Move the thread_info::status field to thread_struct"), then later 37a8f7c38339 ("x86/asm: Move 'status' from thread_struct to thread_info") moved the 'status' field back but TS_COMPAT was forgotten. Preparatory patch to fix the COMPAT case for get_nr_restart_syscall() Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174649.GA17880@redhat.com
2021-03-16perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENTKan Liang
On a Haswell machine, the perf_fuzzer managed to trigger this message: [117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to write 0x0400000000000000) at rIP: 0xffffffff8106e4f4 (native_write_msr+0x4/0x20) [117248.089957] Call Trace: [117248.092685] intel_pmu_pebs_enable_all+0x31/0x40 [117248.097737] intel_pmu_enable_all+0xa/0x10 [117248.102210] __perf_event_task_sched_in+0x2df/0x2f0 [117248.107511] finish_task_switch.isra.0+0x15f/0x280 [117248.112765] schedule_tail+0xc/0x40 [117248.116562] ret_from_fork+0x8/0x30 A fake event called VLBR_EVENT may use the bit 58 of the PEBS_ENABLE, if the precise_ip is set. The bit 58 is reserved by the HW. Accessing the bit causes the unchecked MSR access error. The fake event doesn't support PEBS. The case should be rejected. Fixes: 097e4311cda9 ("perf/x86: Add constraint to create guest LBR event without hw counter") Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-2-git-send-email-kan.liang@linux.intel.com
2021-03-16perf/x86/intel: Fix a crash caused by zero PEBS statusKan Liang
A repeatable crash can be triggered by the perf_fuzzer on some Haswell system. https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/ For some old CPUs (HSW and earlier), the PEBS status in a PEBS record may be mistakenly set to 0. To minimize the impact of the defect, the commit was introduced to try to avoid dropping the PEBS record for some cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates the local pebs_status accordingly. However, it doesn't correct the PEBS status in the PEBS record, which may trigger the crash, especially for the large PEBS. It's possible that all the PEBS records in a large PEBS have the PEBS status 0. If so, the first get_next_pebs_record_by_bit() in the __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large PEBS, the 'count' parameter must > 1. The second get_next_pebs_record_by_bit() will crash. Besides the local pebs_status, correct the PEBS status in the PEBS record as well. Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event") Reported-by: Vince Weaver <vincent.weaver@maine.edu> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
2021-03-16KVM: x86/mmu: Store the address space ID in the TDP iteratorSean Christopherson
Store the address space ID in the TDP iterator so that it can be retrieved without having to bounce through the root shadow page. This streamlines the code and fixes a Sparse warning about not properly using rcu_dereference() when grabbing the ID from the root on the fly. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-5-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-16KVM: x86/mmu: Factor out tdp_iter_return_to_rootBen Gardon
In tdp_mmu_iter_cond_resched there is a call to tdp_iter_start which causes the iterator to continue its walk over the paging structure from the root. This is needed after a yield as paging structure could have been freed in the interim. The tdp_iter_start call is not very clear and something of a hack. It requires exposing tdp_iter fields not used elsewhere in tdp_mmu.c and the effect is not obvious from the function name. Factor a more aptly named function out of tdp_iter_start and call it from tdp_mmu_iter_cond_resched and tdp_iter_start. No functional change intended. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-4-bgardon@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-16KVM: x86/mmu: Fix RCU usage when atomically zapping SPTEsBen Gardon
Fix a missing rcu_dereference in tdp_mmu_zap_spte_atomic. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-3-bgardon@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-16KVM: x86/mmu: Fix RCU usage in handle_removed_tdp_mmu_pageBen Gardon
The pt passed into handle_removed_tdp_mmu_page does not need RCU protection, as it is not at any risk of being freed by another thread at that point. However, the implicit cast from tdp_sptep_t to u64 * dropped the __rcu annotation without a proper rcu_derefrence. Fix this by passing the pt as a tdp_ptep_t and then rcu_dereferencing it in the function. Suggested-by: Sean Christopherson <seanjc@google.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-14Merge tag 'objtool-urgent-2021-03-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Thomas Gleixner: "A single objtool fix to handle the PUSHF/POPF validation correctly for the paravirt changes which modified arch_local_irq_restore not to use popf" * tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool,x86: Fix uaccess PUSHF/POPF validation
2021-03-14Merge tag 'perf_urgent_for_v5.12-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Make sure PMU internal buffers are flushed for per-CPU events too and properly handle PID/TID for large PEBS. - Handle the case properly when there's no PMU and therefore return an empty list of perf MSRs for VMX to switch instead of reading random garbage from the stack. * tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR perf/core: Flush PMU internal buffers for per-CPU events
2021-03-14Merge tag 'x86_urgent_for_v5.12_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - A couple of SEV-ES fixes and robustifications: verify usermode stack pointer in NMI is not coming from the syscall gap, correctly track IRQ states in the #VC handler and access user insn bytes atomically in same handler as latter cannot sleep. - Balance 32-bit fast syscall exit path to do the proper work on exit and thus not confuse audit and ptrace frameworks. - Two fixes for the ORC unwinder going "off the rails" into KASAN redzones and when ORC data is missing. * tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev-es: Use __copy_from_user_inatomic() x86/sev-es: Correctly track IRQ states in runtime #VC handler x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack x86/sev-es: Introduce ip_within_syscall_gap() helper x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls x86/unwind/orc: Silence warnings caused by missing ORC data x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
2021-03-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "More fixes for ARM and x86" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: LAPIC: Advancing the timer expiration on guest initiated write KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged kvm: x86: annotate RCU pointers KVM: arm64: Fix exclusive limit for IPA size KVM: arm64: Reject VM creation when the default IPA size is unsupported KVM: arm64: Ensure I-cache isolation between vcpus of a same VM KVM: arm64: Don't use cbz/adr with external symbols KVM: arm64: Fix range alignment when walking page tables KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config() KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key KVM: arm64: Fix nVHE hyp panic host context restore KVM: arm64: Avoid corrupting vCPU context register in guest exit KVM: arm64: nvhe: Save the SPE context early kvm: x86: use NULL instead of using plain integer as pointer KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled' KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
2021-03-12Merge tag 'for-linus-5.12b-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two fix series and a single cleanup: - a small cleanup patch to remove unneeded symbol exports - a series to cleanup Xen grant handling (avoiding allocations in some cases, and using common defines for "invalid" values) - a series to address a race issue in Xen event channel handling" * tag 'for-linus-5.12b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Xen/gntdev: don't needlessly use kvcalloc() Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF} Xen/gntdev: don't needlessly allocate k{,un}map_ops[] Xen: drop exports of {set,clear}_foreign_p2m_mapping() xen/events: avoid handling the same event on two cpus at the same time xen/events: don't unmask an event channel when an eoi is pending xen/events: reset affinity of 2-level event when tearing it down
2021-03-12KVM: LAPIC: Advancing the timer expiration on guest initiated writeWanpeng Li
Advancing the timer expiration should only be necessary on guest initiated writes. When we cancel the timer and clear .pending during state restore, clear expired_tscdeadline as well. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1614818118-965-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-12KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive modeSean Christopherson
If mmu_lock is held for write, don't bother setting !PRESENT SPTEs to REMOVED_SPTE when recursively zapping SPTEs as part of shadow page removal. The concurrent write protections provided by REMOVED_SPTE are not needed, there are no backing page side effects to record, and MMIO SPTEs can be left as is since they are protected by the memslot generation, not by ensuring that the MMIO SPTE is unreachable (which is racy with respect to lockless walks regardless of zapping behavior). Skipping !PRESENT drastically reduces the number of updates needed to tear down sparsely populated MMUs, e.g. when tearing down a 6gb VM that didn't touch much memory, 6929/7168 (~96.6%) of SPTEs were '0' and could be skipped. Avoiding the write itself is likely close to a wash, but avoiding __handle_changed_spte() is a clear-cut win as that involves saving and restoring all non-volatile GPRs (it's a subtly big function), as well as several conditional branches before bailing out. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210310003029.1250571-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-12KVM: kvmclock: Fix vCPUs > 64 can't be online/hotplugedWanpeng Li
# lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 88 On-line CPU(s) list: 0-63 Off-line CPU(s) list: 64-87 # cat /proc/cmdline BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US .UTF-8 no-kvmclock-vsyscall # echo 1 > /sys/devices/system/cpu/cpu76/online -bash: echo: write error: Cannot allocate memory The per-cpu vsyscall pvclock data pointer assigns either an element of the static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory hvclock_mem (vCPU > 64), the dynamically memory will not be allocated if kvmclock vsyscall is disabled, this can result in cpu hotpluged fails in kvmclock_setup_percpu() which returns -ENOMEM. It's broken for no-vsyscall and sometimes you end up with vsyscall disabled if the host does something strange. This patch fixes it by allocating this dynamically memory unconditionally even if vsyscall is disabled. Fixes: 6a1cac56f4 ("x86/kvm: Use __bss_decrypted attribute in shared variables") Reported-by: Zelin Deng <zelin.deng@linux.alibaba.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: stable@vger.kernel.org#v4.19-rc5+ Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1614130683-24137-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-12kvm: x86: annotate RCU pointersMuhammad Usama Anjum
This patch adds the annotation to fix the following sparse errors: arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map * arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map * arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter * arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map * arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map * arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com> Message-Id: <20210305191123.GA497469@LEGION> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-12objtool,x86: Fix uaccess PUSHF/POPF validationPeter Zijlstra
Commit ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not use popf") replaced "push %reg; popf" with something like: "test $0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry that commit ea24213d8088 ("objtool: Add UACCESS validation") relies on. The result is: drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: PUSHF stack exhausted Meanwhile, commit c9c324dc22aa ("objtool: Support stack layout changes in alternatives") makes that we can actually use stack-ops in alternatives, which means we can revert 1ff865e343c2 ("x86,smap: Fix smap_{save,restore}() alternatives"). That in turn means we can limit the PUSHF/POPF handling of ea24213d8088 to those instructions that are in alternatives. Fixes: ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not use popf") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/YEY4rIbQYa5fnnEp@hirez.programming.kicks-ass.net
2021-03-11x86/paravirt: Have only one paravirt patch functionJuergen Gross
There is no need any longer to have different paravirt patch functions for native and Xen. Eliminate native_patch() and rename paravirt_patch_default() to paravirt_patch(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-15-jgross@suse.com
2021-03-11x86/paravirt: Switch functions with custom code to ALTERNATIVEJuergen Gross
Instead of using paravirt patching for custom code sequences use ALTERNATIVE for the functions with custom code replacements. Instead of patching an ud2 instruction for unpopulated vector entries into the caller site, use a simple function just calling BUG() as a replacement. Simplify the register defines for assembler paravirt calling, as there isn't much usage left. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-14-jgross@suse.com
2021-03-11x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEsJuergen Gross
Instead of using paravirt patching for custom code sequences add support for using ALTERNATIVE handling combined with paravirt call patching. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-13-jgross@suse.com
2021-03-11x86/paravirt: Switch iret pvops to ALTERNATIVEJuergen Gross
The iret paravirt op is rather special as it is using a jmp instead of a call instruction. Switch it to ALTERNATIVE. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-12-jgross@suse.com
2021-03-11x86/paravirt: Simplify paravirt macrosJuergen Gross
The central pvops call macros ____PVOP_CALL() and ____PVOP_VCALL() are looking very similar now. The main differences are using PVOP_VCALL_ARGS or PVOP_CALL_ARGS, which are identical, and the return value handling. So drop PVOP_VCALL_ARGS and instead of ____PVOP_VCALL() just use (void)____PVOP_CALL(long, ...). Note that it isn't easily possible to just redefine ____PVOP_VCALL() to use ____PVOP_CALL() instead, as this would require further hiding of commas in macro parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-11-jgross@suse.com
2021-03-11x86/paravirt: Remove no longer needed 32-bit pvops cruftJuergen Gross
PVOP_VCALL4() is only used for Xen PV, while PVOP_CALL4() isn't used at all. Keep PVOP_CALL4() for 64 bits due to symmetry reasons. This allows to remove the 32-bit definitions of those macros leading to a substantial simplification of the paravirt macros, as those were the only ones needing non-empty "pre" and "post" parameters. PVOP_CALLEE2() and PVOP_VCALLEE2() are used nowhere, so remove them. Another no longer needed case is special handling of return types larger than unsigned long. Replace that with a BUILD_BUG_ON(). DISABLE_INTERRUPTS() is used in 32-bit code only, so it can just be replaced by cli. INTERRUPT_RETURN in 32-bit code can be replaced by iret. ENABLE_INTERRUPTS is used nowhere, so it can be removed. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-10-jgross@suse.com
2021-03-11x86/paravirt: Add new features for paravirt patchingJuergen Gross
For being able to switch paravirt patching from special cased custom code sequences to ALTERNATIVE handling some X86_FEATURE_* are needed as new features. This enables to have the standard indirect pv call as the default code and to patch that with the non-Xen custom code sequence via ALTERNATIVE patching later. Make sure paravirt patching is performed before alternatives patching. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-9-jgross@suse.com
2021-03-11x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has()Juergen Gross
_static_cpu_has() contains a completely open coded version of ALTERNATIVE_TERNARY(). Replace that with the macro instead. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-8-jgross@suse.com
2021-03-11x86/alternative: Support ALTERNATIVE_TERNARYJuergen Gross
Add ALTERNATIVE_TERNARY support for replacing an initial instruction with either of two instructions depending on a feature: ALTERNATIVE_TERNARY "default_instr", FEATURE_NR, "feature_on_instr", "feature_off_instr" which will start with "default_instr" and at patch time will, depending on FEATURE_NR being set or not, patch that with either "feature_on_instr" or "feature_off_instr". [ bp: Add comment ontop. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-7-jgross@suse.com
2021-03-11x86/alternative: Support not-featureJuergen Gross
Add support for alternative patching for the case a feature is not present on the current CPU. For users of ALTERNATIVE() and friends, an inverted feature is specified by applying the ALT_NOT() macro to it, e.g.: ALTERNATIVE(old, new, ALT_NOT(feature)); Committer note: The decision to encode the NOT-bit in the feature bit itself is because a future change which would make objtool generate such alternative calls, would keep the code in objtool itself fairly simple. Also, this allows for the alternative macros to support the NOT feature without having to change them. Finally, the u16 cpuid member encoding the X86_FEATURE_ flags is not an ABI so if more bits are needed, cpuid itself can be enlarged or a flags field can be added to struct alt_instr after having considered the size growth in either cases. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-6-jgross@suse.com
2021-03-11x86/paravirt: Switch time pvops functions to use static_call()Juergen Gross
The time pvops functions are the only ones left which might be used in 32-bit mode and which return a 64-bit value. Switch them to use the static_call() mechanism instead of pvops, as this allows quite some simplification of the pvops implementation. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com
2021-03-11x86/alternative: Merge include filesJuergen Gross
Merge arch/x86/include/asm/alternative-asm.h into arch/x86/include/asm/alternative.h in order to make it easier to use common definitions later. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-2-jgross@suse.com
2021-03-11x86/setup: Remove unused RESERVE_BRK_ARRAY()Cao jin
Since a13f2ef168cb ("x86/xen: remove 32-bit Xen PV guest support"), RESERVE_BRK_ARRAY() has no user anymore so drop it. Update related comments too. Signed-off-by: Cao jin <jojing64@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311083919.27530-1-jojing64@gmail.com
2021-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2021-03-10 The following pull-request contains BPF updates for your *net* tree. We've added 8 non-merge commits during the last 5 day(s) which contain a total of 11 files changed, 136 insertions(+), 17 deletions(-). The main changes are: 1) Reject bogus use of vmlinux BTF as map/prog creation BTF, from Alexei Starovoitov. 2) Fix allocation failure splat in x86 JIT for large progs. Also fix overwriting percpu cgroup storage from tracing programs when nested, from Yonghong Song. 3) Fix rx queue retrieval in XDP for multi-queue veth, from Maciej Fijalkowski. 4) Fix bpf_check_mtu() helper API before freeze to have mtu_len as custom skb/xdp L3 input length, from Jesper Dangaard Brouer. 5) Fix inode_storage's lookup_elem return value upon having bad fd, from Tal Lossos. 6) Fix bpftool and libbpf cross-build on MacOS, from Georgi Valkov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-10Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}Jan Beulich
It's not helpful if every driver has to cook its own. Generalize xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which shouldn't have expanded to zero to begin with). Use the constants in p2m.c and gntdev.c right away, and update field types where necessary so they would match with the constants' types (albeit without touching struct ioctl_gntdev_grant_ref's ref field, as that's part of the public interface of the kernel and would require introducing a dependency on Xen's grant_table.h public header). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-10Xen: drop exports of {set,clear}_foreign_p2m_mapping()Jan Beulich
They're only used internally, and the layering violation they contain (x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly advise against any (uncontrolled) use from a module. The functions also never had users except the ones from drivers/xen/grant-table.c forever since their introduction in 3.15. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-10x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" caseSean Christopherson
Initialize x86_pmu.guest_get_msrs to return 0/NULL to handle the "nop" case. Patching in perf_guest_get_msrs_nop() during setup does not work if there is no PMU, as setup bails before updating the static calls, leaving x86_pmu.guest_get_msrs NULL and thus a complete nop. Ultimately, this causes VMX abort on VM-Exit due to KVM putting random garbage from the stack into the MSR load list. Add a comment in KVM to note that nr_msrs is valid if and only if the return value is non-NULL. Fixes: abd562df94d1 ("x86/perf: Use static_call for x86_pmu.guest_get_msrs") Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot+cce9ef2dd25246f815ee@syzkaller.appspotmail.com Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309171019.1125243-1-seanjc@google.com
2021-03-10stacktrace: Move documentation for arch_stack_walk_reliable() to headerMark Brown
Currently arch_stack_walk_reliable() is documented with an identical comment in both x86 and S/390 implementations which is a bit redundant. Move this to the header and convert to kerneldoc while we're at it. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lkml.kernel.org/r/20210309194125.652-1-broonie@kernel.org
2021-03-09Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau. 2) TX skb error handling fix in mt76 driver, also from Felix. 3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman. 4) Avoid double free of percpu pointers when freeing a cloned bpf prog. From Cong Wang. 5) Use correct printf format for dma_addr_t in ath11k, from Geert Uytterhoeven. 6) Fix resolve_btfids build with older toolchains, from Kun-Chuan Hsieh. 7) Don't report truncated frames to mac80211 in mt76 driver, from Lorenzop Bianconi. 8) Fix watcdog timeout on suspend/resume of stmmac, from Joakim Zhang. 9) mscc ocelot needs NET_DEVLINK selct in Kconfig, from Arnd Bergmann. 10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from Arjun Roy. 11) Ignore routes with deleted nexthop object in mlxsw, from Ido Schimmel. 12) Need to undo tcp early demux lookup sometimes in nf_nat, from Florian Westphal. 13) Fix gro aggregation for udp encaps with zero csum, from Daniel Borkmann. 14) Make sure to always use imp*_ndo_send when necessaey, from Jason A. Donenfeld. 15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov. 16) prevent overly huge skb allocationsd in qrtr, from Pavel Skripkin. 17) Prevent rx ring copnsumer index loss of sync in enetc, from Vladimir Oltean. 18) Make sure textsearch copntrol block is large enough, from Wilem de Bruijn. 19) Revert MAC changes to r8152 leading to instability, from Hates Wang. 20) Advance iov in 9p even for empty reads, from Jissheng Zhang. 21) Double hook unregister in nftables, from PabloNeira Ayuso. 22) Fix memleak in ixgbe, fropm Dinghao Liu. 23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne. 24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang Tang. 25) Fix DOI refcount bugs in cipso, from Paul Moore. 26) One too many irqsave in ibmvnic, from Junlin Yang. 27) Fix infinite loop with MPLS gso segmenting via virtio_net, from Balazs Nemeth. * git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits) s390/qeth: fix notification for pending buffers during teardown s390/qeth: schedule TX NAPI on QAOB completion s390/qeth: improve completion of pending TX buffers s390/qeth: fix memory leak after failed TX Buffer allocation net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 net: check if protocol extracted by virtio_net_hdr_set_proto is correct net: dsa: xrs700x: check if partner is same as port in hsr join net: lapbether: Remove netif_start_queue / netif_stop_queue atm: idt77252: fix null-ptr-dereference atm: uPD98402: fix incorrect allocation atm: fix a typo in the struct description net: qrtr: fix error return code of qrtr_sendmsg() mptcp: fix length of ADD_ADDR with port sub-option net: bonding: fix error return code of bond_neigh_init() net: enetc: allow hardware timestamping on TX queues with tc-etf enabled net: enetc: set MAC RX FIFO to recommended value net: davicom: Use platform_get_irq_optional() net: davicom: Fix regulator not turned off on driver removal net: davicom: Fix regulator not turned off on failed probe net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports ...
2021-03-10bpf, x86: Use kvmalloc_array instead kmalloc_array in bpf_jit_compYonghong Song
x86 bpf_jit_comp.c used kmalloc_array to store jited addresses for each bpf insn. With a large bpf program, we have see the following allocation failures in our production server: page allocation failure: order:5, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0" Call Trace: dump_stack+0x50/0x70 warn_alloc.cold.120+0x72/0xd2 ? __alloc_pages_direct_compact+0x157/0x160 __alloc_pages_slowpath+0xcdb/0xd00 ? get_page_from_freelist+0xe44/0x1600 ? vunmap_page_range+0x1ba/0x340 __alloc_pages_nodemask+0x2c9/0x320 kmalloc_order+0x18/0x80 kmalloc_order_trace+0x1d/0xa0 bpf_int_jit_compile+0x1e2/0x484 ? kmalloc_order_trace+0x1d/0xa0 bpf_prog_select_runtime+0xc3/0x150 bpf_prog_load+0x480/0x720 ? __mod_memcg_lruvec_state+0x21/0x100 __do_sys_bpf+0xc31/0x2040 ? close_pdeo+0x86/0xe0 do_syscall_64+0x42/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f2f300f7fa9 Code: Bad RIP value. Dumped assembly: ffffffff810b6d70 <bpf_int_jit_compile>: ; { ffffffff810b6d70: e8 eb a5 b4 00 callq 0xffffffff81c01360 <__fentry__> ffffffff810b6d75: 41 57 pushq %r15 ... ffffffff810b6f39: e9 72 fe ff ff jmp 0xffffffff810b6db0 <bpf_int_jit_compile+0x40> ; addrs = kmalloc_array(prog->len + 1, sizeof(*addrs), GFP_KERNEL); ffffffff810b6f3e: 8b 45 0c movl 12(%rbp), %eax ; return __kmalloc(bytes, flags); ffffffff810b6f41: be c0 0c 00 00 movl $3264, %esi ; addrs = kmalloc_array(prog->len + 1, sizeof(*addrs), GFP_KERNEL); ffffffff810b6f46: 8d 78 01 leal 1(%rax), %edi ; if (unlikely(check_mul_overflow(n, size, &bytes))) ffffffff810b6f49: 48 c1 e7 02 shlq $2, %rdi ; return __kmalloc(bytes, flags); ffffffff810b6f4d: e8 8e 0c 1d 00 callq 0xffffffff81287be0 <__kmalloc> ; if (!addrs) { ffffffff810b6f52: 48 85 c0 testq %rax, %rax Change kmalloc_array() to kvmalloc_array() to avoid potential allocation error for big bpf programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210309015647.3657852-1-yhs@fb.com
2021-03-09x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()Juergen Gross
The macro ALTINSTR_REPLACEMENT() doesn't make use of the feature parameter, so drop it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309134813.23912-4-jgross@suse.com
2021-03-09x86/sev-es: Use __copy_from_user_inatomic()Joerg Roedel
The #VC handler must run in atomic context and cannot sleep. This is a problem when it tries to fetch instruction bytes from user-space via copy_from_user(). Introduce a insn_fetch_from_user_inatomic() helper which uses __copy_from_user_inatomic() to safely copy the instruction bytes to kernel memory in the #VC handler. Fixes: 5e3427a7bc432 ("x86/sev-es: Handle instruction fetches from user-space") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-6-joro@8bytes.org
2021-03-09x86/sev-es: Correctly track IRQ states in runtime #VC handlerJoerg Roedel
Call irqentry_nmi_enter()/irqentry_nmi_exit() in the #VC handler to correctly track the IRQ state during its execution. Fixes: 0786138c78e79 ("x86/sev-es: Add a Runtime #VC Exception Handler") Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-5-joro@8bytes.org
2021-03-09x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stackJoerg Roedel
The code in the NMI handler to adjust the #VC handler IST stack is needed in case an NMI hits when the #VC handler is still using its IST stack. But the check for this condition also needs to look if the regs->sp value is trusted, meaning it was not set by user-space. Extend the check to not use regs->sp when the NMI interrupted user-space code or the SYSCALL gap. Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-3-joro@8bytes.org
2021-03-08x86/virtio: Have SEV guests enforce restricted virtio memory accessTom Lendacky
An SEV guest requires that virtio devices use the DMA API to allow the hypervisor to successfully access guest memory as needed. The VIRTIO_F_VERSION_1 and VIRTIO_F_ACCESS_PLATFORM features tell virtio to use the DMA API. Add arch_has_restricted_virtio_memory_access() for x86, to fail the device probe if these features have not been set for the device when running as an SEV guest. [ bp: Fix -Wmissing-prototypes warning Reported-by: kernel test robot <lkp@intel.com> ] Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/b46e0211f77ca1831f11132f969d470a6ffc9267.1614897610.git.thomas.lendacky@amd.com
2021-03-08clocksource/drivers/hyper-v: Move handling of STIMER0 interruptsMichael Kelley
STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main SITMER0 driver code, and bypass it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no STIMER0 interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-11-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08clocksource/drivers/hyper-v: Handle sched_clock differences inlineMichael Kelley
While the Hyper-V Reference TSC code is architecture neutral, the pv_ops.time.sched_clock() function is implemented for x86/x64, but not for ARM64. Current code calls a utility function under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether GENERIC_SCHED_CLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-9-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08clocksource/drivers/hyper-v: Handle vDSO differences inlineMichael Kelley
While the driver for the Hyper-V Reference TSC and STIMERs is architecture neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code calls into utility functions under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-8-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08Drivers: hv: vmbus: Move handling of VMbus interruptsMichael Kelley
VMbus interrupts are most naturally modelled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core VMbus interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main VMbus driver, and bypassing it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no VMbus interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-7-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08Drivers: hv: vmbus: Handle auto EOI quirk inlineMichael Kelley
On x86/x64, Hyper-V provides a flag to indicate auto EOI functionality, but it doesn't on ARM64. Handle this quirk inline instead of calling into code under arch/x86 (and coming, under arch/arm64). No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-6-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08Drivers: hv: vmbus: Move hyperv_report_panic_msg to arch neutral codeMichael Kelley
With the new Hyper-V MSR set function, hyperv_report_panic_msg() can be architecture neutral, so move it out from under arch/x86 and merge into hv_kmsg_dump(). This move also avoids needing a separate implementation under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-5-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>