summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2021-08-02x86/kvm: remove non-x86 stuff from arch/x86/kvm/ioapic.hJuergen Gross
The file has been moved to arch/x86 long time ago. Time to get rid of non-x86 stuff. Signed-off-by: Juergen Gross <jgross@suse.com> Message-Id: <20210701154105.23215-3-jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: X86: Add per-vm stat for max rmap list sizePeter Xu
Add a new statistic max_mmu_rmap_size, which stores the maximum size of rmap for the vm. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210625153214.43106-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: x86/mmu: Return old SPTE from mmu_spte_clear_track_bits()Sean Christopherson
Return the old SPTE when clearing a SPTE and push the "old SPTE present" check to the caller. Private shadow page support will use the old SPTE in rmap_remove() to determine whether or not there is a linked private shadow page. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <b16bac1fd1357aaf39e425aab2177d3f89ee8318.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: x86/mmu: Refactor shadow walk in __direct_map() to reduce indentationSean Christopherson
Employ a 'continue' to reduce the indentation for linking a new shadow page during __direct_map() in preparation for linking private pages. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <702419686d5700373123f6ea84e7a946c2cad8b4.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: x86: Hoist kvm_dirty_regs check out of sync_regs()Sean Christopherson
Move the kvm_dirty_regs vs. KVM_SYNC_X86_VALID_FIELDS check out of sync_regs() and into its sole caller, kvm_arch_vcpu_ioctl_run(). This allows a future patch to allow synchronizing select state for protected VMs. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <889017a8d31cea46472e0c64b234ef5919278ed9.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: x86/mmu: Mark VM as bugged if page fault returns RET_PF_INVALIDSean Christopherson
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <298980aa5fc5707184ac082287d13a800cd9c25f.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VMSean Christopherson
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <0e8760a26151f47dc47052b25ca8b84fffe0641e.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-31iscsi_ibft: fix crash due to KASLR physical memory remappingMaurizio Lombardi
Starting with commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") memory reservations have been moved earlier during the boot process, before the execution of the Kernel Address Space Layout Randomization code. setup_arch() calls the iscsi_ibft's find_ibft_region() function to find and reserve the memory dedicated to the iBFT and this function also saves a virtual pointer to the iBFT table for later use. The problem is that if KALSR is active, the physical memory gets remapped somewhere else in the virtual address space and the pointer is no longer valid, this will cause a kernel panic when the iscsi driver tries to dereference it. iBFT detected. BUG: unable to handle page fault for address: ffff888000099fd8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ..snip.. Call Trace: ? ibft_create_kobject+0x1d2/0x1d2 [iscsi_ibft] do_one_initcall+0x44/0x1d0 ? kmem_cache_alloc_trace+0x119/0x220 do_init_module+0x5c/0x270 __do_sys_init_module+0x12e/0x1b0 do_syscall_64+0x40/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this bug by saving the address of the physical location of the ibft; later the driver will use isa_bus_to_virt() to get the correct virtual address. N.B. On each reboot KASLR randomizes the virtual addresses so assuming phys_to_virt before KASLR does its deed is incorrect. Simplify the code by renaming find_ibft_region() to reserve_ibft_region() and remove all the wrappers. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
2021-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicting commits, all resolutions pretty trivial: drivers/bus/mhi/pci_generic.c 5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU") 56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean") drivers/nfc/s3fwrn5/firmware.c a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label") 46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") 801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") MAINTAINERS 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") 8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-30Merge tag 'net-5.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi (mac80211) and netfilter trees. Current release - regressions: - mac80211: fix starting aggregation sessions on mesh interfaces Current release - new code bugs: - sctp: send pmtu probe only if packet loss in Search Complete state - bnxt_en: add missing periodic PHC overflow check - devlink: fix phys_port_name of virtual port and merge error - hns3: change the method of obtaining default ptp cycle - can: mcba_usb_start(): add missing urb->transfer_dma initialization Previous releases - regressions: - set true network header for ECN decapsulation - mlx5e: RX, avoid possible data corruption w/ relaxed ordering and LRO - phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811 PHY - sctp: fix return value check in __sctp_rcv_asconf_lookup Previous releases - always broken: - bpf: - more spectre corner case fixes, introduce a BPF nospec instruction for mitigating Spectre v4 - fix OOB read when printing XDP link fdinfo - sockmap: fix cleanup related races - mac80211: fix enabling 4-address mode on a sta vif after assoc - can: - raw: raw_setsockopt(): fix raw_rcv panic for sock UAF - j1939: j1939_session_deactivate(): clarify lifetime of session object, avoid UAF - fix number of identical memory leaks in USB drivers - tipc: - do not blindly write skb_shinfo frags when doing decryption - fix sleeping in tipc accept routine" * tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) gve: Update MAINTAINERS list can: esd_usb2: fix memory leak can: ems_usb: fix memory leak can: usb_8dev: fix memory leak can: mcba_usb_start(): add missing urb->transfer_dma initialization can: hi311x: fix a signedness bug in hi3110_cmd() MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver bpf: Fix leakage due to insufficient speculative store bypass mitigation bpf: Introduce BPF nospec instruction for mitigating Spectre v4 sis900: Fix missing pci_disable_device() in probe and remove net: let flow have same hash in two directions nfc: nfcsim: fix use after free during module unload tulip: windbond-840: Fix missing pci_disable_device() in probe and remove sctp: fix return value check in __sctp_rcv_asconf_lookup nfc: s3fwrn5: fix undefined parameter values in dev_err() net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32 net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev() net/mlx5: Unload device upon firmware fatal error net/mlx5e: Fix page allocation failure for ptp-RQ over SF net/mlx5e: Fix page allocation failure for trap-RQ over SF ...
2021-07-30Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull libata fixlets from Jens Axboe: - A fix for PIO highmem (Christoph) - Kill HAVE_IDE as it's now unused (Lukas) * tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block: arch: Kconfig: clean up obsolete use of HAVE_IDE libata: fix ata_pio_sector for CONFIG_HIGHMEM
2021-07-30kfence, x86: only define helpers if !MODULEMarco Elver
x86's <asm/tlbflush.h> only declares non-module accessible functions (such as flush_tlb_one_kernel) if !MODULE. In preparation of including <asm/kfence.h> from the KFENCE test module, only define the helpers if !MODULE to avoid breaking the build with CONFIG_KFENCE_KUNIT_TEST=m. Signed-off-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/YQJdarx6XSUQ1tFZ@elver.google.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-30arch: Kconfig: clean up obsolete use of HAVE_IDELukas Bulwahn
The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is supported. As IDE support and the HAVE_IDE config vanishes with commit b7fb14d3ac63 ("ide: remove the legacy ide driver"), there is no need to mention HAVE_IDE in all those arch-specific Kconfig files. The issue was identified with ./scripts/checkkconfigsymbols.py. Fixes: b7fb14d3ac63 ("ide: remove the legacy ide driver") Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-30KVM: x86: accept userspace interrupt only if no event is injectedPaolo Bonzini
Once an exception has been injected, any side effects related to the exception (such as setting CR2 or DR6) have been taked place. Therefore, once KVM sets the VM-entry interruption information field or the AMD EVENTINJ field, the next VM-entry must deliver that exception. Pending interrupts are processed after injected exceptions, so in theory it would not be a problem to use KVM_INTERRUPT when an injected exception is present. However, DOSEMU is using run->ready_for_interrupt_injection to detect interrupt windows and then using KVM_SET_SREGS/KVM_SET_REGS to inject the interrupt manually. For this to work, the interrupt window must be delayed after the completion of the previous event injection. Cc: stable@vger.kernel.org Reported-by: Stas Sergeev <stsp2@yandex.ru> Tested-by: Stas Sergeev <stsp2@yandex.ru> Fixes: 71cc849b7093 ("KVM: x86: Fix split-irqchip vs interrupt injection window request") Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-30asm-generic: reverse GENERIC_{STRNCPY_FROM,STRNLEN}_USER symbolsArnd Bergmann
Most architectures do not need a custom implementation, and in most cases the generic implementation is preferred, so change the polariy on these Kconfig symbols to require architectures to select them when they provide their own version. The new name is CONFIG_ARCH_HAS_{STRNCPY_FROM,STRNLEN}_USER. The remaining architectures at the moment are: ia64, mips, parisc, um and xtensa. We should probably convert these as well, but I was not sure how far to take this series. Thomas Bogendoerfer had some concerns about converting mips but may still do some more detailed measurements to see which version is better. Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Helge Deller <deller@gmx.de> # parisc Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-07-30crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementationTianjia Zhang
This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4 block cipher. Through two affine transforms, we can use the AES S-Box to simulate the SM4 S-Box to achieve the effect of instruction acceleration. The main algorithm implementation comes from SM4 AES-NI work by libgcrypt and Markku-Juhani O. Saarinen at: https://github.com/mjosaarinen/sm4ni This optimization supports the four modes of SM4, ECB, CBC, CFB, and CTR. Since CBC and CFB do not support multiple block parallel encryption, the optimization effect is not obvious. Benchmark on Intel Xeon Cascadelake, the data comes from the 218 mode and 518 mode of tcrypt. The abscissas are blocks of different lengths. The data is tabulated and the unit is Mb/s: sm4-generic | 16 64 128 256 1024 1420 4096 ECB enc | 40.99 46.50 48.05 48.41 49.20 49.25 49.28 ECB dec | 41.07 46.99 48.15 48.67 49.20 49.25 49.29 CBC enc | 37.71 45.28 46.77 47.60 48.32 48.37 48.40 CBC dec | 36.48 44.82 46.43 47.45 48.23 48.30 48.36 CFB enc | 37.94 44.84 46.12 46.94 47.57 47.46 47.68 CFB dec | 37.50 42.84 43.74 44.37 44.85 44.80 44.96 CTR enc | 39.20 45.63 46.75 47.49 48.09 47.85 48.08 CTR dec | 39.64 45.70 46.72 47.47 47.98 47.88 48.06 sm4-aesni-avx ECB enc | 33.75 134.47 221.64 243.43 264.05 251.58 258.13 ECB dec | 34.02 134.92 223.11 245.14 264.12 251.04 258.33 CBC enc | 38.85 46.18 47.67 48.34 49.00 48.96 49.14 CBC dec | 33.54 131.29 223.88 245.27 265.50 252.41 263.78 CFB enc | 38.70 46.10 47.58 48.29 49.01 48.94 49.19 CFB dec | 32.79 128.40 223.23 244.87 265.77 253.31 262.79 CTR enc | 32.58 122.23 220.29 241.16 259.57 248.32 256.69 CTR dec | 32.81 122.47 218.99 241.54 258.42 248.58 256.61 Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-07-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix MTE shared page detection - Enable selftest's use of PMU registers when asked to s390: - restore 5.13 debugfs names x86: - fix sizes for vcpu-id indexed arrays - fixes for AMD virtualized LAPIC (AVIC) - other small bugfixes Generic: - access tracking performance test - dirty_log_perf_test command line parsing fix - Fix selftest use of obsolete pthread_yield() in favour of sched_yield() - use cpu_relax when halt polling - fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: add missing compat KVM_CLEAR_DIRTY_LOG KVM: use cpu_relax when halt polling KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl KVM: SVM: tweak warning about enabled AVIC on nested entry KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated KVM: s390: restore old debugfs names KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized KVM: selftests: Introduce access_tracking_perf_test KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing x86/kvm: fix vcpu-id indexed array sizes KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK access docs: virt: kvm: api.rst: replace some characters KVM: Documentation: Fix KVM_CAP_ENFORCE_PV_FEATURE_CPUID name KVM: nSVM: Swap the parameter order for svm_copy_vmrun_state()/svm_copy_vmloadsave_state() KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state() KVM: arm64: selftests: get-reg-list: actually enable pmu regs in pmu sublist KVM: selftests: change pthread_yield to sched_yield KVM: arm64: Fix detection of shared VMAs on guest fault
2021-07-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2021-07-29 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 14 day(s) which contain a total of 20 files changed, 446 insertions(+), 138 deletions(-). The main changes are: 1) Fix UBSAN out-of-bounds splat for showing XDP link fdinfo, from Lorenz Bauer. 2) Fix insufficient Spectre v4 mitigation in BPF runtime, from Daniel Borkmann, Piotr Krysiuk and Benedict Schlueter. 3) Batch of fixes for BPF sockmap found under stress testing, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-29bpf: Introduce BPF nospec instruction for mitigating Spectre v4Daniel Borkmann
In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-28x86, prctl: Hook L1D flushing in via prctlBalbir Singh
Use the existing PR_GET/SET_SPECULATION_CTRL API to expose the L1D flush capability. For L1D flushing PR_SPEC_FORCE_DISABLE and PR_SPEC_DISABLE_NOEXEC are not supported. Enabling L1D flush does not check if the task is running on an SMT enabled core, rather a check is done at runtime (at the time of flush), if the task runs on a SMT sibling then the task is sent a SIGBUS which is executed before the task returns to user space or to a guest. This is better than the other alternatives of: a. Ensuring strict affinity of the task (hard to enforce without further changes in the scheduler) b. Silently skipping flush for tasks that move to SMT enabled cores. Hook up the core prctl and implement the x86 specific parts which in turn makes it functional. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-5-sblbir@amazon.com
2021-07-28x86/mm: Prepare for opt-in based L1D flush in switch_mm()Balbir Singh
The goal of this is to allow tasks that want to protect sensitive information, against e.g. the recently found snoop assisted data sampling vulnerabilites, to flush their L1D on being switched out. This protects their data from being snooped or leaked via side channels after the task has context switched out. This could also be used to wipe L1D when an untrusted task is switched in, but that's not a really well defined scenario while the opt-in variant is clearly defined. The mechanism is default disabled and can be enabled on the kernel command line. Prepare for the actual prctl based opt-in: 1) Provide the necessary setup functionality similar to the other mitigations and enable the static branch when the command line option is set and the CPU provides support for hardware assisted L1D flushing. Software based L1D flush is not supported because it's CPU model specific and not really well defined. This does not come with a sysfs file like the other mitigations because it is not bound to any specific vulnerability. Support has to be queried via the prctl(2) interface. 2) Add TIF_SPEC_L1D_FLUSH next to L1D_SPEC_IB so the two bits can be mangled into the mm pointer in one go which allows to reuse the existing mechanism in switch_mm() for the conditional IBPB speculation barrier efficiently. 3) Add the L1D flush specific functionality which flushes L1D when the outgoing task opted in. Also check whether the incoming task has requested L1D flush and if so validate that it is not accidentaly running on an SMT sibling as this makes the whole excercise moot because SMT siblings share L1D which opens tons of other attack vectors. If that happens schedule task work which signals the incoming task on return to user/guest with SIGBUS as this is part of the paranoid L1D flush contract. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
2021-07-28x86/process: Make room for TIF_SPEC_L1D_FLUSHBalbir Singh
The upcoming support for paranoid L1D flush in switch_mm() requires that TIF_SPEC_IB and the new TIF_SPEC_L1D_FLUSH are two consecutive bits in thread_info::flags. Move TIF_SPEC_FORCE_UPDATE to a spare bit to make room for the new one. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
2021-07-28x86/mm: Refactor cond_ibpb() to support other use casesBalbir Singh
cond_ibpb() has the necessary bits required to track the previous mm in switch_mm_irqs_off(). This can be reused for other use cases like L1D flushing on context switch. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-3-sblbir@amazon.com
2021-07-28x86/smp: Add a per-cpu view of SMT stateBalbir Singh
A new field smt_active in cpuinfo_x86 identifies if the current core/cpu is in SMT mode or not. This is helpful when the system has some of its cores with threads offlined and can be used for cases where action is taken based on the state of SMT. The upcoming support for paranoid L1D flush will make use of this information. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-2-sblbir@amazon.com
2021-07-27KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrlMaxim Levitsky
Currently when SVM is enabled in guest CPUID, AVIC is inhibited as soon as the guest CPUID is set. AVIC happens to be fully disabled on all vCPUs by the time any guest entry starts (if after migration the entry can be nested). The reason is that currently we disable avic right away on vCPU from which the kvm_request_apicv_update was called and for this case, it happens to be called on all vCPUs (by svm_vcpu_after_set_cpuid). After we stop doing this, AVIC will end up being disabled only when KVM_REQ_APICV_UPDATE is processed which is after we done switching to the nested guest. Fix this by just using vmcb01 in svm_refresh_apicv_exec_ctrl for avic (which is a right thing to do anyway). Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210713142023.106183-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-27KVM: SVM: tweak warning about enabled AVIC on nested entryMaxim Levitsky
It is possible that AVIC was requested to be disabled but not yet disabled, e.g if the nested entry is done right after svm_vcpu_after_set_cpuid. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210713142023.106183-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-27KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be ↵Maxim Levitsky
deactivated It is possible for AVIC inhibit and AVIC active state to be mismatched. Currently we disable AVIC right away on vCPU which started the AVIC inhibit request thus this warning doesn't trigger but at least in theory, if svm_set_vintr is called at the same time on multiple vCPUs, the warning can happen. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210713142023.106183-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-27KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initializedPaolo Bonzini
Right now, svm_hv_vmcb_dirty_nested_enlightenments has an incorrect dereference of vmcb->control.reserved_sw before the vmcb is checked for being non-NULL. The compiler is usually sinking the dereference after the check; instead of doing this ourselves in the source, ensure that svm_hv_vmcb_dirty_nested_enlightenments is only called with a non-NULL VMCB. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Untested for now due to issues with my AMD machine. - Paolo]
2021-07-27x86/kvm: fix vcpu-id indexed array sizesJuergen Gross
KVM_MAX_VCPU_ID is the maximum vcpu-id of a guest, and not the number of vcpu-ids. Fix array indexed by vcpu-id to have KVM_MAX_VCPU_ID+1 elements. Note that this is currently no real problem, as KVM_MAX_VCPU_ID is an odd number, resulting in always enough padding being available at the end of those arrays. Nevertheless this should be fixed in order to avoid rare problems in case someone is using an even number for KVM_MAX_VCPU_ID. Signed-off-by: Juergen Gross <jgross@suse.com> Message-Id: <20210701154105.23215-2-jgross@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-26KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK accessVitaly Kuznetsov
MSR_KVM_ASYNC_PF_ACK MSR is part of interrupt based asynchronous page fault interface and not the original (deprecated) KVM_FEATURE_ASYNC_PF. This is stated in Documentation/virt/kvm/msr.rst. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Oliver Upton <oupton@google.com> Message-Id: <20210722123018.260035-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-26KVM: nSVM: Swap the parameter order for ↵Vitaly Kuznetsov
svm_copy_vmrun_state()/svm_copy_vmloadsave_state() Make svm_copy_vmrun_state()/svm_copy_vmloadsave_state() interface match 'memcpy(dest, src)' to avoid any confusion. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210719090322.625277-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-26KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state()Vitaly Kuznetsov
To match svm_copy_vmrun_state(), rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state(). Opportunistically add missing braces to 'else' branch in vmload_vmsave_interception(). No functional change intended. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210716144104.465269-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-26Backmerge tag 'v5.14-rc3' into drm-nextDave Airlie
Linux 5.14-rc3 Daniel said we should pull the nouveau fix from fixes in here, probably a good plan. Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-25Merge tag 'locking-urgent-2021-07-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 jump label fix from Thomas Gleixner: "A single fix for jump labels to prevent the compiler from agressive un-inlining which results in a section mismatch" * tag 'locking-urgent-2021-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: jump_labels: Mark __jump_label_transform() as __always_inlined to work around aggressive compiler un-inlining
2021-07-23signal: Verify the alignment and size of siginfo_tEric W. Biederman
Update the static assertions about siginfo_t to also describe it's alignment and size. While investigating if it was possible to add a 64bit field into siginfo_t[1] it became apparent that the alignment of siginfo_t is as much a part of the ABI as the size of the structure. If the alignment changes siginfo_t when embedded in another structure can move to a different offset. Which is not acceptable from an ABI structure. So document that fact and add static assertions to notify developers if they change change the alignment by accident. [1] https://lkml.kernel.org/r/YJEZdhe6JGFNYlum@elver.google.com Acked-by: Marco Elver <elver@google.com> v1: https://lkml.kernel.org/r/20210505141101.11519-4-ebiederm@xmission.co Link: https://lkml.kernel.org/r/875yxaxmyl.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Conflicts are simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23compat: make linux/compat.h available everywhereArnd Bergmann
Parts of linux/compat.h are under an #ifdef, but we end up using more of those over time, moving things around bit by bit. To get it over with once and for all, make all of this file uncondititonal now so it can be accessed everywhere. There are only a few types left that are in asm/compat.h but not yet in the asm-generic version, so add those in the process. This requires providing a few more types in asm-generic/compat.h that were not already there. The only tricky one is compat_sigset_t, which needs a little help on 32-bit architectures and for x86. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23crypto: x86/aes-ni - add missing error checks in XTS codeArd Biesheuvel
The updated XTS code fails to check the return code of skcipher_walk_virt, which may lead to skcipher_walk_abort() or skcipher_walk_done() being called while the walk argument is in an inconsistent state. So check the return value after each such call, and bail on errors. Fixes: 2481104fe98d ("crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper") Reported-by: Dave Hansen <dave.hansen@intel.com> Reported-by: syzbot <syzbot+5d1bad8042a8f0e8117a@syzkaller.appspotmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-07-23Merge tag 'drm-misc-next-2021-07-22' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15-rc1: UAPI Changes: - Remove sysfs stats for dma-buf attachments, as it causes a performance regression. Previous merge is not in a rc kernel yet, so no userspace regression possible. Cross-subsystem Changes: - Sanitize user input in kyro's viewport ioctl. - Use refcount_t in fb_info->count - Assorted fixes to dma-buf. - Extend x86 efifb handling to all archs. - Fix neofb divide by 0. - Document corpro,gm7123 bridge dt bindings. Core Changes: - Slightly rework drm master handling. - Cleanup vgaarb handling. - Assorted fixes. Driver Changes: - Add support for ws2401 panel. - Assorted fixes to stm, ast, bochs. - Demidlayer ingenic irq. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2d0d2fe8-01fc-e216-c3fd-38db9e69944e@linux.intel.com
2021-07-22Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A pair of arm64 fixes for -rc3. The straightforward one is a fix to our firmware calling stub, which accidentally started corrupting the link register on machines with SVE. Since these machines don't really exist yet, it wasn't spotted in -next. The other fix is a revert-and-a-bit of a patch originally intended to allow PTE-level huge mappings for the VMAP area on 32-bit PPC 8xx. A side-effect of this change was that our pXd_set_huge() implementations could be replaced with generic dummy functions depending on the levels of page-table being used, which in turn broke the boot if we fail to create the linear mapping as a result of using these functions to operate on the pgd. Huge thanks to Michael Ellerman for modifying the revert so as not to regress PPC 8xx in terms of functionality. Anyway, that's the background and it's also available in the commit message along with Link tags pointing at all of the fun. Summary: - Fix hang when issuing SMC on SVE-capable system due to clobbered LR - Fix boot failure due to missing block mappings with folded page-table" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" arm64: smccc: Save lr before calling __arm_smccc_sve_check()
2021-07-22Merge tag 'hyperv-fixes-signed-20210722' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - bug fix from Haiyang for vmbus CPU assignment - revert of a bogus patch that went into 5.14-rc1 * tag 'hyperv-fixes-signed-20210722' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Revert "x86/hyperv: fix logical processor creation" Drivers: hv: vmbus: Fix duplicate CPU assignments within a device
2021-07-21Revert "x86/hyperv: fix logical processor creation"Wei Liu
This reverts commit 450605c28d571eddca39a65fdbc1338add44c6d9. Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-07-21Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"Jonathan Marek
This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf. c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") breaks arm64 in at least two ways for configurations where PUD or PMD folding occur: 1. We no longer install huge-vmap mappings and silently fall back to page-granular entries, despite being able to install block entries at what is effectively the PGD level. 2. If the linear map is backed with block mappings, these will now silently fail to be created in alloc_init_pud(), causing a panic early during boot. The pgtable selftests caught this, although a fix has not been forthcoming and Christophe is AWOL at the moment, so just revert the change for now to get a working -rc3 on which we can queue patches for 5.15. A simple revert breaks the build for 32-bit PowerPC 8xx machines, which rely on the default function definitions when the corresponding page-table levels are folded, since commit a6a8f7c4aa7e ("powerpc/8xx: add support for huge pages on VMAP and VMALLOC"), eg: powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range': linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge' To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in arch/powerpc/mm/nohash/8xx.c as suggested by Christophe. Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> [mpe: Fold in 8xx.c changes from Christophe and mention in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au Signed-off-by: Will Deacon <will@kernel.org>
2021-07-21drivers/firmware: move x86 Generic System Framebuffers supportJavier Martinez Canillas
The x86 architecture has generic support to register a system framebuffer platform device. It either registers a "simple-framebuffer" if the config option CONFIG_X86_SYSFB is enabled, or a legacy VGA/VBE/EFI FB device. But the code is generic enough to be reused by other architectures and can be moved out of the arch/x86 directory. This will allow to also support the simple{fb,drm} drivers on non-x86 EFI platforms, such as aarch64 where these drivers are only supported with DT. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210625130947.1803678-2-javierm@redhat.com
2021-07-19printk: Userspace format indexing supportChris Down
We have a number of systems industry-wide that have a subset of their functionality that works as follows: 1. Receive a message from local kmsg, serial console, or netconsole; 2. Apply a set of rules to classify the message; 3. Do something based on this classification (like scheduling a remediation for the machine), rinse, and repeat. As a couple of examples of places we have this implemented just inside Facebook, although this isn't a Facebook-specific problem, we have this inside our netconsole processing (for alarm classification), and as part of our machine health checking. We use these messages to determine fairly important metrics around production health, and it's important that we get them right. While for some kinds of issues we have counters, tracepoints, or metrics with a stable interface which can reliably indicate the issue, in order to react to production issues quickly we need to work with the interface which most kernel developers naturally use when developing: printk. Most production issues come from unexpected phenomena, and as such usually the code in question doesn't have easily usable tracepoints or other counters available for the specific problem being mitigated. We have a number of lines of monitoring defence against problems in production (host metrics, process metrics, service metrics, etc), and where it's not feasible to reliably monitor at another level, this kind of pragmatic netconsole monitoring is essential. As one would expect, monitoring using printk is rather brittle for a number of reasons -- most notably that the message might disappear entirely in a new version of the kernel, or that the message may change in some way that the regex or other classification methods start to silently fail. One factor that makes this even harder is that, under normal operation, many of these messages are never expected to be hit. For example, there may be a rare hardware bug which one wants to detect if it was to ever happen again, but its recurrence is not likely or anticipated. This precludes using something like checking whether the printk in question was printed somewhere fleetwide recently to determine whether the message in question is still present or not, since we don't anticipate that it should be printed anywhere, but still need to monitor for its future presence in the long-term. This class of issue has happened on a number of occasions, causing unhealthy machines with hardware issues to remain in production for longer than ideal. As a recent example, some monitoring around blk_update_request fell out of date and caused semi-broken machines to remain in production for longer than would be desirable. Searching through the codebase to find the message is also extremely fragile, because many of the messages are further constructed beyond their callsite (eg. btrfs_printk and other module-specific wrappers, each with their own functionality). Even if they aren't, guessing the format and formulation of the underlying message based on the aesthetics of the message emitted is not a recipe for success at scale, and our previous issues with fleetwide machine health checking demonstrate as much. This provides a solution to the issue of silently changed or deleted printks: we record pointers to all printk format strings known at compile time into a new .printk_index section, both in vmlinux and modules. At runtime, this can then be iterated by looking at <debugfs>/printk/index/<module>, which emits the following format, both readable by humans and able to be parsed by machines: $ head -1 vmlinux; shuf -n 5 vmlinux # <level[,flags]> filename:line function "format" <5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n" <4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n" <6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n" <6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n" <6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n" This mitigates the majority of cases where we have a highly-specific printk which we want to match on, as we can now enumerate and check whether the format changed or the printk callsite disappeared entirely in userspace. This allows us to catch changes to printks we monitor earlier and decide what to do about it before it becomes problematic. There is no additional runtime cost for printk callers or printk itself, and the assembly generated is exactly the same. Signed-off-by: Chris Down <chris@chrisdown.name> Cc: Petr Mladek <pmladek@suse.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h} Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
2021-07-16perf/x86/intel/uncore: Fix IIO cleanup mapping procedure for SNR/ICXAlexander Antonov
skx_iio_cleanup_mapping() is re-used for snr and icx, but in those cases it fails to use the appropriate XXX_iio_mapping_group and as such fails to free previously allocated resources, leading to memory leaks. Fixes: 10337e95e04c ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX") Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> [peterz: Changelog] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210706090723.41850-1-alexander.antonov@linux.intel.com
2021-07-16x86/hyperv: add comment describing TSC_INVARIANT_CONTROL MSR setting bit 0Ani Sinha
Commit dce7cd62754b5 ("x86/hyperv: Allow guests to enable InvariantTSC") added the support for HV_X64_MSR_TSC_INVARIANT_CONTROL. Setting bit 0 of this synthetic MSR will allow hyper-v guests to report invariant TSC CPU feature through CPUID. This comment adds this explanation to the code and mentions where the Intel's generic platform init code reads this feature bit from CPUID. The comment will help developers understand how the two parts of the initialization (hyperV specific and non-hyperV specific generic hw init) are related. Signed-off-by: Ani Sinha <ani@anisinha.ca> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210716133245.3272672-1-ani@anisinha.ca Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-07-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-07-15 The following pull-request contains BPF updates for your *net-next* tree. We've added 45 non-merge commits during the last 15 day(s) which contain a total of 52 files changed, 3122 insertions(+), 384 deletions(-). The main changes are: 1) Introduce bpf timers, from Alexei. 2) Add sockmap support for unix datagram socket, from Cong. 3) Fix potential memleak and UAF in the verifier, from He. 4) Add bpf_get_func_ip helper, from Jiri. 5) Improvements to generic XDP mode, from Kumar. 6) Support for passing xdp_md to XDP programs in bpf_prog_run, from Zvi. =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15bpf, x86: Store caller's ip in trampoline stackJiri Olsa
Storing caller's ip in trampoline's stack. Trampoline programs can reach the IP in (ctx - 8) address, so there's no change in program's arguments interface. The IP address is takes from [fp + 8], which is return address from the initial 'call fentry' call to trampoline. This IP address will be returned via bpf_get_func_ip helper helper, which is added in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-2-jolsa@kernel.org
2021-07-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: - Allow again loading KVM on 32-bit non-PAE builds - Fixes for host SMIs on AMD - Fixes for guest SMIs on AMD - Fixes for selftests on s390 and ARM - Fix memory leak - Enforce no-instrumentation area on vmentry when hardware breakpoints are in use. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: selftests: smm_test: Test SMM enter from L2 KVM: nSVM: Restore nested control upon leaving SMM KVM: nSVM: Fix L1 state corruption upon return from SMM KVM: nSVM: Introduce svm_copy_vmrun_state() KVM: nSVM: Check that VM_HSAVE_PA MSR was set before VMRUN KVM: nSVM: Check the value written to MSR_VM_HSAVE_PA KVM: SVM: Fix sev_pin_memory() error checks in SEV migration utilities KVM: SVM: Return -EFAULT if copy_to_user() for SEV mig packet header fails KVM: SVM: add module param to control the #SMI interception KVM: SVM: remove INIT intercept handler KVM: SVM: #SMI interception must not skip the instruction KVM: VMX: Remove vmx_msr_index from vmx.h KVM: X86: Disable hardware breakpoints unconditionally before kvm_x86->run() KVM: selftests: Address extra memslot parameters in vm_vaddr_alloc kvm: debugfs: fix memory leak in kvm_create_vm_debugfs KVM: x86/pmu: Clear anythread deprecated bit when 0xa leaf is unsupported on the SVM KVM: mmio: Fix use-after-free Read in kvm_vm_ioctl_unregister_coalesced_mmio KVM: SVM: Revert clearing of C-bit on GPA in #NPF handler KVM: x86/mmu: Do not apply HPA (memory encryption) mask to GPAs KVM: x86: Use kernel's x86_phys_bits to handle reduced MAXPHYADDR ...