summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-21drm/panfrost: Add the panfrost_gem_mapping conceptBoris Brezillon
With the introduction of per-FD address space, the same BO can be mapped in different address space if the BO is globally visible (GEM_FLINK) and opened in different context or if the dmabuf is self-imported. The current implementation does not take case into account, and attaches the mapping directly to the panfrost_gem_object. Let's create a panfrost_gem_mapping struct and allow multiple mappings per BO. The mappings are refcounted which helps solve another problem where mappings were torn down (GEM handle closed by userspace) while GPU jobs accessing those BOs were still in-flight. Jobs now keep a reference on the mappings they use. v2 (robh): - Minor review comment clean-ups from Steven - Use list_is_singular helper - Just WARN if we add a mapping when madvise state is not WILLNEED. With that, drop the use of object_name_lock. v3 (robh): - Revert returning list iterator in panfrost_gem_mapping_get() Fixes: a5efb4c9a562 ("drm/panfrost: Restructure the GEM object creation") Fixes: 7282f7645d06 ("drm/panfrost: Implement per FD address spaces") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116021554.15090-1-robh@kernel.org
2020-01-21arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'Dirk Behme
Since v4.3-rc1 commit 0723c05fb75e44 ("arm64: enable more compressed Image formats"), it is possible to build Image.{bz2,lz4,lzma,lzo} AArch64 images. However, the commit missed adding support for removing those images on 'make ARCH=arm64 (dist)clean'. Fix this by adding them to the target list. Make sure to match the order of the recipes in the makefile. Cc: stable@vger.kernel.org # v4.3+ Fixes: 0723c05fb75e44 ("arm64: enable more compressed Image formats") Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21fs/reiserfs: remove unused macrosAlex Shi
these macros are never used from introduced. better to remove them. Link: https://lore.kernel.org/r/1579602338-57079-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Bharath Vedartham <linux.bhar@gmail.com> Cc: Hariprasad Kelam <hariprasad.kelam@gmail.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: zhengbin <zhengbin13@huawei.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: reiserfs-devel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-21fs/quota: remove unused macroAlex Shi
__QUOTA_V2_PARANOIA macro is never used. better to remove it. Link: https://lore.kernel.org/r/1579602334-57039-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Jan Kara <jack@suse.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-21net, ip_tunnel: fix namespaces moveWilliam Dauchy
in the same manner as commit 690afc165bb3 ("net: ip6_gre: fix moving ip6gre between namespaces"), fix namespace moving as it was broken since commit 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata."). Indeed, the ip6_gre commit removed the local flag for collect_md condition, so there is no reason to keep it for ip_gre/ip_tunnel. this patch will fix both ip_tunnel and ip_gre modules. Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: William Dauchy <w.dauchy@criteo.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21kvm: Refactor handling of VM debugfs filesMilan Pandurov
We can store reference to kvm_stats_debugfs_item instead of copying its values to kvm_stat_data. This allows us to remove duplicated code and usage of temporary kvm_stat_data inside vm_stat_get et al. Signed-off-by: Milan Pandurov <milanpa@amazon.de> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVMSean Christopherson
Remove the bogus 64-bit only condition from the check that disables MMIO spte optimization when the system supports the max PA, i.e. doesn't have any reserved PA bits. 32-bit KVM always uses PAE paging for the shadow MMU, and per Intel's SDM: PAE paging translates 32-bit linear addresses to 52-bit physical addresses. The kernel's restrictions on max physical addresses are limits on how much memory the kernel can reasonably use, not what physical addresses are supported by hardware. Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: nVMX: vmread should not set rflags to specify success in case of #PFMiaohe Lin
In case writing to vmread destination operand result in a #PF, vmread should not call nested_vmx_succeed() to set rflags to specify success. Similar to as done in VMPTRST (See handle_vmptrst()). Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checksSean Christopherson
Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize the checks as much as possible. Move the check to a separate helper, __is_bad_mt_xwr(), which allows the guest_rsvd_check usage in paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is always zero for non-nEPT) while retaining the bitwise-OR of the current code for the shadow_zero_check in walk_shadow_page_get_mmio_spte(). Add a comment for the bitwise-OR usage in the mmio spte walk to avoid future attempts to "fix" the code, which is what prompted this optimization in the first place[*]. Opportunistically remove the superfluous '!= 0' and parantheses, and use BIT_ULL() instead of open coding its equivalent. The net effect is that code generation is largely unchanged for walk_shadow_page_get_mmio_spte(), marginally better for ept_prefetch_invalid_gpte(), and significantly improved for paging32/64_prefetch_invalid_gpte(). Note, walk_shadow_page_get_mmio_spte() can't use a templated version of the memtype/XRW as it works on the host's shadow PTEs, e.g. checks that KVM hasn't borked its EPT tables. Even if it could be templated, the benefits of having a single implementation far outweight the few uops that would be saved for NPT or non-TDP paging, e.g. most compilers inline it all the way to up kvm_mmu_page_fault(). [*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com Cc: Jim Mattson <jmattson@google.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte()Sean Christopherson
Move the !PRESENT and !ACCESSED checks in FNAME(prefetch_invalid_gpte) above the call to is_rsvd_bits_set(). For a well behaved guest, the !PRESENT and !ACCESSED are far more likely to evaluate true than the reserved bit checks, and they do not require additional memory accesses. Before: Dump of assembler code for function paging32_prefetch_invalid_gpte: 0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5> 0x0000000000044245 <+5>: mov %rcx,%rax 0x0000000000044248 <+8>: shr $0x7,%rax 0x000000000004424c <+12>: and $0x1,%eax 0x000000000004424f <+15>: lea 0x0(,%rax,4),%r8 0x0000000000044257 <+23>: add %r8,%rax 0x000000000004425a <+26>: mov %rcx,%r8 0x000000000004425d <+29>: and 0x120(%rsi,%rax,8),%r8 0x0000000000044265 <+37>: mov 0x170(%rsi),%rax 0x000000000004426c <+44>: shr %cl,%rax 0x000000000004426f <+47>: and $0x1,%eax 0x0000000000044272 <+50>: or %rax,%r8 0x0000000000044275 <+53>: jne 0x4427c <paging32_prefetch_invalid_gpte+60> 0x0000000000044277 <+55>: test $0x1,%cl 0x000000000004427a <+58>: jne 0x4428a <paging32_prefetch_invalid_gpte+74> 0x000000000004427c <+60>: mov %rdx,%rsi 0x000000000004427f <+63>: callq 0x44080 <drop_spte> 0x0000000000044284 <+68>: mov $0x1,%eax 0x0000000000044289 <+73>: retq 0x000000000004428a <+74>: xor %eax,%eax 0x000000000004428c <+76>: and $0x20,%ecx 0x000000000004428f <+79>: jne 0x44289 <paging32_prefetch_invalid_gpte+73> 0x0000000000044291 <+81>: mov %rdx,%rsi 0x0000000000044294 <+84>: callq 0x44080 <drop_spte> 0x0000000000044299 <+89>: mov $0x1,%eax 0x000000000004429e <+94>: jmp 0x44289 <paging32_prefetch_invalid_gpte+73> End of assembler dump. After: Dump of assembler code for function paging32_prefetch_invalid_gpte: 0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5> 0x0000000000044245 <+5>: test $0x1,%cl 0x0000000000044248 <+8>: je 0x4424f <paging32_prefetch_invalid_gpte+15> 0x000000000004424a <+10>: test $0x20,%cl 0x000000000004424d <+13>: jne 0x4425d <paging32_prefetch_invalid_gpte+29> 0x000000000004424f <+15>: mov %rdx,%rsi 0x0000000000044252 <+18>: callq 0x44080 <drop_spte> 0x0000000000044257 <+23>: mov $0x1,%eax 0x000000000004425c <+28>: retq 0x000000000004425d <+29>: mov %rcx,%rax 0x0000000000044260 <+32>: mov (%rsi),%rsi 0x0000000000044263 <+35>: shr $0x7,%rax 0x0000000000044267 <+39>: and $0x1,%eax 0x000000000004426a <+42>: lea 0x0(,%rax,4),%r8 0x0000000000044272 <+50>: add %r8,%rax 0x0000000000044275 <+53>: mov %rcx,%r8 0x0000000000044278 <+56>: and 0x120(%rsi,%rax,8),%r8 0x0000000000044280 <+64>: mov 0x170(%rsi),%rax 0x0000000000044287 <+71>: shr %cl,%rax 0x000000000004428a <+74>: and $0x1,%eax 0x000000000004428d <+77>: mov %rax,%rcx 0x0000000000044290 <+80>: xor %eax,%eax 0x0000000000044292 <+82>: or %rcx,%r8 0x0000000000044295 <+85>: je 0x4425c <paging32_prefetch_invalid_gpte+28> 0x0000000000044297 <+87>: mov %rdx,%rsi 0x000000000004429a <+90>: callq 0x44080 <drop_spte> 0x000000000004429f <+95>: mov $0x1,%eax 0x00000000000442a4 <+100>: jmp 0x4425c <paging32_prefetch_invalid_gpte+28> End of assembler dump. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: SVM: Override default MMIO mask if memory encryption is enabledTom Lendacky
The KVM MMIO support uses bit 51 as the reserved bit to cause nested page faults when a guest performs MMIO. The AMD memory encryption support uses a CPUID function to define the encryption bit position. Given this, it is possible that these bits can conflict. Use svm_hardware_setup() to override the MMIO mask if memory encryption support is enabled. Various checks are performed to ensure that the mask is properly defined and rsvd_bits() is used to generate the new mask (as was done prior to the change that necessitated this patch). Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: vmx: delete meaningless nested_vmx_prepare_msr_bitmap() declarationMiaohe Lin
The function nested_vmx_prepare_msr_bitmap() declaration is below its implementation. So this is meaningless and should be removed. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Refactor and rename bit() to feature_bit() macroSean Christopherson
Rename bit() to __feature_bit() to give it a more descriptive name, and add a macro, feature_bit(), to stuff the X68_FEATURE_ prefix to keep line lengths manageable for code that hardcodes the bit to be retrieved. No functional change intended. Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Expand build-time assertion on reverse CPUID usageSean Christopherson
Add build-time checks to ensure KVM isn't trying to do a reverse CPUID lookup on Linux-defined feature bits, along with comments to explain the gory details of X86_FEATUREs and bit(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add CPUID_7_1_EAX to the reverse CPUID tableSean Christopherson
Add an entry for CPUID_7_1_EAX in the reserve_cpuid array in preparation for incorporating the array in bit() build-time assertions, specifically to avoid an assertion on F(AVX512_BF16) in do_cpuid_7_mask(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Move bit() helper to cpuid.hSean Christopherson
Move bit() to cpuid.h in preparation for incorporating the reverse_cpuid array in bit() build-time assertions. Opportunistically use the BIT() macro instead of open-coding the shift. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add dedicated emulator helpers for querying CPUID featuresSean Christopherson
Add feature-specific helpers for querying guest CPUID support from the emulator instead of having the emulator do a full CPUID and perform its own bit tests. The primary motivation is to eliminate the emulator's usage of bit() so that future patches can add more extensive build-time assertions on the usage of bit() without having to expose yet more code to the emulator. Note, providing a generic guest_cpuid_has() to the emulator doesn't work due to the existing built-time assertions in guest_cpuid_has(), which require the feature being checked to be a compile-time constant. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add macro to ensure reserved cr4 bits checks stay in syncSean Christopherson
Add a helper macro to generate the set of reserved cr4 bits for both host and guest to ensure that adding a check on guest capabilities is also added for host capabilities, and vice versa. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Drop special XSAVE handling from guest_cpuid_has()Sean Christopherson
Now that KVM prevents setting host-reserved CR4 bits, drop the dedicated XSAVE check in guest_cpuid_has() in favor of open coding similar checks in the SVM/VMX XSAVES enabling flows. Note, checking boot_cpu_has(X86_FEATURE_XSAVE) in the XSAVES flows is technically redundant with respect to the CR4 reserved bit checks, e.g. XSAVES #UDs if CR4.OSXSAVE=0 and arch.xsaves_enabled is consumed if and only if CR4.OXSAVE=1 in guest. Keep (add?) the explicit boot_cpu_has() checks to help document KVM's usage of arch.xsaves_enabled. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Ensure all logical CPUs have consistent reserved cr4 bitsSean Christopherson
Check the current CPU's reserved cr4 bits against the mask calculated for the boot CPU to ensure consistent behavior across all CPUs. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Don't let userspace set host-reserved cr4 bitsSean Christopherson
Calculate the host-reserved cr4 bits at runtime based on the system's capabilities (using logic similar to __do_cpuid_func()), and use the dynamically generated mask for the reserved bit check in kvm_set_cr4() instead using of the static CR4_RESERVED_BITS define. This prevents userspace from "enabling" features in cr4 that are not supported by the system, e.g. by ignoring KVM_GET_SUPPORTED_CPUID and specifying a bogus CPUID for the vCPU. Allowing userspace to set unsupported bits in cr4 can lead to a variety of undesirable behavior, e.g. failed VM-Enter, and in general increases KVM's attack surface. A crafty userspace can even abuse CR4.LA57 to induce an unchecked #GP on a WRMSR. On a platform without LA57 support: KVM_SET_CPUID2 // CPUID_7_0_ECX.LA57 = 1 KVM_SET_SREGS // CR4.LA57 = 1 KVM_SET_MSRS // KERNEL_GS_BASE = 0x0004000000000000 KVM_RUN leads to a #GP when writing KERNEL_GS_BASE into hardware: unchecked MSR access error: WRMSR to 0xc0000102 (tried to write 0x0004000000000000) at rIP: 0xffffffffa00f239a (vmx_prepare_switch_to_guest+0x10a/0x1d0 [kvm_intel]) Call Trace: kvm_arch_vcpu_ioctl_run+0x671/0x1c70 [kvm] kvm_vcpu_ioctl+0x36b/0x5d0 [kvm] do_vfs_ioctl+0xa1/0x620 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4c/0x170 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc08133bf47 Note, the above sequence fails VM-Enter due to invalid guest state. Userspace can allow VM-Enter to succeed (after the WRMSR #GP) by adding a KVM_SET_SREGS w/ CR4.LA57=0 after KVM_SET_MSRS, in which case KVM will technically leak the host's KERNEL_GS_BASE into the guest. But, as KERNEL_GS_BASE is a userspace-defined value/address, the leak is largely benign as a malicious userspace would simply be exposing its own data to the guest, and attacking a benevolent userspace would require multiple bugs in the userspace VMM. Cc: stable@vger.kernel.org Cc: Jun Nakajima <jun.nakajima@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: Add helper to consolidate up PT/RTIT WRMSR fault logicSean Christopherson
Add a helper to consolidate the common checks for writing PT MSRs, and opportunistically clean up the formatting of the affected code. No functional change intended. Cc: Chao Peng <chao.p.peng@linux.intel.com> Cc: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: Add non-canonical check on writes to RTIT address MSRsSean Christopherson
Reject writes to RTIT address MSRs if the data being written is a non-canonical address as the MSRs are subject to canonical checks, e.g. KVM will trigger an unchecked #GP when loading the values to hardware during pt_guest_enter(). Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some writing mistakesMiaohe Lin
Fix some writing mistakes in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: hyperv: Fix some typos in vcpu unimpl infoMiaohe Lin
Fix some typos in vcpu unimpl info. It should be unhandled rather than uhandled. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some grammar mistakesMiaohe Lin
Fix some grammar mistakes in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some comment typos and missing parenthesesMiaohe Lin
Fix some typos and add missing parentheses in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some out-dated function names in commentMiaohe Lin
Since commit b1346ab2afbe ("KVM: nVMX: Rename prepare_vmcs02_*_full to prepare_vmcs02_*_rare"), prepare_vmcs02_full has been renamed to prepare_vmcs02_rare. nested_vmx_merge_msr_bitmap is renamed to nested_vmx_prepare_msr_bitmap since commit c992384bde84 ("KVM: vmx: speed up MSR bitmap merge"). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some wrong function names in commentMiaohe Lin
Fix some wrong function names in comment. mmu_check_roots is a typo for mmu_check_root, vmcs_read_any should be vmcs12_read_any and so on. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: check kvm_pit outside kvm_vm_ioctl_reinject()Miaohe Lin
check kvm_pit outside kvm_vm_ioctl_reinject() to keep codestyle consistent with other kvm_pit func and prepare for futher cleanups. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: LAPIC: micro-optimize fixed mode ipi deliveryWanpeng Li
This patch optimizes redundancy logic before fixed mode ipi is delivered in the fast path, broadcast handling needs to go slow path, so the delivery mode repair can be delayed to before slow path. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpathWanpeng Li
ICR and TSCDEADLINE MSRs write cause the main MSRs write vmexits in our product observation, multicast IPIs are not as common as unicast IPI like RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc. This patch introduce a mechanism to handle certain performance-critical WRMSRs in a very early stage of KVM VMExit handler. This mechanism is specifically used for accelerating writes to x2APIC ICR that attempt to send a virtual IPI with physical destination-mode, fixed delivery-mode and single target. Which was found as one of the main causes of VMExits for Linux workloads. The reason this mechanism significantly reduce the latency of such virtual IPIs is by sending the physical IPI to the target vCPU in a very early stage of KVM VMExit handler, before host interrupts are enabled and before expensive operations such as reacquiring KVM’s SRCU lock. Latency is reduced even more when KVM is able to use APICv posted-interrupt mechanism (which allows to deliver the virtual IPI directly to target vCPU without the need to kick it to host). Testing on Xeon Skylake server: The virtual IPI latency from sender send to receiver receive reduces more than 200+ cpu cycles. Reviewed-by: Liran Alon <liran.alon@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21sparc/console: kill off obsolete declarationsArvind Sankar
commit 09d3f3f0e02c ("sparc: Kill PROM console driver.") missed removing the declarations of the deleted prom_con structure and prom_con_init function from console.h. Kill them off now. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21sparc32: fix struct ipc64_perm type definitionArnd Bergmann
As discussed in the strace issue tracker, it appears that the sparc32 sysvipc support has been broken for the past 11 years. It was however working in compat mode, which is how it must have escaped most of the regular testing. The problem is that a cleanup patch inadvertently changed the uid/gid fields in struct ipc64_perm from 32-bit types to 16-bit types in uapi headers. Both glibc and uclibc-ng still use the original types, so they should work fine with compat mode, but not natively. Change the definitions to use __kernel_uid32_t and __kernel_gid32_t again. Fixes: 83c86984bff2 ("sparc: unify ipcbuf.h") Link: https://github.com/strace/strace/issues/116 Cc: <stable@vger.kernel.org> # v2.6.29 Cc: Sam Ravnborg <sam@ravnborg.org> Cc: "Dmitry V . Levin" <ldv@altlinux.org> Cc: Rich Felker <dalias@libc.org> Cc: libc-alpha@sourceware.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21sparc32, leon: Stop adding vendor and device id to prom ambapp path componentsAndreas Larsson
These extra fields before the @ are not handled in of_node_name_eq, making commit b3e46d1a0590500335f0b95e669ad6d84b12b03a break node name comparisons for ambapp path components, thereby making LEON systems unable to boot. As there is no need for the tacked on vendor and device ID fields in the path component, resolve this situation by removing them. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-01-21 1) Add support for TCP encapsulation of IKE and ESP messages, as defined by RFC 8229. Patchset from Sabrina Dubroca. Please note that there is a merge conflict in: net/unix/af_unix.c between commit: 3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo") from the net-next tree and commit: b50b0580d27b ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram") from the ipsec-next tree. The conflict can be solved as done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21drivers: net: declance: fix comparing pointer to 0Chen Zhou
Fixes coccicheck warning: ./drivers/net/ethernet/amd/declance.c:611:14-15: WARNING comparing pointer to 0 Replace "skb == 0" with "!skb". Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21tcp/ipv4: remove AF_INET_FAMILYAlex Shi
After commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") the macro isn't used anymore. remove it. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net/hsr: remove seq_nr_after_or_eqAlex Shi
It's never used after introduced. So maybe better to remove. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Arvid Brodin <arvid.brodin@alten.se> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21hdlx_x25: Fix backwards compat test.David S. Miller
drivers/net/wan/hdlc_x25.c: In function ‘x25_ioctl’: drivers/net/wan/hdlc_x25.c:256:7: warning: suggest parentheses around assignment used as truth value [-Wparentheses] 256 | if (ifr->ifr_settings.size = 0) { | ^~~ Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21Merge branch 'hns3-next'David S. Miller
Huazhong Tan says: ==================== net: hns3: misc updates for -net-next This series includes some misc updates for the HNS3 ethernet driver. [patch 1] adds a limitation for the error log in the hns3_clean_tx_ring(). [patch 2] adds a check for pfmemalloc flag before reusing pages since these pages may be used some special case. [patch 3] assigns a default reset type 'HNAE3_NONE_RESET' to VF's reset_type after initializing or reset. [patch 4] unifies macro HCLGE_DFX_REG_TYPE_CNT's definition into header file. [patch 5] refines the parameter 'size' of snprintf() in the hns3_init_module(). [patch 6] rewrites a debug message in hclge_put_vector(). [patch 7~9] adds some cleanups related to coding style. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: cleanup some coding style issueHuazhong Tan
This patch removes some unnecessary return value assignments, some duplicated printing in the caller, refines the judgment of 0 and uses le16_to_cpu to replace __le16_to_cpu. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: remove redundant print on ENOMEMHuazhong Tan
All kmalloc-based functions print enough information on failures. So this patch removes the log in hclge_get_dfx_reg() when returns ENOMEM. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: delete unnecessary blank line and space for cleanupGuangbin Huang
This patch deletes some unnecessary blank lines and spaces to clean up code, and in hclgevf_set_vlan_filter() moves the comment to the front of hclgevf_send_mbx_msg(). Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: rewrite a log in hclge_put_vector()Yonglong Liu
When gets vector fails, hclge_put_vector() should print out the vector instead of vector_id in the log and return the wrong vector_id to its caller. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: refine the input parameter 'size' for snprintf()Guojia Liao
The function snprintf() writes at most size bytes (including the terminating null byte ('\0') to str. Now, We can guarantee that the parameter of size is lager than the length of str to be formatting including its terminating null byte. So it's unnecessary to minus 1 for the input parameter 'size'. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: move duplicated macro definition into headerGuojia Liao
Macro HCLGE_GET_DFX_REG_TYPE_CNT in hclge_dbg_get_dfx_bd_num() and macro HCLGE_DFX_REG_BD_NUM in hclge_get_dfx_reg_bd_num() have the same meaning, so just defines HCLGE_GET_DFX_REG_TYPE_CNT in hclge_main.h. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: set VF's default reset_type to HNAE3_NONE_RESETHuazhong Tan
reset_type means what kind of reset the driver is handling now, so after initializing or reset, the reset_type of VF should be set to HNAE3_NONE_RESET, otherwise, this unknown default value may be a little misleading when the device is running. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: do not reuse pfmemalloc pagesYunsheng Lin
HNS3 driver allocates pages for DMA with dev_alloc_pages(), which calls alloc_pages_node() with the __GFP_MEMALLOC flag. So, in case of OOM condition, HNS3 can get pages with pfmemalloc flag set. So do not reuse the pages with pfmemalloc flag set because those pages are reserved for special cases, such as low memory case. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: hns3: limit the error logging in the hns3_clean_tx_ring()Yunsheng Lin
The error log printed by netdev_err() in the hns3_clean_tx_ring() may spam the kernel log. This patch uses hns3_rl_err() to ratelimit the error log in the hns3_clean_tx_ring(). Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>