summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-06-09irqchip/gic-v3: Fix refcount leak in gic_populate_ppi_partitionsMiaoqian Lin
of_find_node_by_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: e3825ba1af3a ("irqchip/gic-v3: Add support for partitioned PPIs") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220601080930.31005-6-linmq006@gmail.com
2022-06-09irqchip/gic-v3: Fix error handling in gic_populate_ppi_partitionsMiaoqian Lin
of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. When kcalloc fails, it missing of_node_put() and results in refcount leak. Fix this by goto out_put_node label. Fixes: 52085d3f2028 ("irqchip/gic-v3: Dynamically allocate PPI partition descriptors") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220601080930.31005-5-linmq006@gmail.com
2022-06-09irqchip/apple-aic: Fix refcount leak in aic_of_ic_initMiaoqian Lin
of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: a5e8801202b3 ("irqchip/apple-aic: Parse FIQ affinities from device-tree") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220601080930.31005-4-linmq006@gmail.com
2022-06-09irqchip/apple-aic: Fix refcount leak in build_fiq_affinityMiaoqian Lin
of_find_node_by_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: a5e8801202b3 ("irqchip/apple-aic: Parse FIQ affinities from device-tree") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220601080930.31005-3-linmq006@gmail.com
2022-06-09irqchip/gic/realview: Fix refcount leak in realview_gic_of_initMiaoqian Lin
of_find_matching_node_and_match() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: 82b0a434b436 ("irqchip/gic/realview: Support more RealView DCC variants") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220601080930.31005-2-linmq006@gmail.com
2022-06-09irqchip/xilinx: Remove microblaze+zynq dependencyJamie Iles
The Xilinx IRQ controller doesn't really have any architecture dependencies - it's a generic AXI component that can be used for any FPGA core from Zynq hard processor systems to microblaze+riscv soft cores and more. Signed-off-by: Jamie Iles <jamie@jamieiles.com> Acked-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220606213952.298686-1-jamie@jamieiles.com
2022-06-09docs: Move the HTE documentation to driver-api/Jonathan Corbet
The hardware timestamp engine documentation is driver API material, and really belongs in the driver-API book; move it there. Cc: Thierry Reding <treding@nvidia.com> Acked-by: Dipen Patel <dipenp@nvidia.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-06-09iavf: Fix issue with MAC address of VF shown as zeroMichal Wilczynski
After reinitialization of iavf, ice driver gets VIRTCHNL_OP_ADD_ETH_ADDR message with incorrectly set type of MAC address. Hardware address should have is_primary flag set as true. This way ice driver knows what it has to set as a MAC address. Check if the address is primary in iavf_add_filter function and set flag accordingly. To test set all-zero MAC on a VF. This triggers iavf re-initialization and VIRTCHNL_OP_ADD_ETH_ADDR message gets sent to PF. For example: ip link set dev ens785 vf 0 mac 00:00:00:00:00:00 This triggers re-initialization of iavf. New MAC should be assigned. Now check if MAC is non-zero: ip link show dev ens785 Fixes: a3e839d539e0 ("iavf: Add usage of new virtchnl format to set default MAC") Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09i40e: Fix call trace in setup_tx_descriptorsAleksandr Loktionov
After PF reset and ethtool -t there was call trace in dmesg sometimes leading to panic. When there was some time, around 5 seconds, between reset and test there were no errors. Problem was that pf reset calls i40e_vsi_close in prep_for_reset and ethtool -t calls i40e_vsi_close in diag_test. If there was not enough time between those commands the second i40e_vsi_close starts before previous i40e_vsi_close was done which leads to crash. Add check to diag_test if pf is in reset and don't start offline tests if it is true. Add netif_info("testing failed") into unhappy path of i40e_diag_test() Fixes: e17bc411aea8 ("i40e: Disable offline diagnostics if VFs are enabled") Fixes: 510efb2682b3 ("i40e: Fix ethtool offline diagnostic with netqueues") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09i40e: Fix calculating the number of queue pairsGrzegorz Szczurek
If ADQ is enabled for a VF, then actual number of queue pair is a number of currently available traffic classes for this VF. Without this change the configuration of the Rx/Tx queues fails with error. Fixes: d29e0d233e0d ("i40e: missing input validation on VF message handling by the PF") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09i40e: Fix adding ADQ filter to TC0Grzegorz Szczurek
Procedure of configure tc flower filters erroneously allows to create filters on TC0 where unfiltered packets are also directed by default. Issue was caused by insufficient checks of hw_tc parameter specifying the hardware traffic class to pass matching packets to. Fix checking hw_tc parameter which blocks creation of filters on TC0. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09docs: usb: fix literal block marker in usbmon verification exampleJustin Swartz
The "Verify that bus sockets are present" example was not properly formatted due to a typo in the literal block marker. Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za> Link: https://lore.kernel.org/r/20220604155431.23246-1-justin.swartz@risingedge.co.za Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-06-09Documentation/features: Update the arch support status filesZheng Zengkai
The arch support status files don't match reality as of v5.19-rc1, use the features-refresh.sh to refresh all the arch-support.txt files in place. The main effect is to add entries for the new loong architecture. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Link: https://lore.kernel.org/r/20220609025656.143460-1-zhengzengkai@huawei.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-06-09genirq: PM: Use runtime PM for chained interruptsMarc Zyngier
When requesting an interrupt, we correctly call into the runtime PM framework to guarantee that the underlying interrupt controller is up and running. However, we fail to do so for chained interrupt controllers, as the mux interrupt is not requested along the same path. Augment __irq_do_set_handler() to call into the runtime PM code in this case, making sure the PM flow is the same for all interrupts. Reported-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/26973cddee5f527ea17184c0f3fccb70bc8969a0.camel@pengutronix.de
2022-06-09KVM: selftests: Restrict test region to 48-bit physical addresses when using ↵David Matlack
nested The selftests nested code only supports 4-level paging at the moment. This means it cannot map nested guest physical addresses with more than 48 bits. Allow perf_test_util nested mode to work on hosts with more than 48 physical addresses by restricting the guest test region to 48-bits. While here, opportunistically fix an off-by-one error when dealing with vm_get_max_gfn(). perf_test_util.c was treating this as the maximum number of GFNs, rather than the maximum allowed GFN. This didn't result in any correctness issues, but it did end up shifting the test region down slightly when using huge pages. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-12-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2David Matlack
Add an option to dirty_log_perf_test that configures the vCPUs to run in L2 instead of L1. This makes it possible to benchmark the dirty logging performance of nested virtualization, which is particularly interesting because KVM must shadow L1's EPT/NPT tables. For now this support only works on x86_64 CPUs with VMX. Otherwise passing -n results in the test being skipped. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-11-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Clean up LIBKVM files in MakefileDavid Matlack
Break up the long lines for LIBKVM and alphabetize each architecture. This makes reading the Makefile easier, and will make reading diffs to LIBKVM easier. No functional change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-10-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Link selftests directly with lib object filesDavid Matlack
The linker does obey strong/weak symbols when linking static libraries, it simply resolves an undefined symbol to the first-encountered symbol. This means that defining __weak arch-generic functions and then defining arch-specific strong functions to override them in libkvm will not always work. More specifically, if we have: lib/generic.c: void __weak foo(void) { pr_info("weak\n"); } void bar(void) { foo(); } lib/x86_64/arch.c: void foo(void) { pr_info("strong\n"); } And a selftest that calls bar(), it will print "weak". Now if you make generic.o explicitly depend on arch.o (e.g. add function to arch.c that is called directly from generic.c) it will print "strong". In other words, it seems that the linker is free to throw out arch.o when linking because generic.o does not explicitly depend on it, which causes the linker to lose the strong symbol. One solution is to link libkvm.a with --whole-archive so that the linker doesn't throw away object files it thinks are unnecessary. However that is a bit difficult to plumb since we are using the common selftests makefile rules. An easier solution is to drop libkvm.a just link selftests with all the .o files that were originally in libkvm.a. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-9-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Drop unnecessary rule for STATIC_LIBSDavid Matlack
Drop the "all: $(STATIC_LIBS)" rule. The KVM selftests already depend on $(STATIC_LIBS), so there is no reason to have an extra "all" rule. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-8-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Add a helper to check EPT/VPID capabilitiesDavid Matlack
Create a small helper function to check if a given EPT/VPID capability is supported. This will be re-used in a follow-up commit to check for 1G page support. No functional change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.hDavid Matlack
This is a VMX-related macro so move it to vmx.h. While here, open code the mask like the rest of the VMX bitmask macros. No functional change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-6-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Refactor nested_map() to specify target levelDavid Matlack
Refactor nested_map() to specify that it explicityl wants 4K mappings (the existing behavior) and push the implementation down into __nested_map(), which can be used in subsequent commits to create huge page mappings. No function change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Drop stale function parameter comment for nested_map()David Matlack
nested_map() does not take a parameter named eptp_memslot. Drop the comment referring to it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Add option to create 2M and 1G EPT mappingsDavid Matlack
The current EPT mapping code in the selftests only supports mapping 4K pages. This commit extends that support with an option to map at 2M or 1G. This will be used in a future commit to create large page mappings to test eager page splitting. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: selftests: Replace x86_page_size with PG_LEVEL_XXDavid Matlack
x86_page_size is an enum used to communicate the desired page size with which to map a range of memory. Under the hood they just encode the desired level at which to map the page. This ends up being clunky in a few ways: - The name suggests it encodes the size of the page rather than the level. - In other places in x86_64/processor.c we just use a raw int to encode the level. Simplify this by adopting the kernel style of PG_LEVEL_XX enums and pass around raw ints when referring to the level. This makes the code easier to understand since these macros are very common in KVM MMU code. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSEPaolo Bonzini
Commit 74fd41ed16fd ("KVM: x86: nSVM: support PAUSE filtering when L0 doesn't intercept PAUSE") introduced passthrough support for nested pause filtering, (when the host doesn't intercept PAUSE) (either disabled with kvm module param, or disabled with '-overcommit cpu-pm=on') Before this commit, L1 KVM didn't intercept PAUSE at all; afterwards, the feature was exposed as supported by KVM cpuid unconditionally, thus if L1 could try to use it even when the L0 KVM can't really support it. In this case the fallback caused KVM to intercept each PAUSE instruction; in some cases, such intercept can slow down the nested guest so much that it can fail to boot. Instead, before the problematic commit KVM was already setting both thresholds to 0 in vmcb02, but after the first userspace VM exit shrink_ple_window was called and would reset the pause_filter_count to the default value. To fix this, change the fallback strategy - ignore the guest threshold values, but use/update the host threshold values unless the guest specifically requests disabling PAUSE filtering (either simple or advanced). Also fix a minor bug: on nested VM exit, when PAUSE filter counter were copied back to vmcb01, a dirty bit was not set. Thanks a lot to Suravee Suthikulpanit for debugging this! Fixes: 74fd41ed16fd ("KVM: x86: nSVM: support PAUSE filtering when L0 doesn't intercept PAUSE") Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220518072709.730031-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/putMaxim Levitsky
Now that these functions are always called with preemption disabled, remove the preempt_disable()/preempt_enable() pair inside them. No functional change intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-8-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blockingMaxim Levitsky
On SVM, if preemption happens right after the call to finish_rcuwait but before call to kvm_arch_vcpu_unblocking on SVM/AVIC, it itself will re-enable AVIC, and then we will try to re-enable it again in kvm_arch_vcpu_unblocking which will lead to a warning in __avic_vcpu_load. The same problem can happen if the vCPU is preempted right after the call to kvm_arch_vcpu_blocking but before the call to prepare_to_rcuwait and in this case, we will end up with AVIC enabled during sleep - Ooops. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-7-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: disable preemption while updating apicv inhibitionMaxim Levitsky
Currently nothing prevents preemption in kvm_vcpu_update_apicv. On SVM, If the preemption happens after we update the vcpu->arch.apicv_active, the preemption itself will 'update' the inhibition since the AVIC will be first disabled on vCPU unload and then enabled, when the current task is loaded again. Then we will try to update it again, which will lead to a warning in __avic_vcpu_load, that the AVIC is already enabled. Fix this by disabling preemption in this code. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-6-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: SVM: fix avic_kick_target_vcpus_fastMaxim Levitsky
There are two issues in avic_kick_target_vcpus_fast 1. It is legal to issue an IPI request with APIC_DEST_NOSHORT and a physical destination of 0xFF (or 0xFFFFFFFF in case of x2apic), which must be treated as a broadcast destination. Fix this by explicitly checking for it. Also don’t use ‘index’ in this case as it gives no new information. 2. It is legal to issue a logical IPI request to more than one target. Index field only provides index in physical id table of first such target and therefore can't be used before we are sure that only a single target was addressed. Instead, parse the ICRL/ICRH, double check that a unicast interrupt was requested, and use that info to figure out the physical id of the target vCPU. At that point there is no need to use the index field as well. In addition to fixing the above issues, also skip the call to kvm_apic_match_dest. It is possible to do this now, because now as long as AVIC is not inhibited, it is guaranteed that none of the vCPUs changed their apic id from its default value. This fixes boot of windows guest with AVIC enabled because it uses IPI with 0xFF destination and no destination shorthand. Fixes: 7223fd2d5338 ("KVM: SVM: Use target APIC ID to complete AVIC IRQs when possible") Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: SVM: remove avic's broken code that updated APIC IDMaxim Levitsky
AVIC is now inhibited if the guest changes the apic id, and therefore this code is no longer needed. There are several ways this code was broken, including: 1. a vCPU was only allowed to change its apic id to an apic id of an existing vCPU. 2. After such change, the vCPU whose apic id entry was overwritten, could not correctly change its own apic id, because its own entry is already overwritten. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC baseMaxim Levitsky
Neither of these settings should be changed by the guest and it is a burden to support it in the acceleration code, so just inhibit this code instead. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86: document AVIC/APICv inhibit reasonsMaxim Levitsky
These days there are too many AVIC/APICv inhibit reasons, and it doesn't hurt to have some documentation for them. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220606180829.102503-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRsYuan Yao
Assign shadow_me_value, not shadow_me_mask, to PAE root entries, a.k.a. shadow PDPTRs, when host memory encryption is supported. The "mask" is the set of all possible memory encryption bits, e.g. MKTME KeyIDs, whereas "value" holds the actual value that needs to be stuffed into host page tables. Using shadow_me_mask results in a failed VM-Entry due to setting reserved PA bits in the PDPTRs, and ultimately causes an OOPS due to physical addresses with non-zero MKTME bits sending to_shadow_page() into the weeds: set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state. BUG: unable to handle page fault for address: ffd43f00063049e8 PGD 86dfd8067 P4D 0 Oops: 0000 [#1] PREEMPT SMP RIP: 0010:mmu_free_root_page+0x3c/0x90 [kvm] kvm_mmu_free_roots+0xd1/0x200 [kvm] __kvm_mmu_unload+0x29/0x70 [kvm] kvm_mmu_unload+0x13/0x20 [kvm] kvm_arch_destroy_vm+0x8a/0x190 [kvm] kvm_put_kvm+0x197/0x2d0 [kvm] kvm_vm_release+0x21/0x30 [kvm] __fput+0x8e/0x260 ____fput+0xe/0x10 task_work_run+0x6f/0xb0 do_exit+0x327/0xa90 do_group_exit+0x35/0xa0 get_signal+0x911/0x930 arch_do_signal_or_restart+0x37/0x720 exit_to_user_mode_prepare+0xb2/0x140 syscall_exit_to_user_mode+0x16/0x30 do_syscall_64+0x4e/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: e54f1ff244ac ("KVM: x86/mmu: Add shadow_me_value and repurpose shadow_me_mask") Signed-off-by: Yuan Yao <yuan.yao@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-Id: <20220608012015.19566-1-yuan.yao@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-09Merge tag 'kvmarm-fixes-5.19-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.19, take #1 - Properly reset the SVE/SME flags on vcpu load - Fix a vgic-v2 regression regarding accessing the pending state of a HW interrupt from userspace (and make the code common with vgic-v3) - Fix access to the idreg range for protected guests - Ignore 'kvm-arm.mode=protected' when using VHE - Return an error from kvm_arch_init_vm() on allocation failure - A bunch of small cleanups (comments, annotations, indentation)
2022-06-09drm/atomic: fix warning of unused variableGONG, Ruiqi
Fix the `unused-but-set-variable` warning as how other iteration wrappers do. Link: https://lore.kernel.org/all/202206071049.pofHsRih-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220607110848.941486-1-gongruiqi1@huawei.com
2022-06-09Merge tag 'kvm-riscv-fixes-5.19-1' of https://github.com/kvm-riscv/linux ↵Paolo Bonzini
into HEAD KVM/riscv fixes for 5.19, take #1 - Typo fix in arch/riscv/kvm/vmid.c - Remove broken reference pattern from MAINTAINERS entry
2022-06-09powerpc/32: Fix overread/overwrite of thread_struct via ptraceMichael Ellerman
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process to read/write registers of another process. To get/set a register, the API takes an index into an imaginary address space called the "USER area", where the registers of the process are laid out in some fashion. The kernel then maps that index to a particular register in its own data structures and gets/sets the value. The API only allows a single machine-word to be read/written at a time. So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels. The way floating point registers (FPRs) are addressed is somewhat complicated, because double precision float values are 64-bit even on 32-bit CPUs. That means on 32-bit kernels each FPR occupies two word-sized locations in the USER area. On 64-bit kernels each FPR occupies one word-sized location in the USER area. Internally the kernel stores the FPRs in an array of u64s, or if VSX is enabled, an array of pairs of u64s where one half of each pair stores the FPR. Which half of the pair stores the FPR depends on the kernel's endianness. To handle the different layouts of the FPRs depending on VSX/no-VSX and big/little endian, the TS_FPR() macro was introduced. Unfortunately the TS_FPR() macro does not take into account the fact that the addressing of each FPR differs between 32-bit and 64-bit kernels. It just takes the index into the "USER area" passed from userspace and indexes into the fp_state.fpr array. On 32-bit there are 64 indexes that address FPRs, but only 32 entries in the fp_state.fpr array, meaning the user can read/write 256 bytes past the end of the array. Because the fp_state sits in the middle of the thread_struct there are various fields than can be overwritten, including some pointers. As such it may be exploitable. It has also been observed to cause systems to hang or otherwise misbehave when using gdbserver, and is probably the root cause of this report which could not be easily reproduced: https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@keymile.com/ Rather than trying to make the TS_FPR() macro even more complicated to fix the bug, or add more macros, instead add a special-case for 32-bit kernels. This is more obvious and hopefully avoids a similar bug happening again in future. Note that because 32-bit kernels never have VSX enabled the code doesn't need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to ensure that 32-bit && VSX is never enabled. Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds") Cc: stable@vger.kernel.org # v3.13+ Reported-by: Ariel Miculas <ariel.miculas@belden.com> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
2022-06-09Revert "mtd: rawnand: add support for Toshiba TC58NVG0S3HTA00 NAND flash"Christian Lamparter
This reverts commit 3380557fc7e28d9bce7607e16d98f123d36da4ca. It turned out this "4-byte" ID might have been an honest mistake. Regrettably, the chip Andreas has might be a counterfeit or is damaged in some other way and shouldn't have ended up in a router. Andreas reported his chip is returning just four bytes: "98 f1 80 15 00 00 00 00". However, according to Kioxia/Toshiba's datasheet, there should have been at least another byte that would have contained the correct OOB size that Andreas needed. Miquel and Andreas are both favoring reverting the patch over further, possibly hacky modifications: "[Reverting] is the safest option here. Apart from this device, we do not know how many devices have these damaged/counterfeit chips. If it is just a couple and only on Fritzboxes, as suggested in the Github issue the patch could be carried through OpenWrt[...]" Thanks to several users on the openwrt forum and github issue, who stayed along for the ride: - Peter-vdL for reporting the issue and testing patches. - neg2led and Hannu Nyman who did all the datasheet digging and debugging. Cc: Andreas Boehler <dev@aboehler.at> Suggested-by: Andreas Boehler <dev@aboehler.at> Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://github.com/openwrt/openwrt/issues/9962 Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220607185918.1048204-1-chunkeey@gmail.com
2022-06-09USB: serial: option: add support for Cinterion MV31 with new baselineSlark Xiao
Adding support for Cinterion device MV31 with Qualcomm new baseline. Use different PIDs to separate it from previous base line products. All interfaces settings keep same as previous. Below is test evidence: T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1e2d ProdID=00b8 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00B8 USB Mobile Broadband S: SerialNumber=90418e79 C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1e2d ProdID=00b9 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00B9 USB Mobile Broadband S: SerialNumber=90418e79 C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option For PID 00b8, interface 3 is GNSS port which don't use serial driver. Signed-off-by: Slark Xiao <slark_xiao@163.com> Link: https://lore.kernel.org/r/20220601034740.5438-1-slark_xiao@163.com [ johan: rename defines using a "2" infix ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-09exfat: use updated exfat_chain directly during renamingSungjong Seo
In order for a file to access its own directory entry set, exfat_inode_info(ei) has two copied values. One is ei->dir, which is a snapshot of exfat_chain of the parent directory, and the other is ei->entry, which is the offset of the start of the directory entry set in the parent directory. Since the parent directory can be updated after the snapshot point, it should be used only for accessing one's own directory entry set. However, as of now, during renaming, it could try to traverse or to allocate clusters via snapshot values, it does not make sense. This potential problem has been revealed when exfat_update_parent_info() was removed by commit d8dad2588add ("exfat: fix referencing wrong parent directory information after renaming"). However, I don't think it's good idea to bring exfat_update_parent_info() back. Instead, let's use the updated exfat_chain of parent directory diectly. Fixes: d8dad2588add ("exfat: fix referencing wrong parent directory information after renaming") Reported-by: Wang Yugui <wangyugui@e16-tech.com> Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Tested-by: Wang Yugui <wangyugui@e16-tech.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-06-09KVM: arm64: Drop stale commentMarc Zyngier
The layout of 'struct kvm_vcpu_arch' has evolved significantly since the initial port of KVM/arm64, so remove the stale comment suggesting that a prefix of the structure is used exclusively from assembly code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-7-will@kernel.org
2022-06-09KVM: arm64: Remove redundant hyp_assert_lock_held() assertionsWill Deacon
host_stage2_try() asserts that the KVM host lock is held, so there's no need to duplicate the assertion in its wrappers. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-6-will@kernel.org
2022-06-09KVM: arm64: Extend comment in has_vhe()Will Deacon
has_vhe() expands to a compile-time constant when evaluated from the VHE or nVHE code, alternatively checking a static key when called from elsewhere in the kernel. On face value, this looks like a case of premature optimization, but in fact this allows symbol references on VHE-specific code paths to be dropped from the nVHE object. Expand the comment in has_vhe() to make this clearer, hopefully discouraging anybody from simplifying the code. Cc: David Brazdil <dbrazdil@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-5-will@kernel.org
2022-06-09KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHEWill Deacon
Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode() only returns KVM_MODE_PROTECTED on systems where the feature is available. Cc: David Brazdil <dbrazdil@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org
2022-06-09KVM: arm64: Handle all ID registers trapped for a protected VMMarc Zyngier
A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF, while it really should only get a zero back if the register is not handled by the hypervisor emulation (as mandated by the architecture). Introduce all the missing ID registers (including the unallocated ones), and have them to return 0. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-3-will@kernel.org
2022-06-09KVM: arm64: Return error from kvm_arch_init_vm() on allocation failureWill Deacon
If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm() then be sure to return -ENOMEM instead of success (0) on the failure path. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org
2022-06-09USB: serial: io_ti: add Agilent E5805A supportRobert Eckelmann
Add support for Agilent E5805A (rebranded ION Edgeport/4) to io_ti. Signed-off-by: Robert Eckelmann <longnoserob@gmail.com> Link: https://lore.kernel.org/r/20220521230808.30931eca@octoberrain Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2022-06-09watchdog: gxp: Add missing MODULE_LICENSEGuenter Roeck
The build system says: ERROR: modpost: missing MODULE_LICENSE() in drivers/watchdog/gxp-wdt.o Add the missing MODULE_LICENSE. Signed-off-by: Nick Hawkins <nick.hawkins@hpe.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/all/20220603131419.2948578-1-linux@roeck-us.net/ Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2022-06-09Merge tag 'intel-gpio-v5.19-2' of ↵Bartosz Golaszewski
gitolite.kernel.org:pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current intel-gpio for v5.19-2 * Convert IRQ chips in Diolan and Intel GPIO drivers to be immutable