summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-06-24KVM: x86/mmu: Always pass 0 for @quadrant when gptes are 8 bytesDavid Matlack
The quadrant is only used when gptes are 4 bytes, but mmu_alloc_{direct,shadow}_roots() pass in a non-zero quadrant for PAE page directories regardless. Make this less confusing by only passing in a non-zero quadrant when it is actually necessary. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-6-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Derive shadow MMU page role from parentDavid Matlack
Instead of computing the shadow page role from scratch for every new page, derive most of the information from the parent shadow page. This eliminates the dependency on the vCPU root role to allocate shadow page tables, and reduces the number of parameters to kvm_mmu_get_page(). Preemptively split out the role calculation to a separate function for use in a following commit. Note that when calculating the MMU root role, we can take @role.passthrough, @role.direct, and @role.access directly from @vcpu->arch.mmu->root_role. Only @role.level and @role.quadrant still must be overridden for PAE page directories, when shadowing 32-bit guest page tables with PAE page tables. No functional change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Stop passing "direct" to mmu_alloc_root()David Matlack
The "direct" argument is vcpu->arch.mmu->root_role.direct, because unlike non-root page tables, it's impossible to have a direct root in an indirect MMU. So just use that. Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Use a bool for directDavid Matlack
The parameter "direct" can either be true or false, and all of the callers pass in a bool variable or true/false literal, so just use the type bool. No functional change intended. Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Optimize MMU page cache lookup for all direct SPsDavid Matlack
Commit fb58a9c345f6 ("KVM: x86/mmu: Optimize MMU page cache lookup for fully direct MMUs") skipped the unsync checks and write flood clearing for full direct MMUs. We can extend this further to skip the checks for all direct shadow pages. Direct shadow pages in indirect MMUs (i.e. shadow paging) are used when shadowing a guest huge page with smaller pages. Such direct shadow pages, like their counterparts in fully direct MMUs, are never marked unsynced or have a non-zero write-flooding count. Checking sp->role.direct also generates better code than checking direct_map because, due to register pressure, direct_map has to get shoved onto the stack and then pulled back off. No functional change intended. Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Cache binary stats metadata for duration of testBen Gardon
In order to improve performance across multiple reads of VM stats, cache the stats metadata in the VM struct. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-11-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Test disabling NX hugepages on a VMBen Gardon
Add an argument to the NX huge pages test to test disabling the feature on a VM using the new capability. Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-10-bgardon@google.com> [Handle failure of sudo or setcap more gracefully. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Add NX huge pages testBen Gardon
There's currently no test coverage of NX hugepages in KVM selftests, so add a basic test to ensure that the feature works as intended. The test creates a VM with a data slot backed with huge pages. The memory in the data slot is filled with op-codes for the return instruction. The guest then executes a series of accesses on the memory, some reads, some instruction fetches. After each operation, the guest exits and the test performs some checks on the backing page counts to ensure that NX page splitting an reclaim work as expected. Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-7-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basisBen Gardon
In some cases, the NX hugepage mitigation for iTLB multihit is not needed for all guests on a host. Allow disabling the mitigation on a per-VM basis to avoid the performance hit of NX hugepages on trusted workloads. In order to disable NX hugepages on a VM, ensure that the userspace actor has permission to reboot the system. Since disabling NX hugepages would allow a guest to crash the system, it is similar to reboot permissions. Ideally, KVM would require userspace to prove it has access to KVM's nx_huge_pages module param, e.g. so that userspace can opt out without needing full reboot permissions. But getting access to the module param file info is difficult because it is buried in layers of sysfs and module glue. Requiring CAP_SYS_BOOT is sufficient for all known use cases. Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-9-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Fix errant brace in KVM capability handlingBen Gardon
The braces around the KVM_CAP_XSAVE2 block also surround the KVM_CAP_PMU_CAPABILITY block, likely the result of a merge issue. Simply move the curly brace back to where it belongs. Fixes: ba7bb663f5547 ("KVM: x86: Provide per VM capability for disabling PMU virtualization") Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-8-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Read binary stat data in libBen Gardon
Move the code to read the binary stats data to the KVM selftests library. It will be re-used by other tests to check KVM behavior. Also opportunistically remove an unnecessary calculation with "size_data" in stats_test. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-6-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Clean up coding style in binary stats testSean Christopherson
Fix a variety of code style violations and/or inconsistencies in the binary stats test. The 80 char limit is a soft limit and can and should be ignored/violated if doing so improves the overall code readability. Specifically, provide consistent indentation and don't split expressions at arbitrary points just to honor the 80 char limit. Opportunistically expand/add comments to call out the more subtle aspects of the code. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-5-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Read binary stats desc in libBen Gardon
Move the code to read the binary stats descriptors to the KVM selftests library. It will be re-used by other tests to check KVM behavior. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-4-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Read binary stats header in libBen Gardon
Move the code to read the binary stats header to the KVM selftests library. It will be re-used by other tests to check KVM behavior. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-3-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Remove dynamic memory allocation for stats headerBen Gardon
There's no need to allocate dynamic memory for the stats header since its size is known at compile time. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: SEV: Init target VMCBs in sev_migrate_fromPeter Gonda
The target VMCBs during an intra-host migration need to correctly setup for running SEV and SEV-ES guests. Add sev_init_vmcb() function and make sev_es_init_vmcb() static. sev_init_vmcb() uses the now private function to init SEV-ES guests VMCBs when needed. Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration") Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20220623173406.744645-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/svm: add __GFP_ACCOUNT to __sev_dbg_{en,de}crypt_user()Mingwei Zhang
Adding the accounting flag when allocating pages within the SEV function, since these memory pages should belong to individual VM. No functional change intended. Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20220623171858.2083637-1-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24virtio: disable notification hardening by defaultJason Wang
We try to harden virtio device notifications in 8b4ec69d7e09 ("virtio: harden vring IRQ"). It works with the assumption that the driver or core can properly call virtio_device_ready() at the right place. Unfortunately, this seems to be not true and uncover various bugs of the existing drivers, mainly the issue of using virtio_device_ready() incorrectly. So let's add a Kconfig option and disable it by default. It gives us time to fix the drivers and then we can consider re-enabling it. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220622012940.21441-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2022-06-24virtio: Remove unnecessary variable assignmentsBo Liu
In function vp_modern_probe(), "pci_dev" is initialized with the value of "mdev->pci_dev", so assigning "pci_dev" to "mdev->pci_dev" is unnecessary since they store the same value. Signed-off-by: Bo Liu <liubo03@inspur.com> Message-Id: <20220617055952.5364-1-liubo03@inspur.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-06-24virtio_ring : keep used_wrap_counter in vq->last_used_idxhuangjie.albert
the used_wrap_counter and the vq->last_used_idx may get out of sync if they are separate assignment,and interrupt might use an incorrect value to check for the used index. for example:OOB access ksoftirqd may consume the packet and it will call: virtnet_poll -->virtnet_receive -->virtqueue_get_buf_ctx -->virtqueue_get_buf_ctx_packed and in virtqueue_get_buf_ctx_packed: vq->last_used_idx += vq->packed.desc_state[id].num; if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { vq->last_used_idx -= vq->packed.vring.num; vq->packed.used_wrap_counter ^= 1; } if at the same time, there comes a vring interrupt,in vring_interrupt: we will call: vring_interrupt -->more_used -->more_used_packed -->is_used_desc_packed in is_used_desc_packed, the last_used_idx maybe >= vq->packed.vring.num. so this could case a memory out of bounds bug. this patch is to keep the used_wrap_counter in vq->last_used_idx so we can get the correct value to check for used index in interrupt. v3->v4: - use READ_ONCE/WRITE_ONCE to get/set vq->last_used_idx v2->v3: - add inline function to get used_wrap_counter and last_used - when use vq->last_used_idx, only read once if vq->last_used_idx is read twice, the values can be inconsistent. - use last_used_idx & ~(-(1 << VRING_PACKED_EVENT_F_WRAP_CTR)) to get the all bits below VRING_PACKED_EVENT_F_WRAP_CTR v1->v2: - reuse the VRING_PACKED_EVENT_F_WRAP_CTR - Remove parameter judgment in is_used_desc_packed, because it can't be illegal Signed-off-by: huangjie.albert <huangjie.albert@bytedance.com> Message-Id: <20220617020411.80367-1-huangjie.albert@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-24vduse: Tie vduse mgmtdev and its deviceParav Pandit
vduse devices are not backed by any real devices such as PCI. Hence it doesn't have any parent device linked to it. Kernel driver model in [1] suggests to avoid an empty device release callback. Hence tie the mgmtdevice object's life cycle to an allocate dummy struct device instead of static one. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284 Signed-off-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220613195223.473966-1-parav@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-24vdpa/mlx5: Initialize CVQ vringh only onceEli Cohen
Currently, CVQ vringh is initialized inside setup_virtqueues() which is called every time a memory update is done. This is undesirable since it resets all the context of the vring, including the available and used indices. Move the initialization to mlx5_vdpa_set_status() when VIRTIO_CONFIG_S_DRIVER_OK is set. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-06-24vdpa/mlx5: Update Control VQ callback informationEli Cohen
The control VQ specific information is stored in the dedicated struct mlx5_control_vq. When the callback is updated through mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct. Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com)
2022-06-23ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATANamjae Jeon
FileOffset should not be greater than BeyondFinalZero in FSCTL_ZERO_DATA. And don't call ksmbd_vfs_zero_data() if length is zero. Cc: stable@vger.kernel.org Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23ksmbd: set the range of bytes to zero without extending file size in ↵Namjae Jeon
FSCTL_ZERO_DATA generic/091, 263 test failed since commit f66f8b94e7f2 ("cifs: when extending a file with falloc we should make files not-sparse"). FSCTL_ZERO_DATA sets the range of bytes to zero without extending file size. The VFS_FALLOCATE_FL_KEEP_SIZE flag should be used even on non-sparse files. Cc: stable@vger.kernel.org Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23ksmbd: remove duplicate flag set in smb2_writeHyunchul Lee
The writethrough flag is set again if is_rdma_channel is false. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-23selftests/net: pass ipv6_args to udpgso_bench's IPv6 TCP testDimitris Michailidis
udpgso_bench.sh has been running its IPv6 TCP test with IPv4 arguments since its initial conmit. Looks like a typo. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Cc: willemb@google.com Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20220623000234.61774-1-dmichail@fungible.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()Eric Dumazet
syzbot reported uninit-value in tcp_recvmsg() [1] Issue here is that msg->msg_get_inq should have been cleared, otherwise tcp_recvmsg() might read garbage and perform more work than needed, or have undefined behavior. Given CONFIG_INIT_STACK_ALL_ZERO=y is probably going to be the default soon, I chose to change __sys_recvfrom() to clear all fields but msghdr.addr which might be not NULL. For __copy_msghdr_from_user(), I added an explicit clear of kmsg->msg_get_inq. [1] BUG: KMSAN: uninit-value in tcp_recvmsg+0x6cf/0xb60 net/ipv4/tcp.c:2557 tcp_recvmsg+0x6cf/0xb60 net/ipv4/tcp.c:2557 inet_recvmsg+0x13a/0x5a0 net/ipv4/af_inet.c:850 sock_recvmsg_nosec net/socket.c:995 [inline] sock_recvmsg net/socket.c:1013 [inline] __sys_recvfrom+0x696/0x900 net/socket.c:2176 __do_sys_recvfrom net/socket.c:2194 [inline] __se_sys_recvfrom net/socket.c:2190 [inline] __x64_sys_recvfrom+0x122/0x1c0 net/socket.c:2190 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Local variable msg created at: __sys_recvfrom+0x81/0x900 net/socket.c:2154 __do_sys_recvfrom net/socket.c:2194 [inline] __se_sys_recvfrom net/socket.c:2190 [inline] __x64_sys_recvfrom+0x122/0x1c0 net/socket.c:2190 CPU: 0 PID: 3493 Comm: syz-executor170 Not tainted 5.19.0-rc3-syzkaller-30868-g4b28366af7d9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: f94fd25cb0aa ("tcp: pass back data left in socket after receive") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Tested-by: Alexander Potapenko<glider@google.com> Link: https://lore.kernel.org/r/20220622150220.1091182-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23net/ncsi: use proper "mellanox" DT vendor prefixKrzysztof Kozlowski
"mlx" Devicetree vendor prefix is not documented and instead "mellanox" should be used. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220622115416.7400-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24powerpc/prom_init: Fix kernel config grepLiam Howlett
When searching for config options, use the KCONFIG_CONFIG shell variable so that builds using non-standard config locations work. Fixes: 26deb04342e3 ("powerpc: prepare string/mem functions for KASAN") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220624011745.4060795-1-Liam.Howlett@oracle.com
2022-06-23net: dsa: bcm_sf2: force pause link settingsDoug Berger
The pause settings reported by the PHY should also be applied to the GMII port status override otherwise the switch will not generate pause frames towards the link partner despite the advertisement saying otherwise. Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver") Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220623030204.1966851-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23net/dsa/hirschmann: Add missing of_node_get() in hellcreek_led_setup()Liang He
of_find_node_by_name() will decrease the refcount of its first arg and we need a of_node_get() to keep refcount balance. Fixes: 7d9ee2e8ff15 ("net: dsa: hellcreek: Add PTP status LEDs") Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220622040621.4094304-1-windhl@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24powerpc/book3e: Fix PUD allocation size in map_kernel_page()Christophe Leroy
Commit 2fb4706057bc ("powerpc: add support for folded p4d page tables") erroneously changed PUD setup to a mix of PMD and PUD. Fix it. While at it, use PTE_TABLE_SIZE instead of PAGE_SIZE for PTE tables in order to avoid any confusion. Fixes: 2fb4706057bc ("powerpc: add support for folded p4d page tables") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/95ddfd6176d53e6c85e13bd1c358359daa56775f.1655974558.git.christophe.leroy@csgroup.eu
2022-06-24powerpc/xive/spapr: correct bitmap allocation sizeNathan Lynch
kasan detects access beyond the end of the xibm->bitmap allocation: BUG: KASAN: slab-out-of-bounds in _find_first_zero_bit+0x40/0x140 Read of size 8 at addr c00000001d1d0118 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc2-00001-g90df023b36dd #28 Call Trace: [c00000001d98f770] [c0000000012baab8] dump_stack_lvl+0xac/0x108 (unreliable) [c00000001d98f7b0] [c00000000068faac] print_report+0x37c/0x710 [c00000001d98f880] [c0000000006902c0] kasan_report+0x110/0x354 [c00000001d98f950] [c000000000692324] __asan_load8+0xa4/0xe0 [c00000001d98f970] [c0000000011c6ed0] _find_first_zero_bit+0x40/0x140 [c00000001d98f9b0] [c0000000000dbfbc] xive_spapr_get_ipi+0xcc/0x260 [c00000001d98fa70] [c0000000000d6d28] xive_setup_cpu_ipi+0x1e8/0x450 [c00000001d98fb30] [c000000004032a20] pSeries_smp_probe+0x5c/0x118 [c00000001d98fb60] [c000000004018b44] smp_prepare_cpus+0x944/0x9ac [c00000001d98fc90] [c000000004009f9c] kernel_init_freeable+0x2d4/0x640 [c00000001d98fd90] [c0000000000131e8] kernel_init+0x28/0x1d0 [c00000001d98fe10] [c00000000000cd54] ret_from_kernel_thread+0x5c/0x64 Allocated by task 0: kasan_save_stack+0x34/0x70 __kasan_kmalloc+0xb4/0xf0 __kmalloc+0x268/0x540 xive_spapr_init+0x4d0/0x77c pseries_init_irq+0x40/0x27c init_IRQ+0x44/0x84 start_kernel+0x2a4/0x538 start_here_common+0x1c/0x20 The buggy address belongs to the object at c00000001d1d0118 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes inside of 8-byte region [c00000001d1d0118, c00000001d1d0120) The buggy address belongs to the physical page: page:c00c000000074740 refcount:1 mapcount:0 mapping:0000000000000000 index:0xc00000001d1d0558 pfn:0x1d1d flags: 0x7ffff000000200(slab|node=0|zone=0|lastcpupid=0x7ffff) raw: 007ffff000000200 c00000001d0003c8 c00000001d0003c8 c00000001d010480 raw: c00000001d1d0558 0000000001e1000a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: c00000001d1d0000: fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc c00000001d1d0080: fc fc 00 fc fc fc fc fc fc fc fc fc fc fc fc fc >c00000001d1d0100: fc fc fc 02 fc fc fc fc fc fc fc fc fc fc fc fc ^ c00000001d1d0180: fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc fc c00000001d1d0200: fc fc fc fc fc 04 fc fc fc fc fc fc fc fc fc fc This happens because the allocation uses the wrong unit (bits) when it should pass (BITS_TO_LONGS(count) * sizeof(long)) or equivalent. With small numbers of bits, the allocated object can be smaller than sizeof(long), which results in invalid accesses. Use bitmap_zalloc() to allocate and initialize the irq bitmap, paired with bitmap_free() for consistency. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220623182509.3985625-1-nathanl@linux.ibm.com
2022-06-23memregion: Fix memregion_free() fallback definitionDan Williams
In the CONFIG_MEMREGION=n case, memregion_free() is meant to be a static inline. 0day reports: In file included from drivers/cxl/core/port.c:4: include/linux/memregion.h:19:6: warning: no previous prototype for function 'memregion_free' [-Wmissing-prototypes] Mark memregion_free() static. Fixes: 33dd70752cd7 ("lib: Uplevel the pmem "region" ida to a global allocator") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165601455171.4042645.3350844271068713515.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-24Merge tag 'drm-msm-fixes-2022-06-20' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v5.19-rc4 - Workaround for parade DSI bridge power sequencing - Fix for multi-planar YUV format offsets - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a being hit frequently in CI. - Fix to add minimum ICC vote in the msm_mdss pm_resume path to address bootup splats - Fix to avoid dereferencing without checking in WB encoder - Fix to avoid crash during suspend in DP driver by ensuring interrupt mask bits are updated - Remove unused code from dpu_encoder_virt_atomic_check() - Fix to remove redundant init of dsc variable - Fix to ensure mmap offset is initialized to avoid memory corruption from unpin/evict - Fix double runpm disable in probe-defer path - VMA fenced-unpin fixes - Fix for WB max-width - Fix for rare dp resolution change issue Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvdsOF1-+WfTWyEyu33XPcvxOCU00G-dz7EF2J+fdyUHg@mail.gmail.com
2022-06-24Merge tag 'drm-intel-fixes-2022-06-22' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.19-rc4: - Revert low voltage SKU check removal to fix display issues - Apply PLL DCO fraction workaround for ADL-S - Don't show engine classes not present in client fdinfo Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87a6a4syrr.fsf@intel.com
2022-06-24Merge tag 'drm-misc-fixes-2022-06-23' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Multiple fixes in sun4i for suspend, DDC, DMA setup; A rework of vc4 to properly split the driver between hardware capabilities that wasn't done properly causing multiple crashes; and a panel quirk for Aya Neo Next Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220623064152.ubjmnpj7tdejdcw6@houat
2022-06-24Merge tag 'amd-drm-fixes-5.19-2022-06-22' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-06-22: amdgpu: - Adjust GTT size logic - eDP fix for RMB - DCN 3.15 fix - DP training fix - Color encoding fix for DCN2+ Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220622214106.5984-1-alexander.deucher@amd.com
2022-06-23gpio: mxs: Fix header commentStefan Wahren
This driver is about MXS GPIO support. MXC is a different platform. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23xfs: introduce xfs_inodegc_push()Dave Chinner
The current blocking mechanism for pushing the inodegc queue out to disk can result in systems becoming unusable when there is a long running inodegc operation. This is because the statfs() implementation currently issues a blocking flush of the inodegc queue and a significant number of common system utilities will call statfs() to discover something about the underlying filesystem. This can result in userspace operations getting stuck on inodegc progress, and when trying to remove a heavily reflinked file on slow storage with a full journal, this can result in delays measuring in hours. Avoid this problem by adding "push" function that expedites the flushing of the inodegc queue, but doesn't wait for it to complete. Convert xfs_fs_statfs() and xfs_qm_scall_getquota() to use this mechanism so they don't block but still ensure that queued operations are expedited. Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") Reported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Dave Chinner <dchinner@redhat.com> [djwong: fix _getquota_next to use _inodegc_push too] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23xfs: bound maximum wait time for inodegc workDave Chinner
Currently inodegc work can sit queued on the per-cpu queue until the workqueue is either flushed of the queue reaches a depth that triggers work queuing (and later throttling). This means that we could queue work that waits for a long time for some other event to trigger flushing. Hence instead of just queueing work at a specific depth, use a delayed work that queues the work at a bound time. We can still schedule the work immediately at a given depth, but we no long need to worry about leaving a number of items on the list that won't get processed until external events prevail. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-06-23gpio: Fix kernel-doc comments to nested unionAkira Yokosawa
Commit 48ec13d36d3f ("gpio: Properly document parent data union") is supposed to have fixed a warning from "make htmldocs" regarding kernel-doc comments to union members. However, the same warning still remains [1]. Fix the issue by following the example found in section "Nested structs/unions" of Documentation/doc-guide/kernel-doc.rst. Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 48ec13d36d3f ("gpio: Properly document parent data union") Link: https://lore.kernel.org/r/20220606093302.21febee3@canb.auug.org.au/ [1] Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Bartosz Golaszewski <brgl@bgdev.pl> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-06-23cpufreq: amd-pstate: Add resume and suspend callbacksJinzhou Su
When system resumes from S3, the CPPC enable register will be cleared and reset to 0. So enable the CPPC interface by writing 1 to this register on system resume and disable it during system suspend. Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> [ rjw: Subject and changelog edits ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-23Merge tag 'pm-5.19-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a recent regression preventing some systems from powering off after saving a hibernation image (Dmitry Osipenko)" * tag 'pm-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: Use kernel_can_power_off()
2022-06-23Merge tag 'random-5.19-rc4-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fixes from Jason Donenfeld: - A change to schedule the interrupt randomness mixing less often, yet credit a little more each time, to reduce overhead during interrupt storms. - Squelch an undesired pr_warn() from __ratelimit(), which was causing problems in the reporters' CI. - A trivial comment fix. * tag 'random-5.19-rc4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: update comment from copy_to_user() -> copy_to_iter() random: quiet urandom warning ratelimit suppression message random: schedule mix_interrupt_randomness() less often
2022-06-23dm mirror log: clear log bits up to BITS_PER_LONG boundaryMikulas Patocka
Commit 85e123c27d5c ("dm mirror log: round up region bitmap size to BITS_PER_LONG") introduced a regression on 64-bit architectures in the lvm testsuite tests: lvcreate-mirror, mirror-names and vgsplit-operation. If the device is shrunk, we need to clear log bits beyond the end of the device. The code clears bits up to a 32-bit boundary and then calculates lc->sync_count by summing set bits up to a 64-bit boundary (the commit changed that; previously, this boundary was 32-bit too). So, it was using some non-zeroed bits in the calculation and this caused misbehavior. Fix this regression by clearing bits up to BITS_PER_LONG boundary. Fixes: 85e123c27d5c ("dm mirror log: round up region bitmap size to BITS_PER_LONG") Cc: stable@vger.kernel.org Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-23dm: fix BLK_STS_DM_REQUEUE handling when dm_io represents split bioMing Lei
Commit 7dd76d1feec7 ("dm: improve bio splitting and associated IO accounting") removed using cloned bio when dm io splitting is needed. Using bio_trim()+bio_inc_remaining() rather than bio_split()+bio_chain() causes multiple dm_io instances to share the same original bio, and it works fine if IOs are completed successfully. But a regression was caused for the case when BLK_STS_DM_REQUEUE is returned from any one of DM's cloned bios (whose dm_io share the same orig_bio). In this BLK_STS_DM_REQUEUE case only the mapped subset of the original bio for the current exact dm_io needs to be re-submitted. However, since the original bio is shared among all dm_io instances, the ->orig_bio actually only represents the last dm_io instance, so requeue can't work as expected. Also when more than one dm_io is requeued, the same original bio is requeued from all dm_io's completion handler, then race is caused. Fix this issue by still allocating one clone bio for completing io only, then io accounting can rely on ->orig_bio being unmodified. This is needed because the dm_io's sector_offset and sectors members are recorded relative to an unmodified ->orig_bio. In the future, we can go back to using bio_trim()+bio_inc_remaining() for dm's io splitting but then delay needing a bio clone only when handling BLK_STS_DM_REQUEUE, but that approach is a bit complicated (so it needs a development cycle): 1) bio clone needs to be done in task context 2) a block interface for unwinding bio is required Fixes: 7dd76d1feec7 ("dm: improve bio splitting and associated IO accounting") Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-23drm/msm/dpu: Fix variable dereferenced before checksunliming
Fixes the following smatch warning: drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c:261 dpu_encoder_phys_wb_atomic_check() warn: variable dereferenced before check 'conn_state' Fixes: d7d0e73f7de3 ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback") Signed-off-by: sunliming <sunliming@kylinos.cn> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/490850/ Link: https://lore.kernel.org/r/20220623012707.453972-1-sunliming@kylinos.cn Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2022-06-23drm/msm/dp: reset drm_dev to NULL at dp_display_unbind()Kuogee Hsieh
During msm initialize phase, dp_display_unbind() will be called to undo initializations had been done by dp_display_bind() previously if there is error happen at msm_drm_bind. Under this kind of circumstance, drm_device may not be populated completed which causes system crash at drm_dev_dbg(). This patch reset drm_dev to NULL so that following drm_dev_dbg() will not refer to any internal fields of drm_device to prevent system from crashing. Below are panic stack trace, [ 53.584904] Unable to handle kernel paging request at virtual address 0000000070018001 . [ 53.702212] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD platform (rev5+) (DT) [ 53.710445] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 53.717596] pc : string_nocheck+0x1c/0x64 [ 53.721738] lr : string+0x54/0x60 [ 53.725162] sp : ffffffc013d6b650 [ 53.728590] pmr_save: 000000e0 [ 53.731743] x29: ffffffc013d6b650 x28: 0000000000000002 x27: 0000000000ffffff [ 53.739083] x26: ffffffc013d6b710 x25: ffffffd07a066ae0 x24: ffffffd07a419f97 [ 53.746420] x23: ffffffd07a419f99 x22: ffffff81fef360d4 x21: ffffff81fef364d4 [ 53.753760] x20: ffffffc013d6b6f8 x19: ffffffd07a06683c x18: 0000000000000000 [ 53.761093] x17: 4020386678302f30 x16: 00000000000000b0 x15: ffffffd0797523c8 [ 53.768429] x14: 0000000000000004 x13: ffff0000ffffff00 x12: ffffffd07a066b2c [ 53.775780] x11: 0000000000000000 x10: 000000000000013c x9 : 0000000000000000 [ 53.783117] x8 : ffffff81fef364d4 x7 : 0000000000000000 x6 : 0000000000000000 [ 53.790445] x5 : 0000000000000000 x4 : ffff0a00ffffff04 x3 : ffff0a00ffffff04 [ 53.797783] x2 : 0000000070018001 x1 : ffffffffffffffff x0 : ffffff81fef360d4 [ 53.805136] Call trace: [ 53.807667] string_nocheck+0x1c/0x64 [ 53.811439] string+0x54/0x60 [ 53.814498] vsnprintf+0x374/0x53c [ 53.818009] pointer+0x3dc/0x40c [ 53.821340] vsnprintf+0x398/0x53c [ 53.824854] vscnprintf+0x3c/0x88 [ 53.828274] __trace_array_vprintk+0xcc/0x2d4 [ 53.832768] trace_array_printk+0x8c/0xb4 [ 53.836900] drm_trace_printf+0x74/0x9c [ 53.840875] drm_dev_dbg+0xfc/0x1b8 [ 53.844480] dp_pm_suspend+0x70/0xf8 [ 53.848164] dpm_run_callback+0x60/0x1a0 [ 53.852222] __device_suspend+0x304/0x3f4 [ 53.856363] dpm_suspend+0xf8/0x3a8 [ 53.859959] dpm_suspend_start+0x8c/0xc0 Fixes: 570d3e5d28db ("drm/msm/dp: stop event kernel thread when DP unbind") Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/490756/ Link: https://lore.kernel.org/r/1655927731-22396-1-git-send-email-quic_khsieh@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>