summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-12-28nvme: also return I/O command effects from nvme_command_effectsChristoph Hellwig
To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28nvmet: don't defer passthrough commands with trivial effects to the workqueueChristoph Hellwig
Mask out the "Command Supported" and "Logical Block Content Change" bits and only defer execution of commands that have non-trivial effects to the workqueue for synchronous execution. This allows to execute admin commands asynchronously on controllers that provide a Command Supported and Effects log page, and will keep allowing to execute Write commands asynchronously once command effects on I/O commands are taken into account. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28nvmet: set the LBCC bit for commands that modify dataChristoph Hellwig
Write, Write Zeroes, Zone append and a Zone Reset through Zone Management Send modify the logical block content of a namespace, so make sure the LBCC bit is reported for them. Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-28nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding itChristoph Hellwig
Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-12-28nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definitionChristoph Hellwig
3 << 16 does not generate the correct mask for bits 16, 17 and 18. Use the GENMASK macro to generate the correct mask instead. Fixes: 84fef62d135b ("nvme: check admin passthru command effects") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2022-12-28docs, nvme: add a feature and quirk policy documentChristoph Hellwig
This adds a document about what specification features are supported by the Linux NVMe driver, and what qualifies for a quirk if an implementation has problems following the specification. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jonathan Corbet <corbet@lwn.net>
2022-12-28ALSA: hda/hdmi: Static PCM mapping again with AMD HDMI codecsTakashi Iwai
The recent code refactoring for HD-audio HDMI codec driver caused a regression on AMD/ATI HDMI codecs; namely, PulseAudioand pipewire don't recognize HDMI outputs any longer while the direct output via ALSA raw access still works. The problem turned out that, after the code refactoring, the driver assumes only the dynamic PCM assignment, and when a PCM stream that still isn't assigned to any pin gets opened, the driver tries to assign any free converter to the PCM stream. This behavior is OK for Intel and other codecs, as they have arbitrary connections between pins and converters. OTOH, on AMD chips that have a 1:1 mapping between pins and converters, this may end up with blocking the open of the next PCM stream for the pin that is tied with the formerly taken converter. Also, with the code refactoring, more PCM streams are exposed than necessary as we assume all converters can be used, while this isn't true for AMD case. This may change the PCM stream assignment and confuse users as well. This patch fixes those problems by: - Introducing a flag spec->static_pcm_mapping, and if it's set, the driver applies the static mapping between pins and converters at the probe time - Limiting the number of PCM streams per pins, too; this avoids the superfluous PCM streams Fixes: ef6f5494faf6 ("ALSA: hda/hdmi: Use only dynamic PCM device allocation") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216836 Co-developed-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20221228125714.16329-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-12-28Merge branch 'kvm-late-6.1-fixes' into HEADPaolo Bonzini
x86: * several fixes to nested VMX execution controls * fixes and clarification to the documentation for Xen emulation * do not unnecessarily release a pmu event with zero period * MMU fixes * fix Coverity warning in kvm_hv_flush_tlb() selftests: * fixes for the ucall mechanism in selftests * other fixes mostly related to compilation with clang
2022-12-28KVM: selftests: restore special vmmcall code layout needed by the harnessPaolo Bonzini
Commit 8fda37cf3d41 ("KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()", 2022-11-21) broke the svm_nested_soft_inject_test because it placed a "pop rbp" instruction after vmmcall. While this is correct and mimics what is done in the VMX case, this particular test expects a ud2 instruction right after the vmmcall, so that it can skip over it in the L1 part of the test. Inline a suitably-modified version of vmmcall() to restore the functionality of the test. Fixes: 8fda37cf3d41 ("KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()" Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221130181147.9911-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-28Documentation: kvm: clarify SRCU locking orderPaolo Bonzini
Currently only the locking order of SRCU vs kvm->slots_arch_lock and kvm->slots_lock is documented. Extend this to kvm->lock since Xen emulation got it terribly wrong. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-28KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESETPaolo Bonzini
While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running, if that happened it could cause a deadlock. This is due to kvm_xen_eventfd_reset() doing a synchronize_srcu() inside a kvm->lock critical section. To avoid this, first collect all the evtchnfd objects in an array and free all of them once the kvm->lock critical section is over and th SRCU grace period has expired. Reported-by: Michal Luczaj <mhal@rbox.co> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-28virtio_blk: Fix signedness bug in virtblk_prep_rq()Rafael Mendonca
The virtblk_map_data() function returns negative error codes, however, the 'nents' field of vbr->sg_table is an unsigned int, which causes the error handling not to work correctly. Cc: stable@vger.kernel.org Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Suwan Kim <suwan.kim027@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28vdpa_sim_net: should not drop the multicast/broadcast packetCindy Lu
In the receive_filter(), should not drop the packet with the broadcast/multicast address. Add the check for this Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221214054306.24145-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28vdpasim: fix memory leak when freeing IOTLBsJason Wang
After commit bda324fd037a ("vdpasim: control virtqueue support"), vdpasim->iommu became an array of IOTLB, so we should clean the mappings of each free one by one instead of just deleting the ranges in the first IOTLB which may leak maps. Fixes: bda324fd037a ("vdpasim: control virtqueue support") Cc: Gautam Dawar <gautam.dawar@xilinx.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221213090717.61529-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gautam Dawar <gautam.dawar@amd.com>
2022-12-28vdpa: conditionally fill max max queue pair for statsJason Wang
For the device without multiqueue feature, we will read 0 as max_virtqueue_pairs from the config. So if we fill VDPA_ATTR_DEV_NET_CFG_MAX_VQP with the value we read from the config we will confuse the user. Fixing this by only filling the value when multiqueue is offered by the device so userspace can assume 1 when the attr is not provided. Fixes: 13b00b135665c("vdpa: Add support for querying vendor statistics") Cc: Eli Cohen <elic@nvidia.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220907060110.4511-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>
2022-12-28vdpa/vp_vdpa: fix kfree a wrong pointer in vp_vdpa_removeRong Wang
In vp_vdpa_remove(), the code kfree(&vp_vdpa_mgtdev->mgtdev.id_table) uses a reference of pointer as the argument of kfree, which is the wrong pointer and then may hit crash like this: Unable to handle kernel paging request at virtual address 00ffff003363e30c Internal error: Oops: 96000004 [#1] SMP Call trace: rb_next+0x20/0x5c ext4_readdir+0x494/0x5c4 [ext4] iterate_dir+0x168/0x1b4 __se_sys_getdents64+0x68/0x170 __arm64_sys_getdents64+0x24/0x30 el0_svc_common.constprop.0+0x7c/0x1bc do_el0_svc+0x2c/0x94 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb4 el0_sync+0x160/0x180 Code: 54000220 f9400441 b4000161 aa0103e0 (f9400821) SMP: stopping secondary CPUs Starting crashdump kernel... Fixes: ffbda8e9df10 ("vdpa/vp_vdpa : add vdpa tool support in vp_vdpa") Signed-off-by: Rong Wang <wangrong68@huawei.com> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Message-Id: <20221207120813.2837529-1-sunnanyong@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cindy Lu <lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28vduse: Validate vq_num in vduse_validate_config()Harshit Mogalapalli
Add a limit to 'config->vq_num' which is user controlled data which comes from an vduse_ioctl to prevent large memory allocations. Micheal says - This limit is somewhat arbitrary. However, currently virtio pci and ccw are limited to a 16 bit vq number. While MMIO isn't it is also isn't used with lots of VQs due to current lack of support for per-vq interrupts. Thus, the 0xffff limit on number of VQs corresponding to a 16-bit VQ number seems sufficient for now. This is found using static analysis with smatch. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Message-Id: <20221128155717.2579992-1-harshit.m.mogalapalli@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28tools/virtio: remove smp_read_barrier_depends()Davidlohr Bueso
This gets rid of the last references to smp_read_barrier_depends() which for the kernel side was removed in v5.9. The serialization required for Alpha is done inside READ_ONCE() instead of having users deal with it. Simply use a full barrier, the architecture does not have rmb in the first place. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Message-Id: <20221128034347.990-3-dave@stgolabs.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-12-28tools/virtio: remove stray charactersDavidlohr Bueso
__read_once_size() is not a macro, remove those '/'s. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Message-Id: <20221128034347.990-2-dave@stgolabs.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-12-28vhost_vdpa: fix the crash in unmap a large memoryCindy Lu
While testing in vIOMMU, sometimes Guest will unmap very large memory, which will cause the crash. To fix this, add a new function vhost_vdpa_general_unmap(). This function will only unmap the memory that saved in iotlb. Call Trace: [ 647.820144] ------------[ cut here ]------------ [ 647.820848] kernel BUG at drivers/iommu/intel/iommu.c:1174! [ 647.821486] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 647.822082] CPU: 10 PID: 1181 Comm: qemu-system-x86 Not tainted 6.0.0-rc1home_lulu_2452_lulu7_vhost+ #62 [ 647.823139] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qem4 [ 647.824365] RIP: 0010:domain_unmap+0x48/0x110 [ 647.825424] Code: 48 89 fb 8d 4c f6 1e 39 c1 0f 4f c8 83 e9 0c 83 f9 3f 7f 18 48 89 e8 48 d3 e8 48 85 c0 75 59 [ 647.828064] RSP: 0018:ffffae5340c0bbf0 EFLAGS: 00010202 [ 647.828973] RAX: 0000000000000001 RBX: ffff921793d10540 RCX: 000000000000001b [ 647.830083] RDX: 00000000080000ff RSI: 0000000000000001 RDI: ffff921793d10540 [ 647.831214] RBP: 0000000007fc0100 R08: ffffae5340c0bcd0 R09: 0000000000000003 [ 647.832388] R10: 0000007fc0100000 R11: 0000000000100000 R12: 00000000080000ff [ 647.833668] R13: ffffae5340c0bcd0 R14: ffff921793d10590 R15: 0000008000100000 [ 647.834782] FS: 00007f772ec90640(0000) GS:ffff921ce7a80000(0000) knlGS:0000000000000000 [ 647.836004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 647.836990] CR2: 00007f02c27a3a20 CR3: 0000000101b0c006 CR4: 0000000000372ee0 [ 647.838107] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 647.839283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 647.840666] Call Trace: [ 647.841437] <TASK> [ 647.842107] intel_iommu_unmap_pages+0x93/0x140 [ 647.843112] __iommu_unmap+0x91/0x1b0 [ 647.844003] iommu_unmap+0x6a/0x95 [ 647.844885] vhost_vdpa_unmap+0x1de/0x1f0 [vhost_vdpa] [ 647.845985] vhost_vdpa_process_iotlb_msg+0xf0/0x90b [vhost_vdpa] [ 647.847235] ? _raw_spin_unlock+0x15/0x30 [ 647.848181] ? _copy_from_iter+0x8c/0x580 [ 647.849137] vhost_chr_write_iter+0xb3/0x430 [vhost] [ 647.850126] vfs_write+0x1e4/0x3a0 [ 647.850897] ksys_write+0x53/0xd0 [ 647.851688] do_syscall_64+0x3a/0x90 [ 647.852508] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 647.853457] RIP: 0033:0x7f7734ef9f4f [ 647.854408] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 76 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c8 [ 647.857217] RSP: 002b:00007f772ec8f040 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 647.858486] RAX: ffffffffffffffda RBX: 00000000fef00000 RCX: 00007f7734ef9f4f [ 647.859713] RDX: 0000000000000048 RSI: 00007f772ec8f090 RDI: 0000000000000010 [ 647.860942] RBP: 00007f772ec8f1a0 R08: 0000000000000000 R09: 0000000000000000 [ 647.862206] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000010 [ 647.863446] R13: 0000000000000002 R14: 0000000000000000 R15: ffffffff01100000 [ 647.864692] </TASK> [ 647.865458] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs v] [ 647.874688] ---[ end trace 0000000000000000 ]--- Cc: stable@vger.kernel.org Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend") Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221219073331.556140-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28virtio: Implementing attribute show with sysfs_emitDawei Li
Replace sprintf with sysfs_emit or its variants for their built-in PAGE_SIZE awareness. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Message-Id: <TYCP286MB23232A999FE7DBDF50BA0FAACA0F9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28virtio-crypto: fix memory leak in virtio_crypto_alg_skcipher_close_session()Wei Yongjun
'vc_ctrl_req' is alloced in virtio_crypto_alg_skcipher_close_session(), and should be freed in the invalid ctrl_status->status error handling case. Otherwise there is a memory leak. Fixes: 0756ad15b1fe ("virtio-crypto: use private buffer for control request") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Message-Id: <20221114110740.537276-1-weiyongjun@huaweicloud.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Acked-by: zhenwei pi<pizhenwei@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28tools/virtio: Variable type completionwangjianli
Replace "unsigned" with "unsigned int" Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Message-Id: <20221113070742.48271-1-wangjianli@cdjrlc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28vdpa_sim: fix vringh initialization in vdpasim_queue_ready()Stefano Garzarella
When we initialize vringh, we should pass the features and the number of elements in the virtqueue negotiated with the driver, otherwise operations with vringh may fail. This was discovered in a case where the driver sets a number of elements in the virtqueue different from the value returned by .get_vq_num_max(). In vdpasim_vq_reset() is safe to initialize the vringh with default values, since the virtqueue will not be used until vdpasim_queue_ready() is called again. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221110141335.62171-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-12-28virtio_blk: use UINT_MAX instead of -1UAngus Chen
We use UINT_MAX to limit max_discard_sectors in virtblk_probe, we can use UINT_MAX to limit max_hw_sectors for consistencies. No functional change intended. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Message-Id: <20221110030124.1986-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-12-28vhost-vdpa: fix an iotlb memory leakStefano Garzarella
Before commit 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") we called vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1) during release to free all the resources allocated when processing user IOTLB messages through vhost_vdpa_process_iotlb_update(). That commit changed the handling of IOTLB a bit, and we accidentally removed some code called during the release. We partially fixed this with commit 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") but a potential memory leak is still there as showed by kmemleak if the application does not send VHOST_IOTLB_INVALIDATE or crashes: unreferenced object 0xffff888007fbaa30 (size 16): comm "blkio-bench", pid 914, jiffies 4294993521 (age 885.500s) hex dump (first 16 bytes): 40 73 41 07 80 88 ff ff 00 00 00 00 00 00 00 00 @sA............. backtrace: [<0000000087736d2a>] kmem_cache_alloc_trace+0x142/0x1c0 [<0000000060740f50>] vhost_vdpa_process_iotlb_msg+0x68c/0x901 [vhost_vdpa] [<0000000083e8e205>] vhost_chr_write_iter+0xc0/0x4a0 [vhost] [<000000008f2f414a>] vhost_vdpa_chr_write_iter+0x18/0x20 [vhost_vdpa] [<00000000de1cd4a0>] vfs_write+0x216/0x4b0 [<00000000a2850200>] ksys_write+0x71/0xf0 [<00000000de8e720b>] __x64_sys_write+0x19/0x20 [<0000000018b12cbb>] do_syscall_64+0x3f/0x90 [<00000000986ec465>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Let's fix this calling vhost_vdpa_iotlb_unmap() on the whole range in vhost_vdpa_remove_as(). We move that call before vhost_dev_cleanup() since we need a valid v->vdev.mm in vhost_vdpa_pa_unmap(). vhost_iotlb_reset() call can be removed, since vhost_vdpa_iotlb_unmap() on the whole range removes all the entries. The kmemleak log reported was observed with a vDPA device that has `use_va` set to true (e.g. VDUSE). This patch has been tested with both types of devices. Fixes: 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Fixes: 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109154213.146789-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28vhost: fix range used in translate_desc()Stefano Garzarella
vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In translate_desc() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: a9709d6874d5 ("vhost: convert pre sorted vhost memory array to interval tree") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109102503.18816-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28vringh: fix range used in iotlb_translate()Stefano Garzarella
vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In iotlb_translate() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: 9ad9c49cfe97 ("vringh: IOTLB support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221109102503.18816-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28vhost/vsock: Fix error handling in vhost_vsock_init()Yuan Can
A problem about modprobe vhost_vsock failed is triggered with the following log given: modprobe: ERROR: could not insert 'vhost_vsock': Device or resource busy The reason is that vhost_vsock_init() returns misc_register() directly without checking its return value, if misc_register() failed, it returns without calling vsock_core_unregister() on vhost_transport, resulting the vhost_vsock can never be installed later. A simple call graph is shown as below: vhost_vsock_init() vsock_core_register() # register vhost_transport misc_register() device_create_with_groups() device_create_groups_vargs() dev = kzalloc(...) # OOM happened # return without unregister vhost_transport Fix by calling vsock_core_unregister() when misc_register() returns error. Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Yuan Can <yuancan@huawei.com> Message-Id: <20221108101705.45981-1-yuancan@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init()ruanjinjie
Inject fault while probing module, if device_register() fails in vdpasim_net_init() or vdpasim_blk_init(), but the refcount of kobject is not decreased to 0, the name allocated in dev_set_name() is leaked. Fix this by calling put_device(), so that name can be freed in callback function kobject_cleanup(). (vdpa_sim_net) unreferenced object 0xffff88807eebc370 (size 16): comm "modprobe", pid 3848, jiffies 4362982860 (age 18.153s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 6e 65 74 00 6b 6b 6b a5 vdpasim_net.kkk. backtrace: [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 [<ffffffff81731d53>] kstrdup+0x33/0x60 [<ffffffff83a5d421>] kobject_set_name_vargs+0x41/0x110 [<ffffffff82d87aab>] dev_set_name+0xab/0xe0 [<ffffffff82d91a23>] device_add+0xe3/0x1a80 [<ffffffffa0270013>] 0xffffffffa0270013 [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 [<ffffffff813739cb>] do_init_module+0x1ab/0x640 [<ffffffff81379d20>] load_module+0x5d00/0x77f0 [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 [<ffffffff83c4d505>] do_syscall_64+0x35/0x80 [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 (vdpa_sim_blk) unreferenced object 0xffff8881070c1250 (size 16): comm "modprobe", pid 6844, jiffies 4364069319 (age 17.572s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 62 6c 6b 00 6b 6b 6b a5 vdpasim_blk.kkk. backtrace: [<ffffffff8174f19e>] __kmalloc_node_track_caller+0x4e/0x150 [<ffffffff81731d53>] kstrdup+0x33/0x60 [<ffffffff83a5d421>] kobject_set_name_vargs+0x41/0x110 [<ffffffff82d87aab>] dev_set_name+0xab/0xe0 [<ffffffff82d91a23>] device_add+0xe3/0x1a80 [<ffffffffa0220013>] 0xffffffffa0220013 [<ffffffff81001c27>] do_one_initcall+0x87/0x2e0 [<ffffffff813739cb>] do_init_module+0x1ab/0x640 [<ffffffff81379d20>] load_module+0x5d00/0x77f0 [<ffffffff8137bc40>] __do_sys_finit_module+0x110/0x1b0 [<ffffffff83c4d505>] do_syscall_64+0x35/0x80 [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 899c4d187f6a ("vdpa_sim_blk: add support for vdpa management tool") Fixes: a3c06ae158dd ("vdpa_sim_net: Add support for user supported devices") Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20221110082348.4105476-1-ruanjinjie@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-28tools: Delete the unneeded semicolon after curly bracesShaomin Deng
Unneeded semicolon after curly braces, so delete it. Signed-off-by: Shaomin Deng <dengshaomin@cdjrlc.com> Message-Id: <20221105155151.12155-1-dengshaomin@cdjrlc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28virtio_pci: modify ENOENT to EINVALAngus Chen
Virtio_crypto use max_data_queues+1 to setup vqs, we use vp_modern_get_num_queues to protect the vq range in setup_vq. We could enter index >= vp_modern_get_num_queues(mdev) in setup_vq if common->num_queues is not set well,and it return -ENOENT. It is better to use -EINVAL instead. Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Message-Id: <20221101111655.1947-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28RDMA/mlx5: remove variable iColin Ian King
Variable i is just being incremented and it's never used anywhere else. The variable and the increment are redundant so remove it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-Id: <20221024133756.2158497-1-colin.i.king@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28virtio_ring: use helper function is_power_of_2()Shaoqin Huang
Use helper function is_power_of_2() to check if num is power of two. Minor readability improvement. Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com> Message-Id: <20221021062734.228881-3-shaoqin.huang@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-12-28virtio_pci: use helper function is_power_of_2()Shaoqin Huang
Use helper function is_power_of_2() to check if num is power of two. Minor readability improvement. Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com> Message-Id: <20221021062734.228881-2-shaoqin.huang@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-12-28vdpa/mlx5: Avoid overwriting CVQ iotlbEli Cohen
When qemu uses different address spaces for data and control virtqueues, the current code would overwrite the control virtqueue iotlb through the dup_iotlb call. Fix this by referring to the address space identifier and the group to asid mapping to determine which mapping needs to be updated. We also move the address space logic from mlx5 net to core directory. Reported-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-6-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-12-28vdpa/mlx5: Avoid using reslock in event_handlerEli Cohen
event_handler runs under atomic context and may not acquire reslock. We can still guarantee that the handler won't be called after suspend by clearing nb_registered, unregistering the handler and flushing the workqueue. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-5-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28vdpa/mlx5: Fix wrong mac address deletionEli Cohen
Delete the old MAC from the table and not the new one which is not there yet. Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28vdpa/mlx5: Return error on vlan ctrl commands if not supportedEli Cohen
Check if VIRTIO_NET_F_CTRL_VLAN is negotiated and return error if control VQ command is received. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-3-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-12-28vdpa/mlx5: Fix rule forwarding VLAN to TIREli Cohen
Set the VLAN id to the header values field instead of overwriting the headers criteria field. Before this fix, VLAN filtering would not really work and tagged packets would be forwarded unfiltered to the TIR. Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20221114131759.57883-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28virtio-blk: use a helper to handle request queuing errorsDmitry Fomichev
Define a new helper function, virtblk_fail_to_queue(), to clean up the error handling code in virtio_queue_rq(). Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Message-Id: <20221016034127.330942-2-dmitry.fomichev@wdc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-28tools/virtio: initialize spinlocks in vring_test.cRicardo Cañuelo
The virtio_device vqs_list spinlocks must be initialized before use to prevent functions that manipulate the device virtualqueues, such as vring_new_virtqueue(), from blocking indefinitely. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Message-Id: <20221012062949.1526176-1-ricardo.canuelo@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
2022-12-28vdpa: merge functionally duplicated dev_features attributesSi-Wei Liu
We can merge VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES with VDPA_ATTR_DEV_FEATURES which is functionally equivalent. While at it, tweak the comment in header file to make user provioned device features distinguished from those supported by the parent mgmtdev device: the former of which can be inherited as a whole from the latter, or can be a subset of the latter if explicitly specified. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1665422823-18364-1-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-27uapi:io_uring.h: allow linux/time_types.h to be skippedStefan Metzmacher
include/uapi/linux/io_uring.h is synced 1:1 into liburing:src/include/liburing/io_uring.h. liburing has a configure check to detect the need for linux/time_types.h. It can opt-out by defining UAPI_LINUX_IO_URING_H_SKIP_LINUX_TIME_TYPES_H Fixes: 78a861b94959 ("io_uring: add sync cancelation API through io_uring_register()") Link: https://github.com/axboe/liburing/issues/708 Link: https://github.com/axboe/liburing/pull/709 Link: https://lore.kernel.org/io-uring/20221115212614.1308132-1-ammar.faizi@intel.com/T/#m9f5dd571cd4f6a5dee84452dbbca3b92ba7a4091 CC: Jens Axboe <axboe@kernel.dk> Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Link: https://lore.kernel.org/r/7071a0a1d751221538b20b63f9160094fc7e06f4.1668630247.git.metze@samba.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-27futex: Fix futex_waitv() hrtimer debug object leak on kcalloc errorMathieu Desnoyers
In a scenario where kcalloc() fails to allocate memory, the futex_waitv system call immediately returns -ENOMEM without invoking destroy_hrtimer_on_stack(). When CONFIG_DEBUG_OBJECTS_TIMERS=y, this results in leaking a timer debug object. Fixes: bf69bad38cf6 ("futex: Implement sys_futex_waitv()") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Cc: stable@vger.kernel.org Cc: stable@vger.kernel.org # v5.16+ Link: https://lore.kernel.org/r/20221214222008.200393-1-mathieu.desnoyers@efficios.com
2022-12-27x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNKMasami Hiramatsu (Google)
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping speculative execution after function return, kprobe jump optimization always fails on the functions with such INT3 inside the function body. (It already checks the INT3 padding between functions, but not inside the function) To avoid this issue, as same as kprobes, check whether the INT3 comes from kgdb or not, and if so, stop decoding and make it fail. The other INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be treated as a one-byte instruction. Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/167146051929.1374301.7419382929328081706.stgit@devnote3
2022-12-27x86/kprobes: Fix kprobes instruction boudary check with CONFIG_RETHUNKMasami Hiramatsu (Google)
Since the CONFIG_RETHUNK and CONFIG_SLS will use INT3 for stopping speculative execution after RET instruction, kprobes always failes to check the probed instruction boundary by decoding the function body if the probed address is after such sequence. (Note that some conditional code blocks will be placed after function return, if compiler decides it is not on the hot path.) This is because kprobes expects kgdb puts the INT3 as a software breakpoint and it will replace the original instruction. But these INT3 are not such purpose, it doesn't need to recover the original instruction. To avoid this issue, kprobes checks whether the INT3 is owned by kgdb or not, and if so, stop decoding and make it fail. The other INT3 will come from CONFIG_RETHUNK/CONFIG_SLS and those can be treated as a one-byte instruction. Fixes: e463a09af2f0 ("x86: Add straight-line-speculation mitigation") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/167146051026.1374301.392728975473572291.stgit@devnote3
2022-12-27x86/calldepth: Fix incorrect init section referencesArnd Bergmann
The addition of callthunks_translate_call_dest means that skip_addr() and patch_dest() can no longer be discarded as part of the __init section freeing: WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> skip_addr (section: .init.text) WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> patch_dest (section: .init.text) WARNING: modpost: vmlinux.o: section mismatch in reference: is_callthunk.cold (section: .text.unlikely) -> skip_addr (section: .init.text) ERROR: modpost: Section mismatches detected. Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them. Fixes: b2e9dfe54be4 ("x86/bpf: Emit call depth accounting if required") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221215164334.968863-1-arnd@kernel.org
2022-12-27perf/core: Call LSM hook after copying perf_event_attrNamhyung Kim
It passes the attr struct to the security_perf_event_open() but it's not initialized yet. Fixes: da97e18458fb ("perf_event: Add support for LSM and SELinux checks") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20221220223140.4020470-1-namhyung@kernel.org
2022-12-27perf: Fix use-after-free in error pathPeter Zijlstra
The syscall error path has a use-after-free; put_pmu_ctx() will reference ctx, therefore we must ensure ctx is destroyed after pmu_ctx is. Fixes: bd2756811766 ("perf: Rewrite core context handling") Reported-by: syzbot+b8e8c01c8ade4fe6e48f@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Chengming Zhou <zhouchengming@bytedance.com> Link: https://lkml.kernel.org/r/Y6B3xEgkbmFUCeni@hirez.programming.kicks-ass.net