create_user_mr() correctly counts the number of null keys needed to
fill the holes in the memory map. However, fill_indir() does not apply
the same logic when capping the range at the 1GB limit. Fill in
additional null keys for the gaps in between, so that the null keys
are populated consistently with that count.
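A minimal standalone sketch of the shared counting/filling rule; the
helper names and the 1GB cap constant are illustrative, not the
driver's actual API:

    #include <stdint.h>

    #define MAX_KLM_SIZE (1ULL << 30)  /* illustrative 1GB per-entry cap */

    struct map { uint64_t start, end; };

    static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

    /* One direct key per map, plus enough null keys to cover every hole
     * before, between and after the maps, each capped at MAX_KLM_SIZE.
     * fill_indir() must emit entries by the same rule create_user_mr()
     * uses to count them. */
    static int count_klm_entries(uint64_t start, uint64_t end,
                                 const struct map *maps, int n)
    {
        int entries = 0;
        uint64_t pos = start;
        int i;

        for (i = 0; i < n; i++) {
            while (pos < maps[i].start) {  /* hole before map i */
                pos += min_u64(maps[i].start - pos, MAX_KLM_SIZE);
                entries++;
            }
            entries++;                     /* the direct key itself */
            pos = maps[i].end;
        }
        while (pos < end) {                /* tail hole */
            pos += min_u64(end - pos, MAX_KLM_SIZE);
            entries++;
        }
        return entries;
    }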
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Cc: stable@vger.kernel.org
Reported-by: Cong Meng <cong.meng@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20250220193732.521462-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
The starting iova address used to iterate over iotlb map entries within
a range was set to an irrelevant value when passed to the itree_next()
iterator. Luckily, this doesn't affect the outcome of finding the
granule of the smallest iotlb map size. Fix the code to make it
consistent with the following for-loop.
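A hedged sketch of the consistency point; vhost_iotlb_itree_first() /
vhost_iotlb_itree_next() are the vhost-iotlb iterators, while the
surrounding variable names are assumptions:

    u64 start = mr->start, last = mr->end - 1;
    u64 granule = U64_MAX;
    struct vhost_iotlb_map *map;

    /* The initial lookup and the iterator must walk the same
     * [start, last] range; the smallest map found is the granule. */
    for (map = vhost_iotlb_itree_first(iotlb, start, last); map;
         map = vhost_iotlb_itree_next(map, start, last)) {
        u64 size = map->last + 1 - map->start;

        if (size < granule)
            granule = size;
    }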
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20241021134040.975221-3-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
When calculating the physical address range based on the iotlb and mr
[start,end) ranges, the offset of mr->start relative to map->start
is not taken into account. This leads to some incorrect and duplicate
mappings.
For the case when mr->start < map->start the code is already correct:
the range in [mr->start, map->start) was handled by a different
iteration.
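A standalone sketch of the corrected arithmetic; the struct and field
names are illustrative:

    #include <stdint.h>

    struct map { uint64_t start, last, addr; };  /* iova range + phys base */

    /* Physical range backing the intersection of [mr_start, mr_end)
     * with one map entry: the offset of the intersection start inside
     * the map must be added to map->addr, otherwise the same physical
     * pages are mapped again from the map's base. */
    static void pa_range(const struct map *map, uint64_t mr_start,
                         uint64_t mr_end, uint64_t *pa_start,
                         uint64_t *pa_end)
    {
        uint64_t s = mr_start > map->start ? mr_start : map->start;
        uint64_t e = mr_end < map->last + 1 ? mr_end : map->last + 1;

        *pa_start = map->addr + (s - map->start);  /* the missing offset */
        *pa_end = map->addr + (e - map->start);
    }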
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Cc: stable@vger.kernel.org
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Currently, when a new MR is set up, the old MR is deleted. MR deletion
takes about 30-40% of the time of MR creation. As deleting the old MR
is not required for setting up the new one, this operation can be
postponed.
This series adds a workqueue that does MR garbage collection at a later
point. If the MR lock is taken, the handler will back off and
reschedule. The exception is during shutdown, when the handler must
not postpone the work.
Note that this is only a speculative optimization: if a mapping
operation is triggered while the garbage collector handler holds the
lock, that operation will have to wait for the handler to finish.
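A hedged sketch of the back-off pattern; the struct fields and the
free_stale_mrs() helper are illustrative, not the series' actual code:

    static void mr_gc_handler(struct work_struct *work)
    {
        struct mr_resources *res = container_of(work, struct mr_resources,
                                                gc_dwork.work);

        /* During shutdown the work must not be postponed; otherwise
         * back off and reschedule if a mapping operation holds the
         * lock. */
        if (res->shutdown) {
            mutex_lock(&res->lock);
        } else if (!mutex_trylock(&res->lock)) {
            queue_delayed_work(res->wq, &res->gc_dwork,
                               msecs_to_jiffies(100));
            return;
        }

        free_stale_mrs(res);  /* illustrative helper */
        mutex_unlock(&res->lock);
    }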
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20240830105838.2666587-9-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
There's currently not much happening during the init/destroy of MR
resources, but more will be added in the upcoming patches.
As the mr mutex init/destroy has been moved into these new functions,
its lifetime has shifted away from
mlx5_vdpa_alloc_resources() / mlx5_vdpa_free_resources().
However, the lifetime at the outer scope remains the same:
mlx5_vdpa_dev_add() / mlx5_vdpa_dev_free()
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20240830105838.2666587-8-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Now that the mr resources have their own namespace in the
struct, give the lock a clearer name.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240830105838.2666587-7-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Group all mapping-related resources into their own structure.
Upcoming patches will add more members to this new structure.
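An illustrative shape for the grouping; the field names and the
MLX5_VDPA_NUM_AS constant are assumptions:

    struct mlx5_vdpa_mr_resources {
        struct mlx5_vdpa_mr *mr[MLX5_VDPA_NUM_AS];  /* one mr per ASID */
        struct mutex lock;                          /* protects the mrs */
        /* upcoming patches add more members here */
    };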
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240830105838.2666587-6-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
A followup patch will use this name for something else.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20240830105838.2666587-5-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Use the async interface to issue MTT MKEY deletion.
This makes destroy_user_mr() on average 8x faster. This number also
depends on the size of the MR being deleted.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240830105838.2666587-4-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Use the async interface to issue MTT MKEY creation.
Extra care is taken when allocating the FW input commands, as the MTT
tables have variable sizes depending on the MR.
The indirect MKEY is still created synchronously at the
end as the direct MKEYs need to be filled in.
This makes create_user_mr() 3-5x faster, depending on
the size of the MR.
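A hedged sketch of the variable-size inbox allocation; MLX5_ST_SZ_BYTES()
and kvzalloc() are existing kernel APIs, the helper itself is
illustrative:

    /* The MTT array is appended to the fixed part of the CREATE_MKEY
     * input, so the inbox of each async command is sized per MR. */
    static void *alloc_create_mkey_in(int nsg, int *in_size)
    {
        *in_size = MLX5_ST_SZ_BYTES(create_mkey_in) +
                   nsg * MLX5_ST_SZ_BYTES(mtt);

        return kvzalloc(*in_size, GFP_KERNEL);
    }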
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20240830105838.2666587-3-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Switch firmware vq query command to be issued via the async API to
allow future parallelization.
For now the command is still serial but the infrastructure is there
to issue commands in parallel, including ratelimiting the number
of issued async commands to firmware.
A later patch will switch to issuing more commands at a time.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Message-Id: <20240816090159.1967650-5-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Introduce a new function, mlx5_vdpa_exec_async_cmds(), which wraps the
mlx5_core async firmware command API in a way that will be used to
parallelize certain operations in this driver.
The wrapper deals with the case when mlx5_cmd_exec_cb() returns
EBUSY due to the command being throttled.
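A hedged sketch of the throttling handling; mlx5_cmd_exec_cb() is the
real mlx5_core API, while the command struct, callback and completion
are assumptions:

    static int issue_async_cmd(struct mlx5_vdpa_dev *mvdev,
                               struct mlx5_vdpa_async_cmd *cmd)
    {
        int err;

        do {
            err = mlx5_cmd_exec_cb(&mvdev->async_ctx,
                                   cmd->in, cmd->in_size,
                                   cmd->out, cmd->out_size,
                                   async_cmd_done, &cmd->cb_work);
            /* -EBUSY means the command got throttled: wait for an
             * in-flight command to complete, then retry. */
            if (err == -EBUSY)
                wait_for_completion(&mvdev->cmd_done);
        } while (err == -EBUSY);

        return err;
    }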
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Message-Id: <20240816090159.1967650-4-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
mlx5_vdpa_err() was missing. This patch adds it and uses it in the
necessary places.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240816090159.1967650-3-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Certain error paths from mlx5_vdpa_dev_add() can end up releasing mr
resources which never got initialized in the first place.
This patch adds the missing check in mlx5_vdpa_destroy_mr_resources()
to block releasing non-initialized mr resources.
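A hedged sketch of the added guard; the flag name is illustrative:

    void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev)
    {
        /* Error paths in mlx5_vdpa_dev_add() can reach this before the
         * mr resources were ever initialized; bail out early. */
        if (!mvdev->mres.initialized)
            return;

        /* ... regular teardown below ... */
    }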
Reference trace:
mlx5_core 0000:08:00.2: mlx5_vdpa_dev_add:3274:(pid 2700) warning: No mac address provisioned?
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 140216067 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 8 PID: 2700 Comm: vdpa Kdump: loaded Not tainted 5.14.0-496.el9.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:vhost_iotlb_del_range+0xf/0xe0 [vhost_iotlb]
Code: [...]
RSP: 0018:ff1c823ac23077f0 EFLAGS: 00010246
RAX: ffffffffc1a21a60 RBX: ffffffff899567a0 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000000 RDI: 0000000000000000
RBP: ff1bda1f7c21e800 R08: 0000000000000000 R09: ff1c823ac2307670
R10: ff1c823ac2307668 R11: ffffffff8a9e7b68 R12: 0000000000000000
R13: 0000000000000000 R14: ff1bda1f43e341a0 R15: 00000000ffffffea
FS: 00007f56eba7c740(0000) GS:ff1bda269f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000104d90001 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? mlx5_vdpa_free+0x3d/0x150 [mlx5_vdpa]
? __die_body.cold+0x8/0xd
? page_fault_oops+0x134/0x170
? __irq_work_queue_local+0x2b/0xc0
? irq_work_queue+0x2c/0x50
? exc_page_fault+0x62/0x150
? asm_exc_page_fault+0x22/0x30
? __pfx_mlx5_vdpa_free+0x10/0x10 [mlx5_vdpa]
? vhost_iotlb_del_range+0xf/0xe0 [vhost_iotlb]
mlx5_vdpa_free+0x3d/0x150 [mlx5_vdpa]
vdpa_release_dev+0x1e/0x50 [vdpa]
device_release+0x31/0x90
kobject_cleanup+0x37/0x130
mlx5_vdpa_dev_add+0x2d2/0x7a0 [mlx5_vdpa]
vdpa_nl_cmd_dev_add_set_doit+0x277/0x4c0 [vdpa]
genl_family_rcv_msg_doit+0xd9/0x130
genl_family_rcv_msg+0x14d/0x220
? __pfx_vdpa_nl_cmd_dev_add_set_doit+0x10/0x10 [vdpa]
? _copy_to_user+0x1a/0x30
? move_addr_to_user+0x4b/0xe0
genl_rcv_msg+0x47/0xa0
? __import_iovec+0x46/0x150
? __pfx_genl_rcv_msg+0x10/0x10
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x245/0x370
netlink_sendmsg+0x206/0x440
__sys_sendto+0x1dc/0x1f0
? do_read_fault+0x10c/0x1d0
? do_pte_missing+0x10d/0x190
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x5c/0xf0
? __count_memcg_events+0x4f/0xb0
? mm_account_fault+0x6c/0x100
? handle_mm_fault+0x116/0x270
? do_user_addr_fault+0x1d6/0x6a0
? do_syscall_64+0x6b/0xf0
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
entry_SYSCALL_64_after_hwframe+0x78/0x80
Fixes: 512c0cdd80c1 ("vdpa/mlx5: Decouple cvq iotlb handling from hw mapping code")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Message-Id: <20240827160808.2448017-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
|
|
Track allocated mrs in a list and show a warning when leaks are
detected on device free or reset.
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231225151203.152687-9-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Deleting the old mr during mr update (.set_map) and then modifying the
vqs with the new mr is not a good flow for firmware. The firmware
expects that mkeys are deleted only after no more vqs reference them.
Introduce reference counting for mrs to fix this. It is the only way to
make sure that mkeys are not in use by vqs.
An mr reference is taken when the mr is associated with the mr asid
table and when the mr is linked to the vq on create/modify. The
reference is released when the mkey is unlinked from the vq (through
modify/destroy) and from the mr asid table.
To make things consistent, get rid of mlx5_vdpa_destroy_mr and use
get/put semantics everywhere.
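A hedged sketch of the get/put semantics; the names are illustrative:

    static void mr_get(struct mlx5_vdpa_mr *mr)
    {
        refcount_inc(&mr->refcount);
    }

    static void mr_put(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr)
    {
        /* The mkey is destroyed only once the mr asid table and every
         * vq linked to the mr have dropped their references. */
        if (refcount_dec_and_test(&mr->refcount))
            destroy_mkey(mvdev, mr);  /* illustrative helper */
    }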
Reviewed-by: Gal Pressman <gal@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231225151203.152687-8-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Since commit 6f5312f80183 ("vdpa/mlx5: Add support for running with
virtio_vdpa"), mlx5_vdpa starts by preallocating a 1:1 DMA MR at device
creation time. This 1:1 DMA MR is implicitly destroyed when the
first .set_map call is invoked, at which point callers like vhost-vdpa
start to set up custom mappings. When the .reset callback is invoked,
the custom mappings are cleared and the 1:1 DMA MR is re-created.
In order to reduce excessive memory mapping cost in live migration, it
is desirable to decouple the vhost-vdpa IOTLB abstraction from the
virtio device life cycle, i.e. mappings can be kept around intact across
virtio device reset. Leverage the .reset_map callback, which is meant to
destroy the regular MR (including cvq mapping) on the given ASID and
recreate the initial DMA mapping. That way, the device .reset op runs
free from having to maintain and clean up memory mappings by itself.
Additionally, implement .compat_reset to cater for older userspace,
which may expect mappings to be cleared during reset.
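A hedged sketch of the .reset_map contract as described above; the
helper names are assumptions:

    static int mlx5_vdpa_reset_map(struct vdpa_device *vdev,
                                   unsigned int asid)
    {
        struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);

        /* Drop the user mappings (including the cvq iotlb) on this
         * ASID and recreate the initial 1:1 DMA mapping, so .reset
         * itself no longer has to touch mappings. */
        destroy_user_mappings(mvdev, asid);
        return create_dma_mr(mvdev, asid);
    }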
Co-developed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <1697880319-4937-7-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
For the following sequence:
- cvq group is in ASID 0
- .set_map(1, cvq_iotlb)
- .set_group_asid(cvq_group, 1)
... the cvq mapping from ASID 0 will be used. This is not always the
correct behaviour.
This patch adds support for the above-mentioned flow by saving the
iotlb on each .set_map and updating the cvq iotlb from it on a cvq
group change.
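A hedged sketch of the group-change path; the constants, fields and
update_cvq_iotlb() helper are assumptions:

    static int set_group_asid(struct mlx5_vdpa_dev *mvdev,
                              u32 group, unsigned int asid)
    {
        if (group != MLX5_VDPA_CVQ_GROUP)
            return -EINVAL;

        mvdev->group2asid[group] = asid;

        /* Re-derive the cvq mapping from the iotlb saved at the last
         * .set_map on the new ASID. */
        return update_cvq_iotlb(mvdev, mvdev->saved_iotlb[asid], asid);
    }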
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-18-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
They will be used in a follow-up patch.
For dup_iotlb, reject the src == dst case, which is an error.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-17-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Introduce the vq descriptor group and mr per ASID. Until now
.set_map on ASID 1 was only updating the cvq iotlb. From now on it also
creates a mkey for it. The current patch doesn't use it but follow-up
patches will add hardware support for mapping the vq descriptors.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-15-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
The current flow for updating an mr works directly on mvdev->mr which
makes it cumbersome to handle multiple new mr structs.
This patch makes the flow more straightforward by having
mlx5_vdpa_create_mr return a new mr which will update the old mr (if
any). The old mr will be deleted and unlinked from mvdev. For the case
when the iotlb is empty (not NULL), the old mr will be cleared.
This change paves the way for adding mrs for different ASIDs.
The initialized bool is no longer needed as mr is now a pointer in the
mlx5_vdpa_dev struct which will be NULL when not initialized.
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-14-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
The mutex is named like it is supposed to protect only the mkey but in
reality it is a global lock for all mr resources.
Shift the mutex to its rightful location (struct mlx5_vdpa_dev) and
give it a more appropriate name.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231018171456.1624030-13-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
This patch adapts the mr creation/deletion code to be able to work with
any given mr struct pointer. All the APIs are adapted to take an extra
parameter for the mr.
mlx5_vdpa_create/delete_mr doesn't need an ASID parameter anymore. The
check is done in the caller instead (mlx5_set_map).
This change is needed for a followup patch which will introduce an
additional mr for the vq descriptor data.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231018171456.1624030-12-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Make mlx5_destroy_mr symmetric to mlx5_create_mr.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-11-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Now that the cvq code is out of mlx5_vdpa_create/destroy_mr, the "dvq"
functions can be folded into their callers.
Having "dvq" in the naming will no longer be accurate in the downstream
patches.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-10-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
The reslock is taken while refresh is called but iommu_lock is more
specific to this resource. So take the iommu_lock during cvq iotlb
refresh.
Based on Eugenio's patch [0].
[0] https://lore.kernel.org/lkml/20230112142218.725622-4-eperezma@redhat.com/
Acked-by: Jason Wang <jasowang@redhat.com>
Suggested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-9-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
The handling of the cvq iotlb is currently coupled with the creation
and destruction of the hardware mkeys (mr).
This patch moves cvq iotlb handling into its own function and shifts it
to a scope that is not related to mr handling. As cvq handling is just a
prune_iotlb + dup_iotlb cycle, put it all in the same "update" function.
Finally, the destruction path is handled by directly pruning the iotlb.
After this move is done the ASID mr code can be collapsed into a single
function.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-8-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Necessary for upcoming cvq separation from mr allocation.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20231018171456.1624030-7-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>
|
|
Commit 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
declared but never implemented these.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Message-Id: <20230803143041.23388-1-yuehaibing@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
mlx5_vdpa_destroy_mr can be called from .set_map with data ASID after
the control virtqueue ASID iotlb has been populated. The control vq
iotlb must not be cleared, since it will not be populated again.
So call the ASID aware destroy function which makes sure that the
right vq resource is destroyed.
Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Message-Id: <20230802171231.11001-5-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
The mr->initialized flag is shared between the control vq and data vq
part of the mr init/uninit. But if the control vq and data vq get placed
in different ASIDs, it can happen that initializing the control vq will
prevent the data vq mr from being initialized.
This patch consolidates the control and data vq init parts into their
own init functions. The mr->initialized will now be used for the data vq
only. The control vq currently doesn't need a flag.
The uninitialization path is also taken care of: mlx5_vdpa_destroy_mr
is split into data and control vq functions, which are now also ASID
aware.
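A hedged sketch of the ASID-aware split; the names are illustrative:

    static int create_mr(struct mlx5_vdpa_dev *mvdev,
                         struct vhost_iotlb *iotlb, unsigned int asid)
    {
        int err = 0;

        /* Each part only initializes when its group maps to the given
         * ASID, so the two no longer share one flag. */
        if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid)
            err = create_dvq_mr(mvdev, iotlb);  /* sets mr->initialized */
        if (!err && mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid)
            err = create_cvq_mr(mvdev, iotlb);  /* no flag needed */

        return err;
    }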
Fixes: 8fcd20c30704 ("vdpa/mlx5: Support different address spaces for control and data")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Message-Id: <20230802171231.11001-3-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Otherwise the virtqueue object to instate could point to an invalid
address that was unmapped from the MTT:
mlx5_core 0000:41:04.2: mlx5_cmd_out_err:782:(pid 8321):
CREATE_GENERAL_OBJECT(0xa00) op_mod(0xd) failed, status
bad parameter(0x3), syndrome (0x5fa1c), err(-22)
Fixes: cae15c2ed8e6 ("vdpa/mlx5: Implement susupend virtqueue callback")
Cc: Eli Cohen <elic@nvidia.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Message-Id: <1676424640-11673-1-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Initialize the iotlb spinlock.
Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230206122016.1149373-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Clearing the mr struct erases the lock owner and causes warnings to be
emitted. It is not required to clear the mr so remove the memset call.
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230206121956.1149356-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When creating a memory key, the key value should be assigned to the
passed pointer, not OR'ed into it.
No functional issue was observed due to this bug.
Fixes: 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20230205072906.1108194-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Variable i is only ever incremented and is never used anywhere else.
The variable and the increment are redundant, so remove them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Message-Id: <20221024133756.2158497-1-colin.i.king@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When qemu uses different address spaces for data and control virtqueues,
the current code would overwrite the control virtqueue iotlb through the
dup_iotlb call. Fix this by referring to the address space identifier
and the group to asid mapping to determine which mapping needs to be
updated. We also move the address space logic from mlx5 net to core
directory.
Reported-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20221114131759.57883-6-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
|
|
Partition virtqueues to two different address spaces: one for control
virtqueue which is implemented in software, and one for data virtqueues.
Based-on: <20220526124338.36247-1-eperezma@redhat.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220714113927.85729-3-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Implement the get_vq_stats callback of vdpa_config_ops to return the
statistics for a virtqueue.
The statistics are provided as vendor specific statistics where the
driver provides a pair of attribute name and attribute value.
Currently supported are received descriptors and completed descriptors.
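A hedged sketch of emitting one name/value pair over netlink; the
VDPA_ATTR_* names follow the vdpa uapi, the helper itself is
illustrative:

    static int push_stat(struct sk_buff *msg, const char *name, u64 val)
    {
        if (nla_put_string(msg, VDPA_ATTR_DEV_VENDOR_ATTR_NAME, name))
            return -EMSGSIZE;
        if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_VENDOR_ATTR_VALUE,
                              val, VDPA_ATTR_PAD))
            return -EMSGSIZE;
        return 0;
    }

The callback would then call it once per counter, e.g. for the received
and completed descriptor counts.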
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220518133804.1075129-6-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Pull virtio updates from Michael Tsirkin:
"vhost and virtio fixes and features:
- Hardening work by Jason
- vdpa driver for Alibaba ENI
- Performance tweaks for virtio blk
- virtio rng rework using an internal buffer
- mac/mtu programming for mlx5 vdpa
- Misc fixes, cleanups"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits)
vdpa/mlx5: Forward only packets with allowed MAC address
vdpa/mlx5: Support configuration of MAC
vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
vdpa_sim_net: Enable user to set mac address and mtu
vdpa: Enable user to set mac and mtu of vdpa device
vdpa: Use kernel coding style for structure comments
vdpa: Introduce query of device config layout
vdpa: Introduce and use vdpa device get, set config helpers
virtio-scsi: don't let virtio core to validate used buffer length
virtio-blk: don't let virtio core to validate used length
virtio-net: don't let virtio core to validate used length
virtio_ring: validate used buffer length
virtio_blk: correct types for status handling
virtio_blk: allow 0 as num_request_queues
i2c: virtio: Add support for zero-length requests
virtio-blk: fixup coccinelle warnings
virtio_ring: fix typos in vring_desc_extra
virtio-pci: harden INTX interrupts
virtio_pci: harden MSI-X interrupts
virtio_config: introduce a new .enable_cbs method
...
|
|
A subsequent patch will use the same workqueue for executing other
work not related to the control VQ. Rename the workqueue and the work
queue entry used to convey information to the workqueue.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210909123635.30884-3-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
In mlx5_core and vdpa there is no use of mlx5_core_mkey members except
for the key itself.
As preparation for moving mlx5_core_mkey to mlx5_ib, the occurrences of
struct mlx5_core_mkey in all modules except for mlx5_ib are replaced by
a u32 key.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
There is no read of mkey->pd, only write. Remove it.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
mkey->size is already stored in ibmr->length, no need to store it here.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
iova is already stored in ibmr->iova, no need to store it here.
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Multiqueue support requires additional virtio_net_q objects to be added
or removed per the configured number of queue pairs. In addition, the
RQ table needs to be modified to match the number of configured receive
queues, so that packets are dispatched to the right virtqueue according
to the hash result.
Note: qemu v6.0.0 is broken when the device requests more than two data
queues; no net device will be created for the vdpa device. To avoid
this, one should specify mq=off to qemu. In this case it will end up
with a single queue.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210823052123.14909-7-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add support to handle control virtqueue configurations per virtio
specification. The control virtqueue is implemented in software and no
hardware offloading is involved.
Control VQ configuration needs task context, therefore all
configurations are handled in a workqueue created for this purpose.
Modifications are made to the memory registration code to allow saving
a copy of the iotlb, to be used by the control VQ to access the vring.
The max number of data virtqueues supported by the driver has been
updated to 2, since multiqueue is not supported at this stage and we
need to ensure a consistent mapping of VQ indices to either data or
control VQ.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210823052123.14909-6-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Following patches add control virtqueue and multiqueue support. We want
to verify that the index value passed to callbacks referencing a
virtqueue is valid.
The logic defining valid indices is as follows:
CVQ clear: 0 and 1.
CVQ set, MQ clear: 0, 1 and 2
CVQ set, MQ set: 0..nvq, where nvq is whatever was provided to
_vdpa_register_device()
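A hedged sketch of that rule; the field names are assumptions:

    static bool is_index_valid(struct mlx5_vdpa_dev *mvdev, u16 idx)
    {
        if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)))
            return idx <= 1;          /* CVQ clear: 0 and 1 */

        if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
            return idx <= 2;          /* CVQ set, MQ clear: 0, 1 and 2 */

        /* CVQ set, MQ set: 0..nvq as given to _vdpa_register_device() */
        return idx <= mvdev->nvq;
    }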
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210823052123.14909-5-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
linux/if_vlan.h is not required.
Remove it.
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210823052123.14909-2-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The current code treats an empty iotlb provided in set_map() as a
special case and destroys the memory region object. This must not be
done, since the virtqueue objects reference this MR. Doing so causes
driver unload to emit errors and log timeouts caused by the firmware
complaining about busy resources.
This patch treats an empty iotlb as any other change of mapping. In
this case, mlx5_vdpa_create_mr() will fail, causing the entire
set_map() call to fail.
This issue has not been encountered before but was seen to occur in a
non-official version of qemu. Since qemu is a userspace program, the
driver must protect against such cases.
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210811053713.66658-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|