rdma: Enable ib_alloc_cq to spread work over a device's comp_vectors

Send and Receive completion is handled on a single CPU selected at the time each Completion Queue is allocated. Typically this is when an initiator instantiates an RDMA transport, or when a target accepts an RDMA connection. Some ULPs cannot open a connection per CPU to spread completion workload across available CPUs and MSI vectors. For such ULPs, provide an API that allows the RDMA core to select a completion vector based on the device's complement of available comp_vecs. ULPs that invoke ib_alloc_cq() with only comp_vector 0 are converted to use the new API so that their completion workloads interfere less with each other. Suggested-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Cc: <linux-cifs@vger.kernel.org> Cc: <v9fs-developer@lists.sourceforge.net> Link: https://lore.kernel.org/r/20190729171923.13428.52555.stgit@manet.1015granger.net Signed-off-by: Doug Ledford <dledford@redhat.com>
author: Chuck Lever <chuck.lever@oracle.com> 2019-07-29 13:22:09 -0400
committer: Doug Ledford <dledford@redhat.com> 2019-08-05 11:50:32 -0400
commit: 20cf4e026730104892fa1268de0371a631cee294 (patch)
tree: 08a7e60c303ff468d50a33d52e2bc98eab9b1b30 /drivers/infiniband/core/cq.c
parent: 31d0e6c149b8c9a9bddc6d68f8600918bb771cb9 (diff)
1 files changed, 28 insertions, 0 deletions
diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index 7c599878ccf7..bbfded6d5d3d 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -253,6 +253,34 @@ out_free_cq:
 EXPORT_SYMBOL(__ib_alloc_cq_user);
 
 /**
+ * __ib_alloc_cq_any - allocate a completion queue
+ * @dev:		device to allocate the CQ for
+ * @private:		driver private data, accessible from cq->cq_context
+ * @nr_cqe:		number of CQEs to allocate
+ * @poll_ctx:		context to poll the CQ from
+ * @caller:		module owner name
+ *
+ * Attempt to spread ULP Completion Queues over each device's interrupt
+ * vectors. A simple best-effort mechanism is used.
+ */
+struct ib_cq *__ib_alloc_cq_any(struct ib_device *dev, void *private,
+				int nr_cqe, enum ib_poll_context poll_ctx,
+				const char *caller)
+{
+	static atomic_t counter;
+	int comp_vector = 0;
+
+	if (dev->num_comp_vectors > 1)
+		comp_vector =
+			atomic_inc_return(&counter) %
+			min_t(int, dev->num_comp_vectors, num_online_cpus());
+
+	return __ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx,
+				  caller, NULL);
+}
+EXPORT_SYMBOL(__ib_alloc_cq_any);
+
+/**
  * ib_free_cq_user - free a completion queue
  * @cq:		completion queue to free.
  * @udata:	User data or NULL for kernel object
author	Chuck Lever <chuck.lever@oracle.com>	2019-07-29 13:22:09 -0400
committer	Doug Ledford <dledford@redhat.com>	2019-08-05 11:50:32 -0400
commit	20cf4e026730104892fa1268de0371a631cee294 (patch)
tree	08a7e60c303ff468d50a33d52e2bc98eab9b1b30 /drivers/infiniband/core/cq.c
parent	31d0e6c149b8c9a9bddc6d68f8600918bb771cb9 (diff)