Age | Commit message (Collapse) | Author |
|
All TID RDMA packets are in KDETH packet format and therefore the
PbcInsertHcrc must be set properly before sending the packet to
hardware. Otherwise, the packets will be dropped by the receiver.
By default, HCRC is not inserted for 9B packets without KDETH, and
this patch adds that back for TID RDMA packets.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch adds the helper functions to build the TID RDMA READ request
on the requester side. The key is to allocate TID resources (TID flow
and TID entries) and send the resource information to the responder side
along with the read request. Since the TID resources are limited, each
TID RDMA READ request has to be split into segments with a default
segment size of 256K. A software flow is allocated to track the data
transaction for each segment. The work request opcode, packet opcode, and
packet formats for TID RDMA READ protocol are also defined in this patch.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch adds the static trace for the flow and TID management
functions to help debugging in the filed.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch adds the counter n_tidwait to count the number of times the
TID resource allocator has to wait for TID resources.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
TID entries are used by hfi1 hardware to receive data payload from
incoming packets directly into a user buffer and thus avoid data copying
by software. This patch implements the functions for TID allocation,
freeing, and programming TID RcvArray entries in hardware for kernel
clients. TID entries are managed via lists of TID groups similar to PSM.
Furthermore, to track TID resource allocation for each request, software
flows are also allocated and freed as needed. Since software flows
consume large amount of memory for tracking TID allocation and freeing,
it is generally desirable to allocate them dynamically in the send queue
and only for TID RDMA requests, but pre-allocate them for receive queue
because the send queue could have thousands of entries while the receive
queue has only a limited number of entries.
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The hfi1 hardware flow is a hardware flow-control mechanism for a KDETH
data packet that is received on a hfi1 port. It validates the packet by
checking both the generation and sequence. Each QP that uses the TID RDMA
mechanism will allocate a hardware flow from its receiving context for
any incoming KDETH data packets.
This patch implements:
(1) a function to allocate hardware flow
(2) a function to free hardware flow
(3) a function to initialize hardware flow generation for a receiving
context
(4) a wait mechanism if the hardware flow is not available
(4) a function to remove the qp from the wait queue for hardware flow
when the qp is reset or destroyed.
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch moves some RC helper functions into a header file so that
they can be called from both RC and TID RDMA functions. In addition,
a common function for rewinding a request is created in rdmavt so that
it can be shared between qib and hfi1 driver.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Move the iwpm kdoc comments from the prototype declarations to above
the function bodies. There are no functional changes in this patch.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Netlink statistics exported by rdma-cm never had any working user space
component published to the mailing list or to any open source
project. Canvassing various proprietary users, and the original requester,
we find that there are no real users of this interface.
This patch simply removes all occurrences of RDMA CM netlink in favour of
modern nldev implementation, which provides the same information and
accompanied by widely used user space component.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Avoid that sparse reports the following for the mlx5 driver:
drivers/infiniband/hw/mlx5/qp.c:2671:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2671:34: left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2671:34: right side has type int
drivers/infiniband/hw/mlx5/qp.c:2679:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2679:34: left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2679:34: right side has type int
drivers/infiniband/hw/mlx5/qp.c:2680:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2680:34: left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2680:34: right side has type int
drivers/infiniband/hw/mlx5/qp.c:2684:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2684:34: left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2684:34: right side has type int
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: incorrect type in argument 1 (different base types)
drivers/infiniband/hw/mlx5/qp.c:2686:28: expected unsigned int [usertype] val
drivers/infiniband/hw/mlx5/qp.c:2686:28: got restricted __be32 [usertype]
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
This patch does not change any functionality.
Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Change "unsoliticed" into "unsolicited".
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Make sure that the next time a response is sent to the initiator that the
credit it had allocated for the aborted request gets freed.
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Fixes: 131e6abc674e ("target: Add TFO->abort_task for aborted task resources release") # v3.15
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Instead of maintaining a list of free I/O contexts, use an sbitmap data
structure to track which I/O contexts are in use and which are free. This
makes the ib_srpt driver more consistent with other LIO drivers.
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If submitting a TMF to the target core fails, send the "FUNCTION REJECTED"
response to the initiator.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If submitting an SRP IU to the target core fails, send the SCSI response
"BUSY" to the initiator instead of not sending any response.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Due to the patch that makes TMF handling synchronous the
write_pending_status() callback function is no longer called. Hence remove
it.
Acked-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Since .scsi_done() must only be called after scsi_queue_rq() has
finished, make sure that the SRP initiator driver does not call
.scsi_done() while scsi_queue_rq() is in progress. Although
invoking sg_reset -d while I/O is in progress works fine with kernel
v4.20 and before, that is not the case with kernel v5.0-rc1. This
patch avoids that the following crash is triggered with kernel
v5.0-rc1:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000138
CPU: 0 PID: 360 Comm: kworker/0:1H Tainted: G B 5.0.0-rc1-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:blk_mq_dispatch_rq_list+0x116/0xb10
Call Trace:
blk_mq_sched_dispatch_requests+0x2f7/0x300
__blk_mq_run_hw_queue+0xd6/0x180
blk_mq_run_work_fn+0x27/0x30
process_one_work+0x4f1/0xa20
worker_thread+0x67/0x5b0
kthread+0x1cf/0x1f0
ret_from_fork+0x24/0x30
Cc: <stable@vger.kernel.org>
Fixes: 94a9174c630c ("IB/srp: reduce lock coverage of command completion")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
A soft iwarp driver that uses the host TCP stack via a kernel mode socket
does not need port mapping. In fact, if the port map daemon, iwpmd, is
running, then iwpmd must not try and create/bind a socket to the actual
port for a soft iwarp connection, since the driver already has that socket
bound.
Yet if the soft iwarp driver wants to interoperate with hard iwarp devices
that -are- using port mapping, then the soft iwarp driver's mappings still
need to be maintained and advertised by the iwpm protocol.
This patch enhances the rdma driver<->iwcm interface to allow an iwarp
driver to specify that it does not want port mapping. The iwpm
kernel<->iwpmd interface is also enhanced to pass up this information on
map requests.
Care is taken to interoperate with the current iwpmd version (ABI version
3) and only use the new NL attributes if iwpmd supports ABI version 4.
The ABI version define has also been created in rdma_netlink.h so both
kernel and user code can share it. The iwcm and iwpmd negotiate the ABI
version to use with a new HELLO netlink message.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In order to add new IWPM_NL attributes, the enums for the IWPM commands
attributes are refactored such that a new attribute can be added without
breaking ABI version 3. Instead of sharing nl attribute enums for both
request and response messages, we create separate enums for each IWPM
message request and reply. This allows us to extend any given IWPM
message by adding new attributes for just that message. These new enums
are created, though, in a way to avoid breaking ABI version 3.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
So future additions to that struct get initialized by default.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
On hi08 chip, There is a possibility of chip hanging when sending doorbell
during reset. We can fix it by prohibiting doorbell during reset.
Fixes: 2d40788825ac ("RDMA/hns: Add support for processing send wr and receive wr")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
On hi08 chip, There is a possibility of chip hanging and some errors when
sending mailbox & doorbell during reset. We can fix it by prohibiting
mailbox and doorbell during reset and reset occurred to ensure that
hardware can work normally.
Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In the reset process, the hns3 NIC driver notifies the RoCE driver to
perform reset related processing by calling the .reset_notify() interface
registered by the RoCE driver in hip08 SoC.
In the current version, if a reset occurs simultaneously during the
execution of rmmod or insmod ko, there may be Oops error as below:
Internal error: Oops: 86000007 [#1] PREEMPT SMP
Modules linked in: hns_roce(O) hns3(O) hclge(O) hnae3(O) [last unloaded: hns_roce_hw_v2]
CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G O 4.19.0-ge00d540 #1
Hardware name: Huawei Technologies Co., Ltd.
Workqueue: events hclge_reset_service_task [hclge]
pstate: 60c00009 (nZCv daif +PAN +UAO)
pc : 0xffff00000100b0b8
lr : 0xffff00000100aea0
sp : ffff000009afbab0
x29: ffff000009afbab0 x28: 0000000000000800
x27: 0000000000007ff0 x26: ffff80002f90c004
x25: 00000000000007ff x24: ffff000008f97000
x23: ffff80003efee0a8 x22: 0000000000001000
x21: ffff80002f917ff0 x20: ffff8000286ea070
x19: 0000000000000800 x18: 0000000000000400
x17: 00000000c4d3225d x16: 00000000000021b8
x15: 0000000000000400 x14: 0000000000000400
x13: 0000000000000000 x12: ffff80003fac6e30
x11: 0000800036303000 x10: 0000000000000001
x9 : 0000000000000000 x8 : ffff80003016d000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000000
x3 : 0000000000000004 x2 : 00000000000007ff
x1 : 0000000000000000 x0 : 0000000000000000
Process kworker/0:1 (pid: 14, stack limit = 0x00000000af8f0ad9)
Call trace:
0xffff00000100b0b8
0xffff00000100b3a0
hns_roce_init+0x624/0xc88 [hns_roce]
0xffff000001002df8
0xffff000001006960
hclge_notify_roce_client+0x74/0xe0 [hclge]
hclge_reset_service_task+0xa58/0xbc0 [hclge]
process_one_work+0x1e4/0x458
worker_thread+0x40/0x450
kthread+0x12c/0x130
ret_from_fork+0x10/0x18
Code: bad PC value
In the reset process, we will release the resources firstly, and after the
hardware reset is completed, we will reapply resources and reconfigure the
hardware.
We can solve this problem by modifying both the NIC and the RoCE
driver. We can modify the concurrent processing in the NIC driver to avoid
calling the .reset_notify and .uninit_instance ops at the same time. And
we need to modify the RoCE driver to record the reset stage and the
driver's init/uninit state, and check the state in the .reset_notify,
.init_instance. and uninit_instance functions to avoid NULL pointer
operation.
Fixes: cb7a94c9c808 ("RDMA/hns: Add reset process for RoCE in hip08")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Currently a packet is marked for loopback only if the source and
destination addresses equals. This is not enough when multiple gids are
present in rxe device's gid table and the traffic is from one gid to
another. Fix it by marking the packet for loopback if the destination MAC
address is equal to the source MAC address.
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Move the function rxe_init_av() to rxe_av.c file and use it instead of
calling rxe_av_from_attr() and rxe_av_fill_ip_info(), also remove the
unused rxe_dev parameter from rxe_init_av().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Fixes the following sparse warnings:
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:5822:5: warning:
symbol 'hns_roce_v2_query_srq' was not declared. Should it be static?
drivers/infiniband/hw/hns/hns_roce_srq.c:158:6: warning:
symbol 'hns_roce_srq_free' was not declared. Should it be static?
drivers/infiniband/hw/hns/hns_roce_srq.c:81:5: warning:
symbol 'hns_roce_srq_alloc' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Linux 5.0-rc5
Needed to merge the include/uapi changes so we have an up to date
single-tree for these files. Patches already posted are also expected to
need this for dependencies.
|
|
Keeping single line wrapper functions is not useful. Hence remove the
ib_sg_dma_address() and ib_sg_dma_len() functions. This patch does not
change any functionality.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Query all per transport caps for XRC and set the appropriate bits in the
per transport field of the advertised struct.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
ODP support in SRQ is per transport capability. Based on device
capabilities set this flag in device structure for future queries.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Add changes to the WQE page-fault handler to
1. Identify that the event is for a SRQ WQE
2. Pass SRQ object instead of a QP to the function that reads the WQE
3. Parse the SRQ WQE with respect to its structure
The rest is handled as for regular RQ WQE.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Reading a WQE from SRQ is almost identical to reading from regular RQ.
The differences are the size of the queue, the size of a WQE and buffer
location.
Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE
from a SRQ or RQ by caller choice.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Skip XRC segment in the beginning of a send WQE and fetch ODP XRC
capabilities when QP type is IB_QPT_XRC_INI. The rest of the handling is
the same as in RC QP.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In the function mlx5_ib_mr_responder_pfault_handler()
1. The parameter wqe is used as read-only so there is no need to pass it
by reference.
2. Remove the unused argument pfault from list of arguments.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
When handling an ODP event for a receive WQE in SRQ the target QP is
unknown. Therefore, it is wrong to ask if QP has a SRQ in the page-fault
handler.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Expose XRC ODP capabilities as part of the extended device capabilities.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
QP and SRQ objects are stored in different containers so the action to get
and lock a common resource during ODP event needs to address that.
While here get rid of 'refcount' and 'free' fields in mlx5_core_srq struct
and use the fields with same semantics in common structure.
Fixes: 032080ab43ac ("IB/mlx5: Lock QP during page fault handling")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'hns_roce_v2_qp_flow_control_init':
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4384:33: warning:
variable 'rst' set but not used [-Wunused-but-set-variable]
It never used since introduction.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This patch adds the static trace to the OPFN code and moves tid related
static trace code into a new header file.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
OPFN parameter negotiation allows a pair of connected RC QPs to exchange
a set of parameters in succession. This negotiation does not commence
till the first ULP request. Because OPFN operations are operations
private to the driver, they do not generate user completions or put the
QP into error when they run out of retries. This patch integrates the
OPFN protocol into the transactions of an RC QP.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The OPFN protocol uses the COMPARE_SWAP request to exchange data
between the requester and the responder and therefore needs to
be stored in the QP's s_ack_queue when the request is received
on the responder side. However, because the user does not know
anything about the OPFN protocol, this extra entry in the
queue cannot be advertised to the user. This patch adds an extra
entry in a QP's s_ack_queue.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
OPFN allows a pair of connected RC QPs to exchange a set of parameters
in succession. The parameter exchange itself is done using the IB compare
and swap request with a special virtual address. The request is triggered
using a reserved IB work request opcode. This patch implements the OPFN
interface to initialize, start, process, and reset the OPFN request.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch adds the OPFN helper functions to initialize, encode, decode,
and reset OPFN parameters for the TID RDMA feature.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
OPFN (Omni Path Feature Negotiation) support discovery allows a RC QP to
announce that it supports OPFN and also discover if OPFN is supported by
the peer QP. OPFN parameter negotiation is skipped unless OPFN support is
first discovered. OPFN support is announced by claiming what was
the reserved bit in dword 1 of OmniPath modified base transport header
in requests and responses.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
As preparation to hide rdma_restrack_root, refactor the code to use the
ops structure instead of a special callback which is hidden in
rdma_restrack_root.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Since we already know if we are user/kernel before calling restrack_add,
move type dependent code into the callers to make the flow more readable.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In the current implementation, we have one restrack root per-device and
all users are simply providing it directly. Let's simplify the interface
and have callers provide the ib_device and internally access the
restrack_root.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The .doit callbacks don't have a netlink_callback to check capabilities so
in order to use the same fill_res_func for both .dump and .doit, we need
to do the capability check outside of those functions.
For .doit callbacks, it is possible to check CAP_NET_ADMIN directly on the
received sk_buff.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The PID namespace is going to be used in the .doit callback, so generalize
its implementation.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
There is no need to manually write same callbacks, automatically generate
them using C-macro language.
This macro is going to be extended to generate doit callbacks too, so use
general name for this macro.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|