summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2019-04-01IB: Pass uverbs_attr_bundle down ib_x destroy pathShamir Rabinovitch
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: Pass uverbs_attr_bundle down uobject destroy pathShamir Rabinovitch
Pass uverbs_attr_bundle down the uobject destroy path. The next patch will use this to eliminate the dependecy of the drivers in ib_x->uobject pointers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: ucontext should be set properly for all cmd & ioctl pathsShamir Rabinovitch
the Attempt to use the below commit to initialize the ucontext for the uobject destroy path has shown that the below commit is incomplete. Parts were reverted and the ucontext set up in the uverbs_attr_bundle was moved to rdma_lookup_get_uobject which is called from the uobj_get_XXX macros and rdma_alloc_begin_uobject which is called when uobject is created. Fixes: 3d9dfd060391 ("IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows") Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01opa_vnic: Convert vport_idr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01qib: Convert qib_unit_table to XArrayMatthew Wilcox
Also remove qib_devs_list. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01hfi1: Convert hfi1_unit_table to XArrayMatthew Wilcox
Also remove hfi1_devs_list. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29qedr: Convert srqidr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29qedr: Convert qpidr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29ipv6: Move ipv6 stubs to a separate header fileDavid Ahern
The number of stubs is growing and has nothing to do with addrconf. Move the definition of the stubs to a separate header file and update users. In the move, drop the vxlan specific comment before ipv6_stub. Code move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-29hfi1: Convert vesw_idr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29RDMA/hns: Convert qp_table_tree to XArrayMatthew Wilcox
Also fully initialise the qp before storing it in the XArray. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29RDMA/hns: Convert cq_table to XArrayMatthew Wilcox
Change the locking to not disable interrupts as the lookup in interrupt context will not see a freed CQ, thanks to the synchronize_irq() call before freeing the CQ. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Add command to set ib_core device net namspace sharing modeParav Pandit
Add netlink command that enables/disables sharing rdma device among multiple net namespaces. Using rdma tool, $rdma sys set netns shared (default mode) When rdma subsystem netns mode is set to shared mode, rdma devices will be accessible in all net namespaces. Using rdma tool, $rdma sys set netns exclusive When rdma subsystem netns mode is set to exclusive mode, devices will be accessible in only one net namespace at any given point of time. If there are any net namespaces other than default init_net exists, while executing this command, it will fail and mode cannot be changed. To change this mode, netlink command is used instead of sysctl, because netlink command allows to auto load a module. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Add interface to read device namespace sharing modeParav Pandit
Add an interface via netlink command to query whether rdma devices are shared among multiple net namespaces or not. When using RDMAtool, it can be queried as, $rdma system show netns netns shared Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Extend ib_device_get_by_index for net namespaceParav Pandit
Extend ib_device_get_by_index() API to check device access for net namespace for serving netlink commands. Also enforce net ns check on dumpit commands which iterate over all registered rdma devices and which don't call ib_device_get_by_index(). Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA: Check net namespace access for uverbs, umad, cma and nldevParav Pandit
Introduce an API rdma_dev_access_netns() to check whether a rdma device can be accessed from the specified net namespace or not. Use rdma_dev_access_netns() while opening character uverbs, umad network device and also check while rdma cm_id binds to rdma device. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Add module param to disable device sharing among net nsParav Pandit
Add module parameter to change a sharing mode of ib_core early in the boot process. This parameter helps to those systems where modern up to date rdma tool (iproute2) package may not be available during kernel upgrade cycle. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Support core port attributes in non init_netParav Pandit
Now that sysfs compatibility layer for non init_net exists, add core port attributes such as pkey and gid table to non init_net ns. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Implement compat device/sysfs tree in net namespaceParav Pandit
Implement compatibility layer sysfs entries of ib_core so that non init_net net namespaces can also discover rdma devices. Each non init_net net namespace has ib_core_device created in it. Such ib_core_device sysfs tree resembles rdma devices found in init_net namespace. This allows discovering rdma devices in multiple non init_net net namespaces via sysfs entries and helpful to rdma-core userspace. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Restrict sysfs entries view to init_netParav Pandit
This is a preparation patch to provide isolation of rdma device in a network namespace. As first step, make rdma device visible only in init net namespace. Subsequent patch will enable rdma device visibility back in multiple net namespaces using compat ib_core_device device/sysfs tree. Given that the IB subsystem depends on net stack, it needs to be initialized after netdev and since it support devices, it needs to be initialized before the device subsystem; therefore, change initcall sequence to fs_initcall, so that when ib_core is compiled in the kernel image, the right init sequence is followed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/core: Introduce ib_core_device to hold deviceParav Pandit
In order to support sysfs entries in multiple net namespaces for a rdma device, introduce a ib_core_device whose scope is limited to hold core device and per port sysfs related entries. This is preparation patch so that multiple ib_core_devices in each net namespace can be created in subsequent patch who all can share ib_device. (a) Move sysfs specific fields to ib_core_device. (b) Make sysfs and device life cycle related routines to work on ib_core_device. (c) Introduce and use rdma_init_coredev() helper to initialize coredev fields. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/rdmavt: Use correct sizing on buffers holding page DMA addressesShiraz Saleem
The buffer that holds the page DMA addresses is sized off umem->nmap. This can potentially cause out of bound accesses on the PBL array when iterating the umem DMA-mapped SGL. This is because if umem pages are combined, umem->nmap can be much lower than the number of system pages in umem. Use ib_umem_num_pages() to size this buffer. Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/rxe: Use correct sizing on buffers holding page DMA addressesShiraz Saleem
The buffer that holds the page DMA addresses is sized off umem->nmap. This can potentially cause out of bound accesses on the PBL array when iterating the umem DMA-mapped SGL. This is because if umem pages are combined, umem->nmap can be much lower than the number of system pages in umem. Use ib_umem_num_pages() to size this buffer. Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/mthca: Use correct sizing on buffers holding page DMA addressesShiraz Saleem
The buffer that holds the page DMA addresses is sized off umem->nmap. This can potentially cause out of bound accesses on the PBL array when iterating the umem DMA-mapped SGL. This is because if umem pages are combined, umem->nmap can be much lower than the number of system pages in umem. Use ib_umem_num_pages() to size this buffer. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/cxbg: Use correct sizing on buffers holding page DMA addressesShiraz Saleem
The PBL array that hold the page DMA address is sized off umem->nmap. This can potentially cause out of bound accesses on the PBL array when iterating the umem DMA-mapped SGL. This is because if umem pages are combined, umem->nmap can be much lower than the number of system pages in umem. Use ib_umem_num_pages() to size this array. Cc: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/bnxt_re: Use correct sizing on buffers holding page DMA addressesSelvin Xavier
umem->nmap is used while allocating internal buffer for storing page DMA addresses. This causes out of bounds array access while iterating the umem DMA-mapped SGL with umem page combining as umem->nmap can be less than number of system pages in umem. Use ib_umem_num_pages() instead of umem->nmap to size the page array. Add a new structure (bnxt_qplib_sg_info) to pass sglist, npages and nmap. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28IB/qib: Remove a set-but-not-used variableBart Van Assche
This patch avoids that a compiler warning is reported when building with W=1. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Fixes: 49c0e2414b20 ("IB/qib: Change SDMA progression mode depending on single- or multi-rail") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28IB/hfi1: Fix two format stringsBart Van Assche
Enable format string checking for hfi1_cdbg() and fix the resulting compiler warnings. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28IB/mlx5: Declare devx_async_cmd_event_fops staticBart Van Assche
Avoid that sparse complains about a missing declaration. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Fixes: 6bf8f22aea0d ("IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/uverbs: Allow the compiler to verify declaration and definition consistencyBart Van Assche
This patch avoids that sparse reports the following warnings: drivers/infiniband/core/uverbs_std_types_flow_action.c:442:30: warning: symbol 'uverbs_def_obj_flow_action' was not declared. Should it be static? drivers/infiniband/core/uverbs_std_types_dm.c:112:30: warning: symbol 'uverbs_def_obj_dm' was not declared. Should it be static? drivers/infiniband/core/uverbs_std_types_counters.c:153:30: warning: symbol 'uverbs_def_obj_counters' was not declared. Should it be static? drivers/infiniband/core/uverbs_std_types_mr.c:213:30: warning: symbol 'uverbs_def_obj_mr' was not declared. Should it be static? Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Fixes: 0bd01f3d0907 ("RDMA/uverbs: Require all objects to have a driver destroy function") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/uverbs: Annotate uverbs_request_next_ptr() return value as a __user pointerBart Van Assche
This patch avoids that sparse complains about a mismatch between the returned value and the function return type. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Fixes: c3bea3d2dc53 ("RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28RDMA/uverbs: Add a __user annotation to a pointerBart Van Assche
This patch avoids that sparse and smatch report the following: warning: cast removes address space of expression Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Fixes: 3a6532c9af1a ("RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2019-03-27IB/rxe: Replace av->network_type with skb->protocolZhu Yanjun
In the function rxe_init_packet, based on av->network_type, skb->protocol is set to ipv4 or ipv6. The functions rxe_prepare and rxe_send are called after the functin rxe_init_packet. So in these functions, av->network_type can be replaced with skb->protocol. The functions are in the xmit fast path. So with skb->protocol, the performance will be better. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/MAD: Add SMP details to MAD tracingIra Weiny
Decode more information from the packet and include it in the trace. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/UMAD: Add umad trace pointsIra Weiny
Trace MADs going to/from user space. Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/MAD: Add agent trace pointsIra Weiny
Trace agent details when agents are [un]registered. In addition, report agent details on send/recv. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/MAD: Add recv path trace pointIra Weiny
Trace received MAD details. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/MAD: Add send path trace pointsIra Weiny
Use the standard Linux trace mechanism to trace MADs being sent. 4 trace points are added, when the MAD is posted to the qp, when the MAD is completed, if a MAD is resent, and when the MAD completes in error. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27RDMA/vmw_pvrdma: Skip zeroing device attrsYuval Shaia
Caller already clears props before calling query_device. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/mlx5: Compare only index part of a memory window rkeyArtemy Kovalyov
The InfiniBand Architecture Specification section 10.6.7.2.4 TYPE 2 MEMORY WINDOWS says that if the CI supports the Base Memory Management Extensions defined in this specification, the R_Key format for a Type 2 Memory Window must consist of: * 24 bit index in the most significant bits of the R_Key, which is owned by the CI, and * 8 bit key in the least significant bits of the R_Key, which is owned by the Consumer. This means that the kernel should compare only the index part of a R_Key to determine equality with another R_Key. Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW") Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/mlx5: WQE dump jumps over first 16 bytesArtemy Kovalyov
Move index increment after its is used or otherwise it will start the dump of the WQE from second WQE BB. Fixes: 34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users") Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/mlx5: Reset access mask when looping inside page fault handlerMoni Shoua
If page-fault handler spans multiple MRs then the access mask needs to be reset before each MR handling or otherwise write access will be granted to mapped pages instead of read-only. Cc: <stable@vger.kernel.org> # 3.19 Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults") Reported-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27RDMA/hns: Limit scope of hns_roce_cmq_send()Leon Romanovsky
The forgotten static keyword causes to the following error to appear while building HNS driver. Declare hns_roce_cmq_send() to be static function to fix this warning. drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1089:5: warning: no previous prototype for _hns_roce_cmq_send_ [-Wmissing-prototypes] int hns_roce_cmq_send(struct hns_roce_dev *hr_dev, Fixes: 6a04aed6afae ("RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during reset") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/hfi1: Fix the allocation of RSM tableKaike Wan
The receive side mapping (RSM) on hfi1 hardware is a special matching mechanism to direct an incoming packet to a given hardware receive context. It has 4 instances of matching capabilities (RSM0 - RSM3) that share the same RSM table (RMT). The RMT has a total of 256 entries, each of which points to a receive context. Currently, three instances of RSM have been used: 1. RSM0 by QOS; 2. RSM1 by PSM FECN; 3. RSM2 by VNIC. Each RSM instance should reserve enough entries in RMT to function properly. Since both PSM and VNIC could allocate any receive context between dd->first_dyn_alloc_ctxt and dd->num_rcv_contexts, PSM FECN must reserve enough RMT entries to cover the entire receive context index range (dd->num_rcv_contexts - dd->first_dyn_alloc_ctxt) instead of only the user receive contexts allocated for PSM (dd->num_user_contexts). Consequently, the sizing of dd->num_user_contexts in set_up_context_variables is incorrect. Fixes: 2280740f01ae ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/hfi1: Eliminate opcode tests on mr derefKaike Wan
When an old ack_queue entry is used to store an incoming request, it may need to clean up the old entry if it is still referencing the MR. Originally only RDMA READ request needed to reference MR on the responder side and therefore the opcode was tested when cleaning up the old entry. The introduction of tid rdma specific operations in the ack_queue makes the specific opcode tests wrong. Multiple opcodes (RDMA READ, TID RDMA READ, and TID RDMA WRITE) may need MR ref cleanup. Remove the opcode specific tests associated with the ack_queue. Fixes: f48ad614c100 ("IB/hfi1: Move driver out of staging") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/hfi1: Clear the IOWAIT pending bits when QP is put into error stateKaike Wan
When a QP is put into error state, it may be waiting for send engine resources. In this case, the QP will be removed from the send engine's waiting list, but its IOWAIT pending bits are not cleared. This will normally not have any major impact as the QP is being destroyed. However, the QP still needs to wind down its operations, such as draining the send queue by scheduling the send engine. Clearing the pending bits will avoid any potential complications. In addition, if the QP will eventually hang, clearing the pending bits can help debugging by presenting a consistent picture if the user dumps the qp_stats. This patch clears a QP's IOWAIT_PENDING_IB and IO_PENDING_TID bits in priv->s_iowait.flags in this case. Fixes: 5da0fc9dbf89 ("IB/hfi1: Prepare resource waits for dual leg") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/hfi1: Failed to drain send queue when QP is put into error stateKaike Wan
When a QP is put into error state, all pending requests in the send work queue should be drained. The following sequence of events could lead to a failure, causing a request to hang: (1) The QP builds a packet and tries to send through SDMA engine. However, PIO engine is still busy. Consequently, this packet is put on the QP's tx list and the QP is put on the PIO waiting list. The field qp->s_flags is set with HFI1_S_WAIT_PIO_DRAIN; (2) The QP is put into error state by the user application and notify_error_qp() is called, which removes the QP from the PIO waiting list and the packet from the QP's tx list. In addition, qp->s_flags is cleared of RVT_S_ANY_WAIT_IO bits, which does not include HFI1_S_WAIT_PIO_DRAIN bit; (3) The hfi1_schdule_send() function is called to drain the QP's send queue. Subsequently, hfi1_do_send() is called. Since the flag bit HFI1_S_WAIT_PIO_DRAIN is set in qp->s_flags, hfi1_send_ok() fails. As a result, hfi1_do_send() bails out without draining any request from the send queue; (4) The PIO engine completes the sending and tries to wake up any QP on its waiting list. But the QP has been removed from the PIO waiting list and therefore is kept in sleep forever. The fix is to clear qp->s_flags of HFI1_S_ANY_WAIT_IO bits in step (2). HFI1_S_ANY_WAIT_IO includes RVT_S_ANY_WAIT_IO and HFI1_S_WAIT_PIO_DRAIN. Fixes: 2e2ba09e48b7 ("IB/rdmavt, IB/hfi1: Create device dependent s_flags") Cc: <stable@vger.kernel.org> # 4.19.x+ Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/iser: remove uninitialized variable lenColin Ian King
The variable len is not being inintialized and the uninitialized value is being returned. However, this return path is never reached because the default case in the switch statement returns -ENOSYS. Clean up the code by replacing the return -ENOSYS with a break for the default case and returning -ENOSYS at the end of the function. This allows len to be removed. Also remove redundant break that follows a return statement. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27RDMA/i40iw: Handle workqueue allocation failureKangjie Lu
alloc_ordered_workqueue may fail and return NULL. The fix captures the failure and handles it properly to avoid potential NULL pointer dereferences. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Reviewed-by: Shiraz, Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>