summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2023-08-03IB/hfi1: Fix possible panic during hotplug removeDouglas Miller
During hotplug remove it is possible that the update counters work might be pending, and may run after memory has been freed. Cancel the update counters work before freeing memory. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Link: https://lore.kernel.org/r/169099756100.3927190.15284930454106475280.stgit@awfm-02.cornelisnetworks.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-08-03RDMA/irdma: Replace one-element array with flexible-array memberGustavo A. R. Silva
One-element and zero-length arrays are deprecated. So, replace one-element array in struct irdma_qvlist_info with flexible-array member. A patch for this was sent a while ago[1]. However, it seems that, at the time, the changes were partially folded[2][3], and the actual flexible-array transformation was omitted. This patch fixes that. The only binary difference seen before/after changes is shown below: | drivers/infiniband/hw/irdma/hw.o | @@ -868,7 +868,7 @@ | drivers/infiniband/hw/irdma/hw.c:484 (discriminator 2) | size += struct_size(iw_qvlist, qv_info, rf->msix_count); | 55b: imul $0x45c,%rdi,%rdi |- 562: add $0x10,%rdi |+ 562: add $0x4,%rdi which is, of course, expected as it reflects the mistake made while folding the patch I've mentioned above. Worth mentioning is the fact that with this change we save 12 bytes of memory, as can be inferred from the diff snapshot above. Notice that: $ pahole -C rdma_qv_info idrivers/infiniband/hw/irdma/hw.o struct irdma_qv_info { u32 v_idx; /* 0 4 */ u16 ceq_idx; /* 4 2 */ u16 aeq_idx; /* 6 2 */ u8 itr_idx; /* 8 1 */ /* size: 12, cachelines: 1, members: 4 */ /* padding: 3 */ /* last cacheline: 12 bytes */ }; Link: https://lore.kernel.org/linux-hardening/20210525230038.GA175516@embeddedor/ [1] Link: https://lore.kernel.org/linux-hardening/bf46b428deef4e9e89b0ea1704b1f0e5@intel.com/ [2] Link: https://lore.kernel.org/linux-rdma/20210520143809.819-1-shiraz.saleem@intel.com/T/#u [3] Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZMpsQrZadBaJGkt4@work Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31RDMA/hns: Remove unused function declarationsYue Haibing
commit b16f8188472e ("RDMA/hns: Refactor eq code for hip06") left behind hns_roce_cleanup_eq_table(). commit 773f841ab1ae ("RDMA/hns: Avoid filling sgid index when modifying QP to RTR") leave hns_get_gid_index() unused. Remove both. Link: https://lore.kernel.org/r/20230731135916.32392-1-yuehaibing@huawei.com Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-31RDMA/rxe: Fix incomplete state save in rxe_requesterBob Pearson
If a send packet is dropped by the IP layer in rxe_requester() the call to rxe_xmit_packet() can fail with err == -EAGAIN. To recover, the state of the wqe is restored to the state before the packet was sent so it can be resent. However, the routines that save and restore the state miss a significnt part of the variable state in the wqe, the dma struct which is used to process through the sge table. And, the state is not saved before the packet is built which modifies the dma struct. Under heavy stress testing with many QPs on a fast node sending large messages to a slow node dropped packets are observed and the resent packets are corrupted because the dma struct was not restored. This patch fixes this behavior and allows the test cases to succeed. Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer") Link: https://lore.kernel.org/r/20230721200748.4604-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-31RDMA/rxe: Fix rxe_modify_srqBob Pearson
This patch corrects an error in rxe_modify_srq where if the caller changes the srq size the actual new value is not returned to the caller since it may be larger than what is requested. Additionally it open codes the subroutine rcv_wqe_size() which adds very little value, and makes some whitespace changes. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20230620140142.9452-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-31RDMA/rxe: Fix unsafe drain work queue codeBob Pearson
If create_qp does not fully succeed it is possible for qp cleanup code to attempt to drain the send or recv work queues before the queues have been created causing a seg fault. This patch checks to see if the queues exist before attempting to drain them. Link: https://lore.kernel.org/r/20230620135519.9365-3-rpearsonhpe@gmail.com Reported-by: syzbot+2da1965168e7dbcba136@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-rdma/00000000000012d89205fe7cfe00@google.com/raw Fixes: 49dc9c1f0c7e ("RDMA/rxe: Cleanup reset state handling in rxe_resp.c") Fixes: fbdeb828a21f ("RDMA/rxe: Cleanup error state handling in rxe_comp.c") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-31RDMA/rxe: Move work queue code to subroutinesBob Pearson
This patch: - Moves code to initialize a qp send work queue to a subroutine named rxe_init_sq. - Moves code to initialize a qp recv work queue to a subroutine named rxe_init_rq. - Moves initialization of qp request and response packet queues ahead of work queue initialization so that cleanup of a qp if it is not fully completed can successfully attempt to drain the packet queues without a seg fault. - Makes minor whitespace cleanups. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20230620135519.9365-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-31RDMA: Remove unnecessary ternary operatorsRuan Jinjie
There are a little ternary operators, the true or false judgment of which is unnecessary in C language semantics. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230731085118.394443-1-ruanjinjie@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31IB/mlx5: Add HW counter called rx_dct_connectShetu Ayalew
The rx_dct_connect counter shows the number of received connection requests for the associated DCTs. Signed-off-by: Shetu Ayalew <shetu@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/01cd24cd7f591734741309921fdc01fc770d84a8.1690121941.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-07-31RDMA/umem: Set iova in ODP flowMichael Guralnik
Fixing the ODP registration flow to set the iova correctly. The calculation in ib_umem_num_dma_blocks() function assumes the iova of the umem is set correctly. When iova is not set, the calculation in ib_umem_num_dma_blocks() is equivalent to length/page_size, which is true only when memory is aligned. For unaligned memory, iova must be set for the ALIGN() in the ib_umem_num_dma_blocks() to take effect and return a correct value. mlx5_ib uses ib_umem_num_dma_blocks() to decide the mkey size to use for the MR. Without this fix, when registering unaligned ODP MR, a wrong size mkey might be chosen and this might cause the UMR to fail. UMR would fail over insufficient size to update the mkey translation: infiniband mlx5_0: dump_cqe:273:(pid 0): dump error cqe 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 0f 00 78 06 25 00 00 58 00 da ac d2 infiniband mlx5_0: mlx5_ib_post_send_wait:806:(pid 20311): reg umr failed (6) infiniband mlx5_0: pagefault_real_mr:661:(pid 20311): Failed to update mkey page tables Fixes: f0093fb1a7cb ("RDMA/mlx5: Move mlx5_ib_cont_pages() to the creation of the mlx5_ib_mr") Fixes: a665aca89a41 ("RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()") Signed-off-by: Artemy Kovalyov <artemyko@nvidia.com> Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/3d4be7ca2155bf239dd8c00a2d25974a92c26ab8.1689757344.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31RDMA/mthca: Remove unnecessary NULL assignmentsRuan Jinjie
There are many pointers assigned first, which need not to be initialized, so remove the NULL assignments. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230731065543.2285928-1-ruanjinjie@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31RDMA/irdma: Fix one kernel-doc commentYang Li
Remove description of @free_hwcqp in irdma_destroy_cqp(). to silence the warning: drivers/infiniband/hw/irdma/hw.c:580: warning: Excess function parameter 'free_hwcqp' description in 'irdma_destroy_cqp' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6028 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230731015915.34867-1-yang.lee@linux.alibaba.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31RDMA/siw: Fix tx thread initialization.Bernard Metzler
Immediately removing the siw module after insertion may crash in siw_stop_tx_thread(), if the according thread did not yet had a chance to initialize its wait queue and siw_stop_tx_thread() tries to wakeup that thread. Initializing the threads state before spwaning it fixes it. Reported-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://lore.kernel.org/r/20230728114418.124328-1-bmt@zurich.ibm.com Tested-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31RDMA/mlx: Remove unnecessary variable initializationsRuan Jinjie
Remove unnecessary variable initializations. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20230728065139.3411703-1-ruanjinjie@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30RDMA/irdma: Use HW specific minimum WQ sizeSindhu Devale
HW GEN1 and GEN2 have different min WQ sizes but they are currently set to the same value. Use a gen specific attribute min_hw_wq_size and extend ABI to pass it to user-space. Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155525.1081-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30RDMA/irdma: Allow accurate reporting on QP max send/recv WRSindhu Devale
Currently the attribute cap.max_send_wr and cap.max_recv_wr sent from user-space during create QP are the provider computed SQ/RQ depth as opposed to raw values passed from application. This inhibits computation of an accurate value for max_send_wr and max_recv_wr for this QP in the kernel which matches the value returned in user create QP. Also these capabilities needs to be reported from the driver in query QP. Add support by extending the ABI to allow the raw cap.max_send_wr and cap.max_recv_wr to be passed from user-space, while keeping compatibility for the older scheme. The internal HW depth and shift needed for the WQs needs to be computed now for both kernel and user-mode QPs. Add new helpers to assist with this: irdma_uk_calc_depth_shift_sq, irdma_uk_calc_depth_shift_rq and irdma_uk_calc_depth_shift_wq. Consolidate all the user mode QP setup into a new function irdma_setup_umode_qp which keeps it with its counterpart irdma_setup_kmode_qp. Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155525.1081-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30RDMA/core: Get IB width and speed from netdevHaoyue Xu
Previously, there was no way to query the number of lanes for a network card, so the same netdev_speed would result in a fixed pair of width and speed. As network card specifications become more diverse, such fixed mode is no longer suitable, so a method is needed to obtain the correct width and speed based on the number of lanes. This patch retrieves netdev lanes and speed from net_device and translates them to IB width and speed. Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com> Signed-off-by: Luoyouming <luoyouming@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20230721092052.2090449-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30bnxt_re: Update the debug counters for doorbell pacingChandramohan Akula
Add debug counters to track the Doorbell pacing events and report the doorbell pacing debug stats. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1690383081-15033-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30bnxt_re: Expose the missing hw countersChandramohan Akula
Add code to expose some of the HW counters related to tx/rx data and Congestion control. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1690383081-15033-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30bnxt_re: Update the hw counters for resource statsChandramohan Akula
Report the additional resource counters which enables better debugging. Includes active RC/UD QPs, Watermark of the resources and a count that indicates the resize cq operations after driver load. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1690383081-15033-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-30bnxt_re: Reorganize the resource statsChandramohan Akula
Move the resource stats to a separate stats structure. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1690383081-15033-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Cleanup and rename irdma_netdev_vlan_ipv6()Mustafa Ismail
The return value from irdma_netdev_vlan_ipv6() is not used. Rename the functions and change to a void return. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155505.1069-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Add table based lookup for CQ pointer during an eventKrzysztof Czurylo
Add a CQ table based loookup to allow quick search for CQ pointer having CQ ID in case of CQ related asynchrononous event. The table is implemented in a similar fashion to QP table. Also add a reference counters for CQ. This is to prevent destroying CQ while an asynchronous event is being processed. The memory resource table size is sized higher with this update, and this table doesn't need to be physically contiguous, so use a vzalloc vs kzalloc to allocate the table. Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155505.1069-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Refactor error handling in create CQPSindhu Devale
In case of a failure in irdma_create_cqp, do not call irdma_destroy_cqp, but cleanup all the allocated resources in reverse order. Drop the extra argument in irdma_destroy_cqp as its no longer needed. Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155505.1069-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Drop a local in irdma_sc_get_next_aeqeSindhu Devale
Drop the local wqe_idx in irdma_sc_get_next_aeqe and instead store the wqe_idx in the info structure for all asynchronous events(AE) received. There is no reason it should be tied to a specific AE source. Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155505.1069-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Report correct WC errorSindhu Devale
Report the correct WC error if a MW bind is performed on an already valid/bound window. Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155439.1057-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26RDMA/irdma: Fix op_type reporting in CQEsSindhu Devale
The op_type field CQ poll info structure is incorrectly filled in with the queue type as opposed to the op_type received in the CQEs. The wrong opcode could be decoded and returned to the ULP. Copy the op_type field received in the CQE in the CQ poll info structure. Fixes: 24419777e943 ("RDMA/irdma: Fix RQ completion opcode") Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155439.1057-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-25scsi: RDMA/srp: Fix residual handlingBart Van Assche
Although the code for residual handling in the SRP initiator follows the SCSI documentation, that documentation has never been correct. Because scsi_finish_command() starts from the data buffer length and subtracts the residual, scsi_set_resid() must not be called if a residual overflow occurs. Hence remove the scsi_set_resid() calls from the SRP initiator if a residual overflow occurrs. Cc: Leon Romanovsky <leon@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Fixes: 9237f04e12cc ("scsi: core: Fix scsi_get/set_resid() interface") Fixes: e714531a349f ("IB/srp: Fix residual handling") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230724200843.3376570-3-bvanassche@acm.org Acked-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-07-24IB/hfi1: Use struct_size()Christophe JAILLET
Use struct_size() instead of hand-writing it, when allocating a structure with a flex array. This is less verbose, more robust and more informative. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/f4618a67d5ae0a30eb3f2b4558c8cc790feed79a.1690044376.git.christophe.jaillet@wanadoo.fr Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-24RDMA/hns: Remove VF extend configurationJunxian Huang
Remove VF extend configuration since the relative registers are configured in firmware currently. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20230721025146.450831-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-24RDMA/hns: Support get XRCD number from firmwareLuoyouming
Support driver get the num of XRCD from firmware. Signed-off-by: Luoyouming <luoyouming@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20230721025146.450831-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-23RDMA/qedr: Remove duplicate assignments of vaMinjie Du
Avoid double assignment of iwqp->ietf_mem.va. Signed-off-by: Minjie Du <duminjie@vivo.com> Link: https://lore.kernel.org/r/20230705031849.2443-1-duminjie@vivo.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-23RDMA/qedr: Remove a duplicate assignment in qedr_create_gsi_qp()Minjie Du
Delete a duplicate statement from this function implementation. Signed-off-by: Minjie Du <duminjie@vivo.com> Link: https://lore.kernel.org/r/20230705103950.15225-1-duminjie@vivo.com Acked-by: Alok Prasad <palok@marvell.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-21RDMA/bnxt_re: Add a new uapi for driver notificationChandramohan Akula
Add driver notify uapi for application notifying the driver about the doorbell FIFO congestion. Link: https://lore.kernel.org/r/1689742977-9128-8-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/bnxt_re: Implement doorbell pacing algorithmChandramohan Akula
User applications alert the driver when the Doorbell FIFO reaches the alarm threshold. The driver updates the pacing parameters in the shared page to do the maximum pacing by the application till the DB FIFO congestion reduces to pacing threshold. Driver keeps checking the DB FIFO depth at the pacing interval and gradually adjusts the pacing level. Once the pacing level reaches default values (no congestion in the FIFO) pacing gets completed. Link: https://lore.kernel.org/r/1689742977-9128-7-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/bnxt_re: Update alloc_page uapi for pacingChandramohan Akula
Update the alloc_page uapi functionality for handling the mapping of doorbell pacing shared page and bar address. Link: https://lore.kernel.org/r/1689742977-9128-6-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/bnxt_re: Enable pacing support for the user appsChandramohan Akula
Report the pacing capability to the user applications. Link: https://lore.kernel.org/r/1689742977-9128-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/bnxt_re: Initialize Doorbell pacing featureChandramohan Akula
Checks for pacing feature capability and get the doorbell pacing configuration using FW commands. Allocate a page and initialize the pacing parameters for the applications. Cleanup the page and de-initialize the pacing during device removal. Link: https://lore.kernel.org/r/1689742977-9128-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/cma: Avoid GID lookups on iWARP devicesChuck Lever
We would like to enable the use of siw on top of a VPN that is constructed and managed via a tun device. That hasn't worked up until now because ARPHRD_NONE devices (such as tun devices) have no GID for the RDMA/core to look up. But it turns out that the egress device has already been picked for us -- no GID is necessary. addr_handler() just has to do the right thing with it. Link: https://lore.kernel.org/r/168960675257.3007.4737911174148394395.stgit@manet.1015granger.net Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/cma: Deduplicate error flow in cma_validate_port()Chuck Lever
Clean up to prepare for the addition of new logic. Link: https://lore.kernel.org/r/168960674597.3007.6128252077812202526.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/core: Set gid_attr.ndev for iWARP devicesChuck Lever
Have the iwarp side properly set the ndev in the device's sgid_attrs so that address resolution can treat it more like a RoCE device. Link: https://lore.kernel.org/r/168960673933.3007.8043081822081877578.stgit@manet.1015granger.net Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/siw: Fabricate a GID on tun and loopback devicesChuck Lever
LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses. Currently, siw_device_create() falls back to copying the IB device's name in those cases, because an all-zero MAC address breaks the RDMA core address resolution mechanism. However, at the point when siw_device_create() constructs a GID, the ib_device::name field is uninitialized, leaving the MAC address to remain in an all-zero state. Fabricate a random artificial GID for such devices, and ensure this artificial GID is returned for all device query operations. Link: https://lore.kernel.org/r/168960673260.3007.12378736853793339110.stgit@manet.1015granger.net Reported-by: Tom Talpey <tom@talpey.com> Fixes: a2d36b02c15d ("RDMA/siw: Enable siw on tunnel devices") Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/bnxt_re: use vmalloc_array and vcallocJulia Lawall
Use vmalloc_array and vcalloc to protect against multiplication overflows. The changes were done using the following Coccinelle semantic patch: // <smpl> @initialize:ocaml@ @@ let rename alloc = match alloc with "vmalloc" -> "vmalloc_array" | "vzalloc" -> "vcalloc" | _ -> failwith "unknown" @@ size_t e1,e2; constant C1, C2; expression E1, E2, COUNT, x1, x2, x3; typedef u8; typedef __u8; type t = {u8,__u8,char,unsigned char}; identifier alloc = {vmalloc,vzalloc}; fresh identifier realloc = script:ocaml(alloc) { rename alloc }; @@ ( alloc(x1*x2*x3) | alloc(C1 * C2) | alloc((sizeof(t)) * (COUNT), ...) | - alloc((e1) * (e2)) + realloc(e1, e2) | - alloc((e1) * (COUNT)) + realloc(COUNT, e1) | - alloc((E1) * (E2)) + realloc(E1, E2) ) // </smpl> Link: https://lore.kernel.org/r/20230627144339.144478-20-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/siw: use vmalloc_array and vcallocJulia Lawall
Use vmalloc_array and vcalloc to protect against multiplication overflows. The changes were done using the following Coccinelle semantic patch: // <smpl> @initialize:ocaml@ @@ let rename alloc = match alloc with "vmalloc" -> "vmalloc_array" | "vzalloc" -> "vcalloc" | _ -> failwith "unknown" @@ size_t e1,e2; constant C1, C2; expression E1, E2, COUNT, x1, x2, x3; typedef u8; typedef __u8; type t = {u8,__u8,char,unsigned char}; identifier alloc = {vmalloc,vzalloc}; fresh identifier realloc = script:ocaml(alloc) { rename alloc }; @@ ( alloc(x1*x2*x3) | alloc(C1 * C2) | alloc((sizeof(t)) * (COUNT), ...) | - alloc((e1) * (e2)) + realloc(e1, e2) | - alloc((e1) * (COUNT)) + realloc(COUNT, e1) | - alloc((E1) * (E2)) + realloc(E1, E2) ) // </smpl> Link: https://lore.kernel.org/r/20230627144339.144478-15-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-21RDMA/erdma: use vmalloc_array and vcallocJulia Lawall
Use vmalloc_array and vcalloc to protect against multiplication overflows. The changes were done using the following Coccinelle semantic patch: // <smpl> @initialize:ocaml@ @@ let rename alloc = match alloc with "vmalloc" -> "vmalloc_array" | "vzalloc" -> "vcalloc" | _ -> failwith "unknown" @@ size_t e1,e2; constant C1, C2; expression E1, E2, COUNT, x1, x2, x3; typedef u8; typedef __u8; type t = {u8,__u8,char,unsigned char}; identifier alloc = {vmalloc,vzalloc}; fresh identifier realloc = script:ocaml(alloc) { rename alloc }; @@ ( alloc(x1*x2*x3) | alloc(C1 * C2) | alloc((sizeof(t)) * (COUNT), ...) | - alloc((e1) * (e2)) + realloc(e1, e2) | - alloc((e1) * (COUNT)) + realloc(COUNT, e1) | - alloc((E1) * (E2)) + realloc(E1, E2) ) // </smpl> Link: https://lore.kernel.org/r/20230627144339.144478-6-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-07-19RDMA/irdma: Fix building without IPv6Arnd Bergmann
The new irdma_iw_get_vlan_prio() function requires IPv6 support to build: x86_64-linux-ld: drivers/infiniband/hw/irdma/cm.o: in function `irdma_iw_get_vlan_prio': cm.c:(.text+0x2832): undefined reference to `ipv6_chk_addr' Add a compile-time check in the same way as elsewhere in this file to avoid this by conditionally leaving out the ipv6 specific bits. Fixes: f877f22ac1e9b ("RDMA/irdma: Implement egress VLAN priority") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230718193835.3546684-1-arnd@kernel.org Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-18RDMA/rxe: Fix an error handling path in rxe_bind_mw()Christophe JAILLET
All errors go to the error handling path, except this one. Be consistent and also branch to it. Fixes: 02ed253770fb ("RDMA/rxe: Introduce rxe access supported flags") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/43698d8a3ed4e720899eadac887427f73d7ec2eb.1689623735.git.christophe.jaillet@wanadoo.fr Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17RDMA/bnxt_re: Fix hang during driver unloadSelvin Xavier
Driver unload hits a hang during stress testing of load/unload. stack trace snippet - tasklet_kill at ffffffff9aabb8b2 bnxt_qplib_nq_stop_irq at ffffffffc0a805fb [bnxt_re] bnxt_qplib_disable_nq at ffffffffc0a80c5b [bnxt_re] bnxt_re_dev_uninit at ffffffffc0a67d15 [bnxt_re] bnxt_re_remove_device at ffffffffc0a6af1d [bnxt_re] tasklet_kill can hang if the tasklet is scheduled after it is disabled. Modified the sequences to disable the interrupt first and synchronize irq before disabling the tasklet. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1689322969-25402-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17RDMA/bnxt_re: Prevent handling any completions after qp destroyKashyap Desai
HW may generate completions that indicates QP is destroyed. Driver should not be scheduling any more completion handlers for this QP, after the QP is destroyed. Since CQs are active during the QP destroy, driver may still schedule completion handlers. This can cause a race where the destroy_cq and poll_cq running simultaneously. Snippet of kernel panic while doing bnxt_re driver load unload in loop. This indicates a poll after the CQ is freed.  [77786.481636] Call Trace: [77786.481640]  <TASK> [77786.481644]  bnxt_re_poll_cq+0x14a/0x620 [bnxt_re] [77786.481658]  ? kvm_clock_read+0x14/0x30 [77786.481693]  __ib_process_cq+0x57/0x190 [ib_core] [77786.481728]  ib_cq_poll_work+0x26/0x80 [ib_core] [77786.481761]  process_one_work+0x1e5/0x3f0 [77786.481768]  worker_thread+0x50/0x3a0 [77786.481785]  ? __pfx_worker_thread+0x10/0x10 [77786.481790]  kthread+0xe2/0x110 [77786.481794]  ? __pfx_kthread+0x10/0x10 [77786.481797]  ret_from_fork+0x2c/0x50 To avoid this, complete all completion handlers before returning the destroy QP. If free_cq is called soon after destroy_qp, IB stack will cancel the CQ work before invoking the destroy_cq verb and this will prevent any race mentioned. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1689322969-25402-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17RDMA/mthca: Fix crash when polling CQ for shared QPsThomas Bogendoerfer
Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") introduced a new struct mthca_sqp which doesn't contain struct mthca_qp any longer. Placing a pointer of this new struct into qptable leads to crashes, because mthca_poll_one() expects a qp pointer. Fix this by putting the correct pointer into qptable. Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de Signed-off-by: Leon Romanovsky <leon@kernel.org>