summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/irdma
AgeCommit message (Collapse)Author
2023-06-27Merge tag 'v6.4' into rdma.git for-nextJason Gunthorpe
Linux 6.4 Resolve conflicts between rdma rc and next in rxe_cq matching linux-next: drivers/infiniband/sw/rxe/rxe_cq.c: https://lore.kernel.org/r/20230622115246.365d30ad@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-01RDMA/irdma: avoid fortify-string warning in irdma_clr_wqesArnd Bergmann
Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") triggers a warning for fortified memset(): In function 'fortify_memset_chk', inlined from 'irdma_clr_wqes' at drivers/infiniband/hw/irdma/uk.c:103:4: include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 493 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here isthat the inner array only has four 8-byte elements, so clearing 4096 bytes overflows that. As this structure is part of an outer array, change the code to pass a pointer to the irdma_qp_quanta instead, and change the size argument for readability, matching the comment above it. Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries") Link: https://lore.kernel.org/r/20230523111859.2197825-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29RDMA/irdma: Fix Local Invalidate fencingMustafa Ismail
If the local invalidate fence is indicated in the WR, only the read fence is currently being set in WQE. Fix this to set both the read and local fence in the WQE. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20230522155654.1309-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29RDMA/irdma: Prevent QP use after freeMustafa Ismail
There is a window where the poll cq may use a QP that has been freed. This can happen if a CQE is polled before irdma_clean_cqes() can clear the CQE's related to the QP and the destroy QP races to free the QP memory. then the QP structures are used in irdma_poll_cq. Fix this by moving the clearing of CQE's before the reference is removed and the QP is destroyed. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20230522155654.1309-3-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17RDMA/irdma: Move iw device ops initializationKamal Heib
Move the initialization of the iw device ops to be under the declaration of the irdma_iw_dev_ops. Link: https://lore.kernel.org/r/20230515191142.413633-4-kheib@redhat.com Signed-off-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17RDMA/irdma: Return void from irdma_init_rdma_device()Kamal Heib
The return value from irdma_init_rdma_device() is always 0 - change it to be void. Link: https://lore.kernel.org/r/20230515191142.413633-3-kheib@redhat.com Signed-off-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17RDMA/irdma: Return void from irdma_init_iw_device()Kamal Heib
The return value from irdma_init_iw_device() is always 0 - change it to be void. Link: https://lore.kernel.org/r/20230515191142.413633-2-kheib@redhat.com Signed-off-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "Usual wide collection of unrelated items in drivers: - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5, rxe, usnic, usnic, bnxt_re, ocrdma, iser: - remove unnecessary NULL checks - kmap obsolescence - pci_enable_pcie_error_reporting() obsolescence - unused variables and macros - trace event related warnings - casting warnings - Code cleanups for irdm and erdma - EFA reporting of 128 byte PCIe TLP support - mlx5 more agressively uses the out of order HW feature - Big rework of how state machines and tasks work in rxe - Fix a syzkaller found crash netdev refcount leak in siw - bnxt_re revises their HW description header - Congestion control for bnxt_re - Use mmu_notifiers more safely in hfi1 - mlx5 gets better support for PCIe relaxed ordering inside VMs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits) RDMA/efa: Add rdma write capability to device caps RDMA/mlx5: Use correct device num_ports when modify DC RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call RDMA/rxe: Fix spinlock recursion deadlock on requester RDMA/mlx5: Fix flow counter query via DEVX RDMA/rxe: Protect QP state with qp->state_lock RDMA/rxe: Move code to check if drained to subroutine RDMA/rxe: Remove qp->req.state RDMA/rxe: Remove qp->comp.state RDMA/rxe: Remove qp->resp.state RDMA/mlx5: Allow relaxed ordering read in VFs and VMs net/mlx5: Update relaxed ordering read HCA capabilities RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write RDMA: Add ib_virt_dma_to_page() RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame() RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c IB/hfi1: Place struct mmu_rb_handler on cache line start IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests ...
2023-04-21RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() callTejun Heo
Workqueue is in the process of cleaning up the distinction between unbound workqueues w/ @nr_active==1 and ordered workqueues. Explicit WQ_UNBOUND isn't needed for alloc_ordered_workqueue() and will trigger a warning in the future. Let's remove it. This doesn't cause any functional changes. Link: https://lore.kernel.org/r/ZEGW-IcFReR1juVM@slm.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-13RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()Christophe JAILLET
There is no need to zero 'pktsize' bytes of 'buf', only the header needs to be cleared, to be safe. All the other bytes are already written with some memcpy() at the end of the function. Doing so also gives the opportunity to the compiler to avoid the memset() call. It can be inlined now that the length is known as compile time. Link: https://lore.kernel.org/r/098e3c397be0436f1867899245ecfe656c472110.1675369386.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-19RDMA/irdma: Add ipv4 check to irdma_find_listener()Tatyana Nikolova
Add ipv4 check to irdma_find_listener(). Otherwise the function incorrectly finds and returns a listener with a different addr family for the zero IP addr, if a listener with a zero IP addr and the same port as the one searched for has already been created. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Increase iWARP CM default rexmit countMustafa Ismail
When running perftest with large number of connections in iWARP mode, the passive side could be slow to respond. Increase the rexmit counter default to allow scaling connections. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Fix memory leak of PBLE objectsMustafa Ismail
On rmmod of irdma, the PBLE object memory is not being freed. PBLE object memory are not statically pre-allocated at function initialization time unlike other HMC objects. PBLEs objects and the Segment Descriptors (SD) for it can be dynamically allocated during scale up and SD's remain allocated till function deinitialization. Fix this leak by adding IRDMA_HMC_IW_PBLE to the iw_hmc_obj_types[] table and skip pbles in irdma_create_hmc_obj but not in irdma_del_hmc_objects(). Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Do not generate SW completions for NOPsMustafa Ismail
Currently, artificial SW completions are generated for NOP wqes which can generate unexpected completions with wr_id = 0. Skip the generation of artificial completions for NOPs. Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Refactor PBLE functionsSindhu Devale
Refactor PBLE functions using a bit mask to represent the PBLE level desired versus 2 parameters use_pble and lvl_one_only which makes the code confusing. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Change name of interruptsMichal Swiatkowski
Add more information in interrupt names. Before this patch it was: irdma CEQ CEQ ... Now: irdma-0000:18:00.0-AEQ irdma-0000:18:00.0-CEQ-0 irdma-0000:18:00.0-CEQ-1 ... Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Suggested-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Remove a redundant irdma_arp_table() callTatyana Nikolova
Remove a redundant function call in irdma_modify_qp_roce, since irdma_arp_table() with IRDMA_ARP_RESOLVE action is called after the if/else ipv check as part of irdma_add_arp(). Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Refactor HW statisticsKrzysztof Czurylo
Refactor HW statistics which, - Unifies HW statistics support for all HW generations. - Unifies support of 32- and 64-bit counters. - Removes duplicated code and simplifies implementation. - Fixes roll-over handling. - Removes unneeded last_hw_stats. With new implementation, there is no separate handling and no separate arrays for 32- and 64-bit counters (offsets, regs, values). Instead, there is a HW stats map array for each HW revision, which defines HW-specific width and location of each counter in the statistics buffer. Once the statistics are gathered (either via CQP op, or by reading HW registers), counter values are extracted from the statistics buffer using the stats map and the delta between the last and new values is computed. Finally, the counter values in rdma_hw_stats are incremented by those deltas. From the OS perspective, all the counters are 64-bit and their order in rdma_hw_stats->value[] array, as well as in irdma_hw_stat_names[], is the same for all HW gens. New statistics should always be added at the end. Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "Quite a small cycle this time, even with the rc8. I suppose everyone went to sleep over xmas. - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw, mana - inline CQE support for hns - Have mlx5 display device error codes - Pinned DMABUF support for irdma - Continued rxe cleanups, particularly converting the MRs to use xarray - Improvements to what can be cached in the mlx5 mkey cache" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (61 commits) IB/mlx5: Extend debug control for CC parameters IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors IB/hfi1: Fix math bugs in hfi1_can_pin_pages() RDMA/irdma: Add support for dmabuf pin memory regions RDMA/mlx5: Use query_special_contexts for mkeys net/mlx5e: Use query_special_contexts for mkeys net/mlx5: Change define name for 0x100 lkey value net/mlx5: Expose bits for querying special mkeys RDMA/rxe: Fix missing memory barriers in rxe_queue.h RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet RDMA/rxe: Remove rxe_alloc() RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size Subject: RDMA/rxe: Handle zero length rdma iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry() RDMA/mlx5: Use rdma_umem_for_each_dma_block() RDMA/umem: Remove unused 'work' member from struct ib_umem RDMA/irdma: Cap MSIX used to online CPUs + 1 RDMA/mlx5: Check reg_create() create for errors RDMA/restrack: Correct spelling RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish() ...
2023-02-17RDMA/irdma: Add support for dmabuf pin memory regionsZhu Yanjun
This is a followup to the EFA dmabuf[1]. Irdma driver currently does not support on-demand-paging(ODP). So it uses habanalabs as the dmabuf exporter, and irdma as the importer to allow for peer2peer access through libibverbs. In this commit, the function ib_umem_dmabuf_get_pinned() is used. This function is introduced in EFA dmabuf[1] which allows the driver to get a dmabuf umem which is pinned and does not require move_notify callback implementation. The returned umem is pinned and DMA mapped like standard cpu umems, and is released through ib_umem_release(). [1]https://lore.kernel.org/lkml/20211007114018.GD2688930@ziepe.ca/t/ Link: https://lore.kernel.org/r/20230217011425.498847-1-yanjun.zhu@intel.com Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-08RDMA/irdma: Cap MSIX used to online CPUs + 1Mustafa Ismail
The irdma driver can use a maximum number of msix vectors equal to num_online_cpus() + 1 and the kernel warning stack below is shown if that number is exceeded. The kernel throws a warning as the driver tries to update the affinity hint with a CPU mask greater than the max CPU IDs. Fix this by capping the MSIX vectors to num_online_cpus() + 1. WARNING: CPU: 7 PID: 23655 at include/linux/cpumask.h:106 irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma] RIP: 0010:irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma] Call Trace: irdma_rt_init_hw+0xa62/0x1290 [irdma] ? irdma_alloc_local_mac_entry+0x1a0/0x1a0 [irdma] ? __is_kernel_percpu_address+0x63/0x310 ? rcu_read_lock_held_common+0xe/0xb0 ? irdma_lan_unregister_qset+0x280/0x280 [irdma] ? irdma_request_reset+0x80/0x80 [irdma] ? ice_get_qos_params+0x84/0x390 [ice] irdma_probe+0xa40/0xfc0 [irdma] ? rcu_read_lock_bh_held+0xd0/0xd0 ? irdma_remove+0x140/0x140 [irdma] ? rcu_read_lock_sched_held+0x62/0xe0 ? down_write+0x187/0x3d0 ? auxiliary_match_id+0xf0/0x1a0 ? irdma_remove+0x140/0x140 [irdma] auxiliary_bus_probe+0xa6/0x100 __driver_probe_device+0x4a4/0xd50 ? __device_attach_driver+0x2c0/0x2c0 driver_probe_device+0x4a/0x110 __driver_attach+0x1aa/0x350 bus_for_each_dev+0x11d/0x1b0 ? subsys_dev_iter_init+0xe0/0xe0 bus_add_driver+0x3b1/0x610 driver_register+0x18e/0x410 ? 0xffffffffc0b88000 irdma_init_module+0x50/0xaa [irdma] do_one_initcall+0x103/0x5f0 ? perf_trace_initcall_level+0x420/0x420 ? do_init_module+0x4e/0x700 ? __kasan_kmalloc+0x7d/0xa0 ? kmem_cache_alloc_trace+0x188/0x2b0 ? kasan_unpoison+0x21/0x50 do_init_module+0x1d1/0x700 load_module+0x3867/0x5260 ? layout_and_allocate+0x3990/0x3990 ? rcu_read_lock_held_common+0xe/0xb0 ? rcu_read_lock_sched_held+0x62/0xe0 ? rcu_read_lock_bh_held+0xd0/0xd0 ? __vmalloc_node_range+0x46b/0x890 ? lock_release+0x5c8/0xba0 ? alloc_vm_area+0x120/0x120 ? selinux_kernel_module_from_file+0x2a5/0x300 ? __inode_security_revalidate+0xf0/0xf0 ? __do_sys_init_module+0x1db/0x260 __do_sys_init_module+0x1db/0x260 ? load_module+0x5260/0x5260 ? do_syscall_64+0x22/0x450 do_syscall_64+0xa5/0x450 entry_SYSCALL_64_after_hwframe+0x66/0xdb Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Link: https://lore.kernel.org/r/20230207201938.1329-1-sindhu.devale@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-29RDMA/irdma: Fix potential NULL-ptr-dereferenceNikita Zhandarovich
in_dev_get() can return NULL which will cause a failure once idev is dereferenced in in_dev_for_each_ifa_rtnl(). This patch adds a check for NULL value in idev beforehand. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://lore.kernel.org/r/20230126185230.62464-1-n.zhandarovich@fintech.ru Reviewed-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26RDMA/irdma: Split CQ handler into irdma_reg_user_mr_type_cqZhu Yanjun
Split the source codes related with CQ handling into a new function. Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20230116193502.66540-5-yanjun.zhu@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qpZhu Yanjun
Split the source codes related with QP handling into a new function. Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20230116193502.66540-4-yanjun.zhu@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26RDMA/irdma: Split mr alloc and free into new functionsZhu Yanjun
In the function irdma_reg_user_mr, the mr allocation and free will be used by other functions. As such, the source codes related with mr allocation and free are split into the new functions. Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20230116193502.66540-3-yanjun.zhu@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26RDMA/irdma: Split MEM handler into irdma_reg_user_mr_type_memZhu Yanjun
The source codes related with IRDMA_MEMREG_TYPE_MEM are split into a new function irdma_reg_user_mr_type_mem. Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20230116193502.66540-2-yanjun.zhu@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-04RDMA/irdma: Remove extra ret variable in favor of existing errZhu Yanjun
In the function irdma_reg_user_mr, err and ret exist. Actually, one variable err is enough. Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20230104064333.660344-1-yanjun.zhu@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-22RDMA/irdma: Initialize net_type before checking itMustafa Ismail
The av->net_type is not initialized before it is checked in irdma_modify_qp_roce. This leads to an incorrect update to the ARP cache and QP context. RoCEv2 connections might fail as result. Set the net_type using rdma_gid_attr_network_type. Fixes: 80005c43d4c8 ("RDMA/irdma: Use net_type to check network type") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221122004410.1471-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17RDMA/irdma: Do not request 2-level PBLEs for CQ allocMustafa Ismail
When allocating PBLE's for a large CQ, it is possible that a 2-level PBLE is returned which would cause the CQ allocation to fail since 1-level is assumed and checked for. Fix this by requesting a level one PBLE only. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221115011701.1379-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17RDMA/irdma: Fix RQ completion opcodeMustafa Ismail
The opcode written by HW, in the RQ CQE, is the RoCEv2/iWARP protocol opcode from the received packet and not the SW opcode as currently assumed. Fix this by returning the raw operation type and queue type in the CQE to irdma_process_cqe and add 2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map IRDMA HW op types to IB op types. Note that for iWARP, only Write with Immediate is supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM when there is immediate data present. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17RDMA/irdma: Fix inline for multiple SGE'sMustafa Ismail
Currently, inline send and inline write assume a single SGE and only copy data from the first one. Add support for multiple SGE's. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221115011701.1379-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07RDMA/irdma: Report the correct link speedShiraz Saleem
The active link speed is currently hard-coded in irdma_query_port due to which the port rate in ibstatus does reflect the active link speed. Call ib_get_eth_speed in irdma_query_port to get the active link speed. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Reported-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221104234957.1135-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-06Merge tag 'v6.0' into rdma.git for-nextJason Gunthorpe
Trvial merge conflicts against rdma.git for-rc resolved matching linux-next: drivers/infiniband/hw/hns/hns_roce_hw_v2.c drivers/infiniband/hw/hns/hns_roce_main.c https://lore.kernel.org/r/20220929124005.105149-1-broonie@kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-09-20RDMA/irdma: Validate udata inlen and outlenShiraz Saleem
Currently ib_copy_from_udata and ib_copy_to_udata could underfill the request and response buffer if the user-space passes an undersized value for udata->inlen or udata->outlen respectively [1] This could lead to undesirable behavior. Zero initing the buffer only goes as far as preventing using the buffer uninitialized. Validate udata->inlen and udata->outlen passed from user-space to ensure they are at least the required minimum size. [1] https://lore.kernel.org/linux-rdma/MWHPR11MB0029F37D40D9D4A993F8F549E9D79@MWHPR11MB0029.namprd11.prod.outlook.com/ Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220907191324.1173-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20RDMA/irdma: Align AE id codes to correct flush code and eventSindhu-Devale
A number of asynchronous event (AE) ids were not aligned to the correct flush_code and event_type. Fix these up so that the correct IBV error and event codes are returned to application. Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to return the correct WC error code. Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07RDMA/irdma: Report RNR NAK generation in device capsSindhu-Devale
Report RNR NAK generation when device capabilities are queried Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07RDMA/irdma: Use s/g array in post send only when its validSindhu-Devale
Send with invalidate verb call can pass in an uninitialized s/g array with 0 sge's which is filled into irdma WQE and causes a HW asynchronous event. Fix this by using the s/g array in irdma post send only when its valid. Fixes: 551c46e ("RDMA/irdma: Add user/kernel shared libraries") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07RDMA/irdma: Return correct WC error for bind operation failureSindhu-Devale
When a QP and a MR on a local host are in different PDs, the HW generates an asynchronous event (AE). The same AE is generated when a QP and a MW are in different PDs during a bind operation. Return the more appropriate IBV_WC_MW_BIND_ERR for the latter case by checking the OP type from the CQE in error. Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07RDMA/irdma: Return error on MR deregister CQP failureSindhu-Devale
The MR deregister CQP can fail if an MW is bound to it. Return an appropriate error for this case. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07RDMA/irdma: Report the correct max cqes from query deviceSindhu-Devale
Report the correct max cqes available to an application taking into account a reserved entry to detect overflow. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28RDMA/irdma: Fix drain SQ hang with no completionShiraz Saleem
SW generated completions for outstanding WRs posted on SQ after QP is in error target the wrong CQ. This causes the ib_drain_sq to hang with no completion. Fix this to generate completions on the right CQ. [ 863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds. [ 863.979224] Not tainted 5.14.0-130.el9.x86_64 #1 [ 863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 863.996997] task:kworker/u52:2 state:D stack: 0 pid: 671 ppid: 2 flags:0x00004000 [ 864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc] [ 864.014056] Call Trace: [ 864.017575] __schedule+0x206/0x580 [ 864.022296] schedule+0x43/0xa0 [ 864.026736] schedule_timeout+0x115/0x150 [ 864.032185] __wait_for_common+0x93/0x1d0 [ 864.037717] ? usleep_range_state+0x90/0x90 [ 864.043368] __ib_drain_sq+0xf6/0x170 [ib_core] [ 864.049371] ? __rdma_block_iter_next+0x80/0x80 [ib_core] [ 864.056240] ib_drain_sq+0x66/0x70 [ib_core] [ 864.062003] rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma] [ 864.069365] ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc] [ 864.076386] xprt_rdma_close+0xe/0x30 [rpcrdma] [ 864.082593] xprt_autoclose+0x52/0x100 [sunrpc] [ 864.088718] process_one_work+0x1e8/0x3c0 [ 864.094170] worker_thread+0x50/0x3b0 [ 864.099109] ? rescuer_thread+0x370/0x370 [ 864.104473] kthread+0x149/0x170 [ 864.109022] ? set_kthread_struct+0x40/0x40 [ 864.114713] ret_from_fork+0x22/0x30 Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@intel.com Reported-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "This cycle we got a new RDMA driver "ERDMA" for the Alibaba cloud environment. Otherwise the changes are dominated by rxe fixes. There is another RDMA driver on the list that might get merged next cycle, 'MANA' for the Azure cloud environment. Summary: - Bug fixes and small features for irdma, hns, siw, qedr, hfi1, mlx5 - General spelling/grammer fixes - rdma cm can follow changes in neighbours for control packets - Significant amounts of rxe fixes and spec compliance changes - Use the modern NAPI API - Use the bitmap API instead of open coding - Performance improvements for rtrs - Add the ERDMA driver for Alibaba cloud - Fix a use after free bug in SRP" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits) RDMA/ib_srpt: Unify checking rdma_cm_id condition in srpt_cm_req_recv() RDMA/rxe: Fix error unwind in rxe_create_qp() RDMA/mlx5: Add missing check for return value in get namespace flow RDMA/rxe: Split qp state for requester and completer RDMA/rxe: Generate error completion for error requester QP state RDMA/rxe: Update wqe_index for each wqe error completion RDMA/srpt: Fix a use-after-free RDMA/srpt: Introduce a reference count in struct srpt_device RDMA/srpt: Duplicate port name members IB/qib: Fix repeated "in" within comments RDMA/erdma: Add driver to kernel build environment RDMA/erdma: Add the ABI definitions RDMA/erdma: Add the erdma module RDMA/erdma: Add connection management (CM) support RDMA/erdma: Add verbs implementation RDMA/erdma: Add verbs header file RDMA/erdma: Add event queue implementation RDMA/erdma: Add cmdq implementation RDMA/erdma: Add main include file RDMA/erdma: Add the hardware related definitions ...
2022-07-18RDMA/irdma: Use the bitmap API to allocate bitmapsChristophe JAILLET
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less verbose and it improves the semantic. Link: https://lore.kernel.org/r/1f671b1af5881723ee265a0a12809c92950e58aa.1657567269.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Fix setting of QP context err_rq_idx_valid fieldMustafa Ismail
Setting err_rq_idx_valid field in QP context when the AE source of the AEQE is not associated with an RQ causes the firmware flush to fail. Set err_rq_idx_valid field in QP context only if it is associated with an RQ. Additionally, cleanup the redundant setting of this field in irdma_process_aeq. Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20220705230815.265-8-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Fix VLAN connection with wildcard addressMustafa Ismail
When an application listens on a wildcard address, and there are VLAN and non-VLAN IP addresses, iWARP connection establishemnt can fail if the listen node VLAN ID does not match. Fix this by checking the vlan_id only if not a wildcard listen node. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Link: https://lore.kernel.org/r/20220705230815.265-7-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Fix a window for use-after-freeMustafa Ismail
During a destroy CQ an interrupt may cause processing of a CQE after CQ resources are freed by irdma_cq_free_rsrc(). Fix this by moving the call to irdma_cq_free_rsrc() after the irdma_sc_cleanup_ceqes(), which is called under the cq_lock. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20220705230815.265-6-shiraz.saleem@intel.com Signed-off-by: Bartosz Sobczak <bartosz.sobczak@intel.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Make resource distribution algorithm more QP orientedNayan Kumar
Adapt the resource distribution algorithm in irdma_cfg_fpm_val to be more QP oriented. If the configuration is too big for the available memory, trim the MR and PBLE's first before trimming the QPs. This also avoids having to double QPs requested as input to algorithm for GEN1 devices. Link: https://lore.kernel.org/r/20220705230815.265-5-shiraz.saleem@intel.com Signed-off-by: Nayan Kumar <nayan.kumar@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Make CQP invalid state error non-criticalMustafa Ismail
The invalid state error returned by the Control Queue-Pair (CQP) is not a critical error. Add it to the irdma_noncrit_err_list and drop reporting it as device error message. Link: https://lore.kernel.org/r/20220705230815.265-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Add AE source to error logMustafa Ismail
To assist with debugging add the Asynchronous Event (AE) source when logging the abnormal AE error log message. Link: https://lore.kernel.org/r/20220705230815.265-3-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18RDMA/irdma: Add 2 level PBLE support for FMRMustafa Ismail
Level 2 Physical Buffer List Entry (PBLE) is currently not supported for Fast MRs which limits memory registrations to 256K pages. Adapt irdma_set_page and irdma_alloc_mr to allow for 2 level PBLEs. Link: https://lore.kernel.org/r/20220705230815.265-2-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>