summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2022-03-14RDMA/qib: Fix typos in commentsJulia Lawall
Various spelling mistakes in comments. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20220314115354.144023-23-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14RDMA/mlx5: Fix memory leak in error flow for subscribe event routineYongzhi Liu
In case the second xa_insert() fails, the obj_event is not released. Fix the error unwind flow to free that memory to avoid a memory leak. Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX") Link: https://lore.kernel.org/r/1647018361-18266-1-git-send-email-lyz_cs@pku.edu.cn Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14Revert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"Leon Romanovsky
This reverts commit 7c4a539ec38f4ce400a0f3fcb5ff6c940fcd67bb. which causes to the following error in mlx4. Destroy of kernel CQ shouldn't fail WARNING: CPU: 4 PID: 18064 at include/rdma/ib_verbs.h:3936 mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib] Modules linked in: bonding ib_ipoib ip_gre ipip tunnel4 geneve rdma_ucm nf_tables ib_umad mlx4_en mlx4_ib ib_uverbs mlx4_core ip6_gre gre ip6_tunnel tunnel6 iptable_raw openvswitch nsh rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay fuse [last unloaded: mlx4_core] CPU: 4 PID: 18064 Comm: ibv_xsrq_pingpo Not tainted 5.17.0-rc7_master_62c6ecb #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib] Code: 1e 93 08 00 40 80 fd 01 0f 87 fa f1 04 00 83 e5 01 0f 85 2b ff ff ff 48 c7 c7 20 4f b6 a0 c6 05 fd 92 08 00 01 e8 47 c9 82 e2 <0f> 0b e9 11 ff ff ff 0f b6 2d eb 92 08 00 40 80 fd 01 0f 87 b1 f1 RSP: 0018:ffff8881a4957750 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8881ac4b6800 RCX: 0000000000000000 RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed103492aedc RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2e378eb R10: ffffed109a5c6f1d R11: 0000000000000001 R12: ffff888132620000 R13: ffff8881a4957a90 R14: ffff8881aa2d4000 R15: ffff8881a4957ad0 FS: 00007f0401747740(0000) GS:ffff8884d2e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f8ae036118 CR3: 000000012fe94005 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ib_dealloc_xrcd_user+0xce/0x120 [ib_core] ib_uverbs_dealloc_xrcd+0xad/0x210 [ib_uverbs] uverbs_free_xrcd+0xe8/0x190 [ib_uverbs] destroy_hw_idr_uobject+0x7a/0x130 [ib_uverbs] uverbs_destroy_uobject+0x164/0x730 [ib_uverbs] uobj_destroy+0x72/0xf0 [ib_uverbs] ib_uverbs_cmd_verbs+0x19fb/0x3110 [ib_uverbs] ib_uverbs_ioctl+0x169/0x260 [ib_uverbs] __x64_sys_ioctl+0x856/0x1550 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 7c4a539ec38f ("RDMA/core: Fix ib_qp_usecnt_dec() called when error") Link: https://lore.kernel.org/r/74c11029adaf449b3b9228a77cc82f39e9e892c8.1646851220.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14RDMA/rxe: Remove useless argument for update_state()Chengguang Xu
The argument 'payload' is not used in update_state(), so just remove it. Link: https://lore.kernel.org/r/20220307145047.3235675-2-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14RDMA/rxe: Change variable and function argument to proper typeChengguang Xu
The type of wqe length is u32 so in order to avoid overflow and shadow casting change variable and relevant function argument to proper type. Link: https://lore.kernel.org/r/20220307145047.3235675-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14RDMA/irdma: Prevent some integer underflowsDan Carpenter
My static checker complains that: drivers/infiniband/hw/irdma/ctrl.c:3605 irdma_sc_ceq_init() warn: can subtract underflow 'info->dev->hmc_fpm_misc.max_ceqs'? It appears that "info->dev->hmc_fpm_misc.max_ceqs" comes from the firmware in irdma_sc_parse_fpm_query_buf() so, yes, there is a chance that it could be zero. Even if we trust the firmware, it's easy enough to change the condition just as a hardenning measure. Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs") Link: https://lore.kernel.org/r/20220307125928.GE16710@kili Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-09net/mlx5: Move debugfs entries to separate structMoshe Shemesh
Move the debugfs entry pointers under priv to their own struct. Add get function for device debugfs root. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-04RDMA/hns: Refactor the alloc_cqc()Wenpeng Liang
Abstract the alloc_cqc() into several parts and separate the process unrelated to allocating CQC. Link: https://lore.kernel.org/r/20220302064830.61706-10-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Refactor the alloc_srqc()Chengchang Tang
Abstract the alloc_srqc() into several parts and separate the alloc_srqn() from the alloc_srqc(). Link: https://lore.kernel.org/r/20220302064830.61706-9-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Clean up the return value check of hns_roce_alloc_cmd_mailbox()Wenpeng Liang
hns_roce_alloc_cmd_mailbox() never returns NULL, so the check should be IS_ERR(). And the error code should be converted as the function's return value. Link: https://lore.kernel.org/r/20220302064830.61706-8-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Remove similar code that configures the hardware contextsChengchang Tang
Remove duplicate code for creating and destroying hardware contexts via mailbox. Link: https://lore.kernel.org/r/20220302064830.61706-7-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Refactor mailbox functionsChengchang Tang
The current mailbox functions have too many parameters, making the code difficult to maintain. So construct a new structure mbox_msg to pass the information needed by mailbox. Link: https://lore.kernel.org/r/20220302064830.61706-6-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Fix the wrong type of parameter "op" of the mailboxWenpeng Liang
The "op" field of the mailbox occupies 8 bits, so the parameter "op" should be of type u8. Link: https://lore.kernel.org/r/20220302064830.61706-5-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Remove redundant parameter "mailbox" in the mailboxWenpeng Liang
The parameter "out_param" of the mailbox is always null when the context is destroyed. So remove the function parameter "mailbox". Link: https://lore.kernel.org/r/20220302064830.61706-4-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Remove fixed parameter “timeout” in the mailboxChengchang Tang
The value of the function parameter “timeout” is unique. Therefore, it is unnecessary to specify the parameter “timeout” value each time. So remove it. Link: https://lore.kernel.org/r/20220302064830.61706-3-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/hns: Remove the unused parameter "op_modifier" in mailboxChengchang Tang
The parameter "op_modifier" is only used for HIP06. It is useless for HIP08 and later versions. After removing HIP06, this parameter is no longer used, so remove it. Link: https://lore.kernel.org/r/20220302064830.61706-2-liangwenpeng@huawei.com Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04RDMA/core: Fix ib_qp_usecnt_dec() called when errorYajun Deng
ib_destroy_qp() would called by ib_create_qp_user() if error, the former contains ib_qp_usecnt_dec(), but ib_qp_usecnt_inc() was not called before. So move ib_qp_usecnt_inc() into create_qp(). Fixes: d2b10794fc13 ("RDMA/core: Create clean QP creations interface for uverbs") Link: https://lore.kernel.org/r/20220303024232.2847388-1-yajun.deng@linux.dev Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04IB/hfi1: Allow larger MTU without AIPMike Marciniszyn
The AIP code signals the phys_mtu in the following query_port() fragment: props->phys_mtu = HFI1_CAP_IS_KSET(AIP) ? hfi1_max_mtu : ib_mtu_enum_to_int(props->max_mtu); Using the largest MTU possible should not depend on AIP. Fix by unconditionally using the hfi1_max_mtu value. Fixes: 6d72344cf6c4 ("IB/ipoib: Increase ipoib Datagram mode MTU's upper limit") Link: https://lore.kernel.org/r/1644348309-174874-1-git-send-email-mike.marciniszyn@cornelisnetworks.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/batman-adv/hard-interface.c commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check") commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message") https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/ net/smc/af_smc.c commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails") commit 462791bbfa35 ("net/smc: add sysctl interface for SMC") https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03mm: don't include <linux/memremap.h> in <linux/mm.h>Christoph Hellwig
Move the check for the actual pgmap types that need the free at refcount one behavior into the out of line helper, and thus avoid the need to pull memremap.h into mm.h. Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Chaitanya Kulkarni <kch@nvidia.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-02-28Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next 2022-22-02 The following PR includes updates to mlx5-next branch: Headlines: ========== 1) Jakub cleans up unused static inline functions 2) I did some low level firmware command interface return status changes to provide the caller with full visibility on the error/status returned by the Firmware. 3) Use the new command interface in RDMA DEVX usecases to avoid flooding dmesg with some "expected" user error prone use cases. 4) Moshe also uses the new command interface to grab the specific error code from MFRL register command to provide the exact error reason for why SW reset couldn't perform internally in FW. 5) From Mark Bloch: Lag, drop packets in hardware when possible In active-backup mode the inactive interface's packets are dropped by the bond device. In switchdev where TC rules are offloaded to the FDB this can lead to packets being hit in the FDB where without offload they would have been dropped before reaching TC rules in the kernel. Create a drop rule to make sure packets on inactive ports are dropped before reaching the FDB. Listen on NETDEV_CHANGEUPPER / NETDEV_CHANGEINFODATA events and record the inactive state and offload accordingly. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add clarification on sync reset failure net/mlx5: Add reset_state field to MFRL register RDMA/mlx5: Use new command interface API net/mlx5: cmdif, Refactor error handling and reporting of async commands net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct} net/mlx5: cmdif, Add new api for command execution net/mlx5: cmdif, cmd_check refactoring net/mlx5: cmdif, Return value improvements net/mlx5: Lag, offload active-backup drops to hardware net/mlx5: Lag, record inactive state of bond device net/mlx5: Lag, don't use magic numbers for ports net/mlx5: Lag, use local variable already defined to access E-Switch net/mlx5: E-switch, add drop rule support to ingress ACL net/mlx5: E-switch, remove special uplink ingress ACL handling net/mlx5: E-Switch, reserve and use same uplink metadata across ports net/mlx5: Add ability to insert to specific flow group mlx5: remove unused static inlines ==================== Link: https://lore.kernel.org/r/20220223233930.319301-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-28RDMA/core: Remove unnecessary statementsYajun Deng
The rdma_zalloc_drv_obj() in __ib_alloc_pd() would zero pd, it unnecessary add NULL to the object in struct pd. The uverbs_free_pd() already return busy if pd->usecnt is true, there is no need to add a warning. Link: https://lore.kernel.org/r/20220223074901.201506-1-yajun.deng@linux.dev Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28RDMA/irdma: Remove incorrect masking of PDMustafa Ismail
The PD id is masked with 0x7fff, while PD can be 18 bits for GEN2 HW. Remove the masking as it should not be needed and can cause incorrect PD id to be used. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20220225163211.127-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28RDMA/irdma: Fix Passthrough mode in VMMustafa Ismail
Using PCI_FUNC macro in a VM, when the device is in passthrough mode does not provide the real function instance. This means that currently, devices will not probe unless the instance in the VM matches the instance in the host. Fix this by getting the pf_id from the LAN during the probe. Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs") Link: https://lore.kernel.org/r/20220225163211.127-3-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28RDMA/irdma: Fix netdev notifications for vlan'sMustafa Ismail
Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev and processing the notifications for vlan netdevs. Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28RDMA/irdma: Make irdma_create_mg_ctx return a voidZhu Yanjun
The function irdma_create_mg_ctx always returns 0, so make it void and delete the return value check. Link: https://lore.kernel.org/r/20220224182832.3896686-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-25RDMA/cma: Do not change route.addr.src_addr outside state checksJason Gunthorpe
If the state is not idle then resolve_prepare_src() should immediately fail and no change to global state should happen. However, it unconditionally overwrites the src_addr trying to build a temporary any address. For instance if the state is already RDMA_CM_LISTEN then this will corrupt the src_addr and would cause the test in cma_cancel_operation(): if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev) Which would manifest as this trace from syzkaller: BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26 Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204 CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 __list_add_valid+0x93/0xa0 lib/list_debug.c:26 __list_add include/linux/list.h:67 [inline] list_add_tail include/linux/list.h:100 [inline] cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline] rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751 ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102 ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732 vfs_write+0x28e/0xa30 fs/read_write.c:603 ksys_write+0x1ee/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae This is indicating that an rdma_id_private was destroyed without doing cma_cancel_listens(). Instead of trying to re-use the src_addr memory to indirectly create an any address derived from the dst build one explicitly on the stack and bind to that as any other normal flow would do. rdma_bind_addr() will copy it over the src_addr once it knows the state is valid. This is similar to commit bc0bdc5afaa7 ("RDMA/cma: Do not change route.addr.src_addr.ss_family") Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: 732d41c545bb ("RDMA/cma: Make the locking for automatic state transition more clear") Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/irdma: Move union irdma_sockaddr to header fileZhu Yanjun
The union irdma_sockaddr is used frequently. So move it to the header file. Link: https://lore.kernel.org/r/20220223024252.3873736-4-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/irdma: Remove the unnecessary variable saddrZhu Yanjun
Firstly the variable saddr was to check the type of a network. Now the variable net_type is used to do the same work. So it is removed. Link: https://lore.kernel.org/r/20220223024252.3873736-3-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/irdma: Use net_type to check network typeZhu Yanjun
The member variable net_type is to check the type of network. Link: https://lore.kernel.org/r/20220223024252.3873736-2-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/rxe: Cleanup rxe_mcast.cBob Pearson
Finish adding subroutine comment headers to subroutines in rxe_mcast.c. Make minor api change cleanups. Link: https://lore.kernel.org/r/20220223230706.50332-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/rxe: Collect cleanup mca code in a subroutineBob Pearson
Collect cleanup code for struct rxe_mca into a subroutine, __rxe_cleanup_mca() called in rxe_detach_mcg() in rxe_mcast.c. Link: https://lore.kernel.org/r/20220223230706.50332-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/rxe: Collect mca init code in a subroutineBob Pearson
Collect initialization code for struct rxe_mca into a subroutine, __rxe_init_mca(), to cleanup rxe_attach_mcg() in rxe_mcast.c. Check limit on total number of attached qp's. Link: https://lore.kernel.org/r/20220223230706.50332-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/rxe: Warn if mcast memory is not freedBob Pearson
Print a warning if memory allocated by mcast is not cleared when the rxe driver is unloaded. Link: https://lore.kernel.org/r/20220223230706.50332-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Use new command interface APISaeed Mahameed
DEVX can now use mlx5_cmd_do() which will not intercept the command execution status and will provide full information of the return code. DEVX can now propagate the error code safely to upper layers, to indicate to the callers if the command was actually executed and the error code indicates the command execution status availability in the command outbox buffer. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: cmdif, Refactor error handling and reporting of async commandsSaeed Mahameed
Same as the new mlx5_cmd_do API, report all information to callers and let them handle the error values and outbox parsing. The user callback status "work->user_callback(status)" is now similar to the error rc code returned from the blocking mlx5_cmd_do() version, and now is defined as follows: -EREMOTEIO : Command executed by FW, outbox.status != MLX5_CMD_STAT_OK. Caller must check FW outbox status. 0 : Command execution successful, outbox.status == MLX5_CMD_STAT_OK. < 0 : Command couldn't execute, FW or driver induced error. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct}Saeed Mahameed
mlx5_core_create_{cq/dct} functions are non-trivial mlx5 commands functions. They check command execution status themselves and hide valuable FW failure information. For mlx5_core/eth kernel user this is what we actually want, but for a devx/rdma user the hidden information is essential and should be propagated up to the caller, thus we convert these commands to use mlx5_cmd_do to return the FW/driver and command outbox status as is, and let the caller decide what to do with it. For kernel callers of mlx5_core_create_{cq/dct} or those who only care about the binary status (FAIL/SUCCESS) they must check status themselves via mlx5_cmd_check() to restore the current behavior. err = mlx5_create_cq(in, out) err = mlx5_cmd_check(err, in, out) if (err) // handle err For DEVX users and those who care about full visibility, They will just propagate the error to user space, and app can check if err == -EREMOTEIO, then outbox.{status,syndrome} are valid. API Note: mlx5_cmd_check() must be used by kernel users since it allows the driver to intercept the command execution status and return a driver simulated status in case of driver induced error handling or reset/recovery flows. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23RDMA/irdma: Remove excess error variablesShiraz Saleem
As irdma_status_code is replaced with an int, there is no need for two variables to hold error codes. Remove the excess variable in functions where this occurs. Also, remove any redundant initializations which are no longer needed. Link: https://lore.kernel.org/r/20220217151851.1518-4-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/irdma: Propagate error codesShiraz Saleem
All functions now return linux error codes. Propagate the return from these functions as opposed to converting them to generic values. Link: https://lore.kernel.org/r/20220217151851.1518-3-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/irdma: Remove enum irdma_status_codeShiraz Saleem
Replace use of custom irdma_status_code with linux error codes. Remove enum irdma_status_code and header in which its defined. Link: https://lore.kernel.org/r/20220217151851.1518-2-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/ib_srp: Add more documentationBart Van Assche
Make it more clear what the different ib_srp data structures represent. Link: https://lore.kernel.org/r/20220215210511.28303-2-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/ib_srp: Fix a deadlockBart Van Assche
Remove the flush_workqueue(system_long_wq) call since flushing system_long_wq is deadlock-prone and since that call is redundant with a preceding cancel_work_sync() Link: https://lore.kernel.org/r/20220215210511.28303-3-bvanassche@acm.org Fixes: ef6c49d87c34 ("IB/srp: Eliminate state SRP_TARGET_DEAD") Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Reorder calls to pcie_relaxed_ordering_enabled()Aharon Landau
The mkc is the key for the mkey cache, hence, created in each attempt to get a cache mkey, while pcie_relaxed_ordering_enabled() is called during the setting of the mkc, but used only for cases where IB_ACCESS_RELAXED_ORDERING is set. pcie_relaxed_ordering_enabled() is an expensive call (26 us). Reorder the code so the driver will call it only when it is needed. Link: https://lore.kernel.org/r/684be1366cb1d4f05aa3e78986205e4bc410443a.1644947594.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Store ndescs instead of the translation table sizeAharon Landau
Currently, ent->xlt stores the translation table size. This data should not be stored in the cache entry but be written directly to the mailbox. Store ndescs instead, and deduce the translation table size from it according to the access mode. Link: https://lore.kernel.org/r/e9dbfaa1f279793a6bd28ee5a31cb4f0f0d70f05.1644947594.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Merge similar flows of allocating MR from the cacheAharon Landau
When allocating a MR from the cache, the driver calls to get_cache_mr(), and in case of failure, retries with create_cache_mr(). This is the flow of mlx5_mr_cache_alloc(), so use it instead. Link: https://lore.kernel.org/r/53c85fcd4de6ec9de0b8e6cbb1bf5d5fe19900c3.1644947594.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Fix the flow of a miss in the allocation of a cache ODP MRAharon Landau
When an ODP MR cache entry is empty and trying to allocate it, increment the ent->miss counter and call to queue_adjust_cache_locked() to verify the entry is balanced. Fixes: aad719dcf379 ("RDMA/mlx5: Allow MRs to be created in the cache synchronously") Link: https://lore.kernel.org/r/09503e295276dcacc92cb1d8aef1ad0961c99dc1.1644947594.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23RDMA/mlx5: Remove redundant work in struct mlx5_cache_entAharon Landau
delayed_cache_work_func() and the cache_work_func() are both wrappers of __cache_work_func(). Instead of having a special not delayed work, use the delayed work with delay = 0. Link: https://lore.kernel.org/r/18b6ae205e75f087aa4a2a05c81ea8b66d8d88dc.1644947594.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-22scsi: iscsi: Stop using the SCSI pointerBart Van Assche
Instead of storing the iSCSI task pointer and the session age in the SCSI pointer, use command-private variables. This patch prepares for removal of the SCSI pointer from struct scsi_cmnd. The list of iSCSI drivers has been obtained as follows: $ git grep -lw iscsi_host_alloc drivers/infiniband/ulp/iser/iscsi_iser.c drivers/scsi/be2iscsi/be_main.c drivers/scsi/bnx2i/bnx2i_iscsi.c drivers/scsi/cxgbi/libcxgbi.c drivers/scsi/iscsi_tcp.c drivers/scsi/libiscsi.c drivers/scsi/qedi/qedi_main.c drivers/scsi/qla4xxx/ql4_os.c include/scsi/libiscsi.h Note: it is not clear to me how the qla4xxx driver can work without this patch since it uses the scsi_cmnd::SCp.ptr member for two different purposes: - The qla4xxx driver uses this member to store a struct srb pointer. - libiscsi uses this member to store a struct iscsi_task pointer. Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Chris Leech <cleech@redhat.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Nilesh Javali <njavali@marvell.com> Cc: Manish Rangankar <mrangankar@marvell.com> Cc: Karen Xie <kxie@chelsio.com> Cc: Ketan Mukadam <ketan.mukadam@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> iscsi Link: https://lore.kernel.org/r/20220218195117.25689-26-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-18RDMA/rtrs-clt: Move free_permit from free_clt to rtrs_clt_closeMd Haris Iqbal
Error path of rtrs_clt_open() calls free_clt(), where free_permit is called. This is wrong since error path of rtrs_clt_open() does not need to call free_permit(). Also, moving free_permits() call to rtrs_clt_close(), makes it more aligned with the call to alloc_permit() in rtrs_clt_open(). Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20220217030929.323849-2-haris.iqbal@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-18RDMA/rtrs-clt: Fix possible double free in error caseMd Haris Iqbal
Callback function rtrs_clt_dev_release() for put_device() calls kfree(clt) to free memory. We shouldn't call kfree(clt) again, and we can't use the clt after kfree too. Replace device_register() with device_initialize() and device_add() so that dev_set_name can() be used appropriately. Move mutex_destroy() to the release function so it can be called in the alloc_clt err path. Fixes: eab098246625 ("RDMA/rtrs-clt: Refactor the failure cases in alloc_clt") Link: https://lore.kernel.org/r/20220217030929.323849-1-haris.iqbal@ionos.com Reported-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>