summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2018-04-27IB/ipoib: fix ipoib_start_xmit()'s return typeLuc Van Oostenryck
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, but the implementation in this driver returns an 'int'. Fix this by returning 'netdev_tx_t' in this driver too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/nes: fix nes_netdev_start_xmit()'s return typeLuc Van Oostenryck
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, but the implementation in this driver returns an 'int'. Fix this by returning 'netdev_tx_t' in this driver too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27RDMA/cma: Fix use after destroy access to net namespace for IPoIBParav Pandit
There are few issues with validation of netdevice and listen id lookup for IB (IPoIB) while processing incoming CM request as below. 1. While performing lookup of bind_list in cma_ps_find(), net namespace of the netdevice can get deleted in cma_exit_net(), resulting in use after free access of idr and/or net namespace structures. This lookup occurs from the workqueue context (and not userspace context where net namespace is always valid). CPU0 CPU1 ==== ==== bind_list = cma_ps_find(); move netdevice to new namespace delete net namespace cma_exit_net() idr_destroy(idr); [..] cma_find_listener(bind_list, ..); 2. While netdevice is validated for IP address in given net namespace, netdevice's net namespace and/or ifindex can change in cma_get_net_dev() and cma_match_net_dev(). Above issues are overcome by using rcu lock along with netdevice UP/DOWN state as described below. When a net namespace is getting deleted, netdevice is closed and shutdown before moving it back to init_net namespace. change_net_namespace() synchronizes with any existing use of netdevice before changing the netdev properties such as net or ifindex. Once netdevice IFF_UP flags is cleared, such fields are not guaranteed to be valid. Therefore, rcu lock along with netdevice state check ensures that, while route lookup and cm_id lookup is in progress, netdevice of interest won't migrate to any other net namespace. This ensures that associated net namespace of netdevice won't get deleted while rcu lock is held for netdevice which is in IFF_UP state. Fixes: fa20105e09e9 ("IB/cma: Add support for network namespaces") Fixes: 4be74b42a6d0 ("IB/cma: Separate port allocation to network namespaces") Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/uverbs: Fix validating mandatory attributesMatan Barak
Previously, if a method contained mandatory attributes in a namespace that wasn't given by the user, these attributes weren't validated. Fixing this by iterating over all specification namespaces. Fixes: fac9658cabb9 ("IB/core: Add new ioctl interface") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/hfi1: Replace custom hfi1 macros with PCIe macrosFrederick Lawler
IB/hfi1 contains custom macros for PCIe link configuration. Remove the custom macros in favor of the PCIe link macros. No functional change intended. Signed-off-by: Frederick Lawler <fred@fredlawl.com> [bhelgaas: use "GT" instead of "GB"] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
2018-04-27RDMA/cxgb4: release hw resources on device removalRaju Rangoju
The c4iw_rdev_close() logic was not releasing all the hw resources (PBL and RQT memory) during the device removal event (driver unload / system reboot). This can cause panic in gen_pool_destroy(). The module remove function will wait for all the hw resources to be released during the device removal event. Fixes c12a67fe(iw_cxgb4: free EQ queue memory on last deref) Signed-off-by: Raju Rangoju <rajur@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: remove unused function variableZhu Yanjun
In the functions rxe_mem_init_dma, rxe_mem_init_user, rxe_mem_init_fast and copy_data, the function variable rxe is not used. So this function variable rxe is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: change rxe_set_mtu function type to voidZhu Yanjun
The function rxe_set_mtu always returns zero. So this function type is changed to void. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: Change rxe_rcv to return voidYuval Shaia
It always returns 0. Change return type to void. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27infiniband: hw: qib: Change return type to vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27infiniband: hw: hfi1: Change return type to vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB: make INFINIBAND_ADDR_TRANS configurableGreg Thelen
Allow INFINIBAND without INFINIBAND_ADDR_TRANS because fuzzing has been finding fair number of CM bugs. So provide option to disable it. Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Tarick Bedeir <tarick@google.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27ib_srp: depend on INFINIBAND_ADDR_TRANSGreg Thelen
INFINIBAND_SRP code depends on INFINIBAND_ADDR_TRANS provided symbols. So declare the kconfig dependency. This is necessary to allow for enabling INFINIBAND without INFINIBAND_ADDR_TRANS. Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Tarick Bedeir <tarick@google.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27ib_srpt: depend on INFINIBAND_ADDR_TRANSGreg Thelen
INFINIBAND_SRPT code depends on INFINIBAND_ADDR_TRANS provided symbols. So declare the kconfig dependency. This is necessary to allow for enabling INFINIBAND without INFINIBAND_ADDR_TRANS. Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Tarick Bedeir <tarick@google.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27RDMA/mlx5: Properly check return value of mlx5_get_uars_pageLeon Romanovsky
Starting from commit 72f36be06138 ("net/mlx5: Fix mlx5_get_uars_page to return error code") the mlx5_get_uars_page() call returns error in case of failure, but it was mistakenly overlooked in the merge commit. Fixes: e7996a9a77fc ("Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git") Reported-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/mlx5: Fix represent correct netdevice in dual port RoCEParav Pandit
In commit bcf87f1dbbec ("IB/mlx5: Listen to netdev register/unresiter events in switchdev mode") incorrectly mapped primary device's netdevice to 2nd port netdevice. It always represented primary port's netdevice for 2nd port netdevice when ib representors were not used. This results into failing to process CM request arriving on 2nd port due to incorrect mapping of netdevice. This fix corrects it by considering the right mdev. Cc: <stable@vger.kernel.org> # 4.16 Fixes: bcf87f1dbbec ("IB/mlx5: Listen to netdev register/unresiter events in switchdev mode") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/mlx5: Use unlimited rate when static rate is not supportedDanit Goldberg
Before the change, if the user passed a static rate value different than zero and the FW doesn't support static rate, it would end up configuring rate of 2.5 GBps. Fix this by using rate 0; unlimited, in cases where FW doesn't support static rate configuration. Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27RDMA/mlx5: Protect from shift operand overflowLeon Romanovsky
Ensure that user didn't supply values too large that can cause overflow. UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:263:23 shift exponent -2147483648 is negative CPU: 0 PID: 292 Comm: syzkaller612609 Not tainted 4.16.0-rc1+ #131 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ubsan_epilogue+0xe/0x81 set_rq_size+0x7c2/0xa90 create_qp_common+0xc18/0x43c0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x2260 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 do_syscall_64+0x1aa/0x740 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x433569 RSP: 002b:00007ffc6e62f448 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433569 RDX: 0000000000000070 RSI: 00000000200042c0 RDI: 0000000000000003 RBP: 00000000006d5018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040c9f0 R14: 000000000040ca80 R15: 0000000000000006 Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27RDMA/mlx5: Fix multiple NULL-ptr deref errors in rereg_mr flowLeon Romanovsky
Failure in rereg MR releases UMEM but leaves the MR to be destroyed by the user. As a result the following scenario may happen: "create MR -> rereg MR with failure -> call to rereg MR again" and hit "NULL-ptr deref or user memory access" errors. Ensure that rereg MR is only performed on a non-dead MR. Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 4.5 Fixes: 395a8e4c32ea ("IB/mlx5: Refactoring register MR code") Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-26net/mlx5: Fix mlx5_get_vector_affinity functionIsrael Rukshin
Adding the vector offset when calling to mlx5_vector2eqn() is wrong. This is because mlx5_vector2eqn() checks if EQ index is equal to vector number and the fact that the internal completion vectors that mlx5 allocates don't get an EQ index. The second problem here is that using effective_affinity_mask gives the same CPU for different vectors. This leads to unmapped queues when calling it from blk_mq_rdma_map_queues(). This doesn't happen when using affinity_hint mask. Fixes: 2572cf57d75a ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0") Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-04-23IB/core: Fix deleting default GIDs when changing mac adddressParav Pandit
Before [1], When MAC address of the netdevice is changed, default GID is supposed to get deleted and added back which affects the node and/or port GUID in below sequence. netdevice_event() -> NETDEV_CHANGEADDR default_del_cmd() del_netdev_default_ips() bond_delete_netdev_default_gids() ib_cache_gid_set_default_gid() ib_cache_gid_del() add_cmd() [..] However, ib_cache_gid_del() was not getting invoked in non bonding scenarios because event_ndev and rdma_ndev are same. Therefore, fix such condition to ignore checking upper device when event ndev and rdma_dev are same; similar to bond_set_netdev_default_gids(). Which this fix ib_cache_gid_del() is invoked correctly; however ib_cache_gid_del() doesn't find the default GID for deletion because find_gid() was given default_gid = false with GID_ATTR_FIND_MASK_DEFAULT set. But it was getting overwritten by ib_cache_gid_set_default_gid() later on as part of add_cmd(). Therefore, mac address change used to work for default GID. With refactor series [1], this incorrect behavior is detected. Therefore, when deleting default GID, set default_gid and set MASK flag. when deleting IP based GID, clear default_gid and set MASK flag. [1] https://patchwork.kernel.org/patch/10319151/ Fixes: 238fdf48f2b5 ("IB/core: Add RoCE table bonding support") Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-23IB/core: Fix to avoid deleting IPv6 look alike default GIDsParav Pandit
When IPv6 link local address is removed, if it matches with the default GID, default GID(s)s gets removed which may not be a desired behavior. This behavior is introduced by refactor work in Fixes tag. When IPv6 link address is removed, removing its equivalent RoCEv2 GID which exactly matches with default RoCEv2 GID, is right thing to do. However achieving it correctly requires lot more changes, likely in roce_gid_mgmt.c and core/cache.c. This should be done as independent patch. Therefore, this patch preserves behavior of not deleteing default GIDs. This is done by providing explicit hint to consider default GID property using mask and default_gid; similar to add_gid(). Fixes: 598ff6bae68 ("IB/core: Refactor GID modify code for RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-23IB/core: Don't allow default GID addition at non reseved slotsParav Pandit
Default GIDs are marked reserved at the start of the GID table at index 0 and 1 by gid_table_reserve_default(). Currently when default GID is requested, it can still allocates an empty slot which was not marked as RESERVED for default GID, which is incorrect. At least in current code flow of roce_gid_mgmt.c, in theory we can still request to allocate more than one/two default GIDs depending on how upper devices are setup. Therefore, it is better for cache layer to only allow our reserved slots to be used by default GID allocation requests. Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-23RDMA/ucma: Allow resolving address w/o specifying source addressRoland Dreier
The RDMA CM will select a source device and address by consulting the routing table if no source address is passed into rdma_resolve_address(). Userspace will ask for this by passing an all-zero source address in the RESOLVE_IP command. Unfortunately the new check for non-zero address size rejects this with EINVAL, which breaks valid userspace applications. Fix this by explicitly allowing a zero address family for the source. Fixes: 2975d5de6428 ("RDMA/ucma: Check AF family prior resolving address") Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19RDMA/ucma: Check for a cm_id->device in all user calls that need itJason Gunthorpe
This is done by auditing all callers of ucma_get_ctx and switching the ones that unconditionally touch ->device to ucma_get_ctx_dev. This covers a little less than half of the call sites. The 11 remaining call sites to ucma_get_ctx() were manually audited. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: replace refcount_inc with skb_getZhu Yanjun
Follow the advice from Bart, the function refcount_inc is replaced with skb_get in commit 99dae690255e ("IB/rxe: optimize mcast recv process") and commit 86af61764151 ("IB/rxe: remove unnecessary skb_clone"). CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Suggested-by: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: optimize the function duplicate_requestZhu Yanjun
In the function duplicate_request, the reference of skb can be increased to replace the function skb_clone. This will make rxe performace better and save memory. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: make rxe_release_udp_tunnel staticZhu Yanjun
The function rxe_release_udp_tunnel is only used in rxe_net.c. So it is necessary to make this function as static. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-17IB/uverbs: Add missing braces in anonymous union initializersGeert Uytterhoeven
With gcc-4.1.2: drivers/infiniband/core/uverbs_std_types_flow_action.c:366: error: unknown field ‘ptr’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:367: error: unknown field ‘type’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:367: warning: missing braces around initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:367: warning: (near initialization for ‘uverbs_flow_action_esp_keymat[0].<anonymous>.<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:368: error: unknown field ‘min_len’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:368: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:368: warning: (near initialization for ‘uverbs_flow_action_esp_keymat[0].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:368: error: unknown field ‘len’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:368: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:368: warning: (near initialization for ‘uverbs_flow_action_esp_keymat[0].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:369: error: unknown field ‘flags’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:369: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:369: warning: (near initialization for ‘uverbs_flow_action_esp_keymat[0].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:376: error: unknown field ‘ptr’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:377: error: unknown field ‘type’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:377: warning: missing braces around initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:377: warning: (near initialization for ‘uverbs_flow_action_esp_replay[0].<anonymous>.<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:379: error: unknown field ‘len’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:379: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:379: warning: (near initialization for ‘uverbs_flow_action_esp_replay[0].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:383: error: unknown field ‘ptr’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:384: error: unknown field ‘type’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:385: error: unknown field ‘min_len’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:385: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:385: warning: (near initialization for ‘uverbs_flow_action_esp_replay[1].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:385: error: unknown field ‘len’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:385: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:385: warning: (near initialization for ‘uverbs_flow_action_esp_replay[1].<anonymous>’) drivers/infiniband/core/uverbs_std_types_flow_action.c:386: error: unknown field ‘flags’ specified in initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:386: warning: excess elements in union initializer drivers/infiniband/core/uverbs_std_types_flow_action.c:386: warning: (near initialization for ‘uverbs_flow_action_esp_replay[1].<anonymous>’) Add the missing braces to fix this. Fixes: 2eb9beaee5d7 ("IB/uverbs: Add flow_action create and destroy verbs") Fixes: 7d12f8d5a164 ("IB/uverbs: Add modify ESP flow_action") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_l2param_changeJia-Ju Bai
i40iw_l2param_change() is never called in atomic context. i40iw_make_listen_node() is only set as ".l2_param_change" in struct i40e_client_ops, and this function pointer is not called in atomic context. Despite never getting called from atomic context, i40iw_l2param_change() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_make_listen_nodeJia-Ju Bai
i40iw_make_listen_node() is never called in atomic context. i40iw_make_listen_node() is only called by i40iw_create_listen, which is set as ".create_listen" in struct iw_cm_verbs. Despite never getting called from atomic context, i40iw_make_listen_node() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_add_mqh_4Jia-Ju Bai
i40iw_add_mqh_4() is never called in atomic context, because it calls rtnl_lock() that can sleep. Despite never getting called from atomic context, i40iw_add_mqh_4() calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which can sleep and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17IB/rxe: avoid export symbolsZhu Yanjun
The functions rxe_set_mtu, rxe_add and rxe_remove are only used in their own module. So it is not necessary to export them. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17IB/rxe: make the variable staticZhu Yanjun
The variable rxe_net_notifier is only used in the file rxe_net.c. So remove it from rxe_net.h file and make it static in the file rxe_net.c. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Delete rdma_addr_clientJason Gunthorpe
The only thing it does is block module unload while work is posted from rdma_resolve_ip(). However, this is not the right place to do this. The users of rdma_resolve_ip() must ensure their own module does not unload until rdma_resolve_ip() calls the callback, or until rdma_addr_cancel() is called. Similarly callers to rdma_addr_find_l2_eth_by_grh() must ensure their module does not unload while they are calling code. The only two users are already safe, so there is no need for this. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Make rdma_addr_cancel into a fenceJason Gunthorpe
Currently rdma_addr_cancel does not prevent the callback from being used, this is surprising and hard to reason about. There does not appear to be a bug here as the only user of this API does refcount properly, fixing it only to increase clarity. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17RDMA/rdma_cm: Remove process_req and timer sortingJason Gunthorpe
Now that the work queue is used directly to launch and track the work there is no need for the second processing function to do 'all list entries'. Just schedule all entries onto the main work queue directly. We can also drop all of the useless list sorting now, as the workqueue sorts by expiration time automatically. This change requires switching lock to a spinlock as netdev notifiers are called in an atomic context, this is now easy since the lock does not need to be held across the lookup, that is already single threaded due to the work queue. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-17infiniband: mlx5: fix build errors when INFINIBAND_USER_ACCESS=mRandy Dunlap
Fix build errors when INFINIBAND_USER_ACCESS=m and MLX5_INFINIBAND=y. The build error occurs when the mlx5 driver code attempts to use USER_ACCESS interfaces, which are built as a loadable module. Fixes these build errors: drivers/infiniband/hw/mlx5/main.o: In function `populate_specs_root': ../drivers/infiniband/hw/mlx5/main.c:4982: undefined reference to `uverbs_default_get_objects' ../drivers/infiniband/hw/mlx5/main.c:4994: undefined reference to `uverbs_alloc_spec_tree' drivers/infiniband/hw/mlx5/main.o: In function `depopulate_specs_root': ../drivers/infiniband/hw/mlx5/main.c:5001: undefined reference to `uverbs_free_spec_tree' Build-tested with multiple config combinations. Fixes: 8c84660bb437 ("IB/mlx5: Initialize the parsing tree root without the help of uverbs") Cc: stable@vger.kernel.org # reported against 4.16 Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-16IB/mlx5: remove duplicate header fileZhu Yanjun
The header file fs_helpers.h is included twice. So it should be removed. Fixes: 802c2125689d ("IB/mlx5: Add IPsec support for egress and ingress") CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-16RDMA/ucma: ucma_context reference leak in error pathShamir Rabinovitch
Validating input parameters should be done before getting the cm_id otherwise it can leak a cm_id reference. Fixes: 6a21dfc0d0db ("RDMA/ucma: Limit possible option size") Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-06Merge tag 'for-linus-unmerged' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Doug and I are at a conference next week so if another PR is sent I expect it to only be bug fixes. Parav noted yesterday that there are some fringe case behavior changes in his work that he would like to fix, and I see that Intel has a number of rc looking patches for HFI1 they posted yesterday. Parav is again the biggest contributor by patch count with his ongoing work to enable container support in the RDMA stack, followed by Leon doing syzkaller inspired cleanups, though most of the actual fixing went to RC. There is one uncomfortable series here fixing the user ABI to actually work as intended in 32 bit mode. There are lots of notes in the commit messages, but the basic summary is we don't think there is an actual 32 bit kernel user of drivers/infiniband for several good reasons. However we are seeing people want to use a 32 bit user space with 64 bit kernel, which didn't completely work today. So in fixing it we required a 32 bit rxe user to upgrade their userspace. rxe users are still already quite rare and we think a 32 bit one is non-existing. - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them" * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits) IB/rxe: Fix for oops in rxe_register_device on ppc64le arch IB/mlx5: Device memory mr registration support net/mlx5: Mkey creation command adjustments IB/mlx5: Device memory support in mlx5_ib net/mlx5: Query device memory capabilities IB/uverbs: Add device memory registration ioctl support IB/uverbs: Add alloc/free dm uverbs ioctl support IB/uverbs: Add device memory capabilities reporting IB/uverbs: Expose device memory capabilities to user RDMA/qedr: Fix wmb usage in qedr IB/rxe: Removed GID add/del dummy routines RDMA/qedr: Zero stack memory before copying to user space IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR IB/mlx5: Add information for querying IPsec capabilities IB/mlx5: Add IPsec support for egress and ingress {net,IB}/mlx5: Add ipsec helper IB/mlx5: Add modify_flow_action_esp verb IB/mlx5: Add implementation for create and destroy action_xfrm IB/uverbs: Introduce ESP steering match filter IB/uverbs: Add modify ESP flow_action ...
2018-04-05IB/rxe: Fix for oops in rxe_register_device on ppc64le archMikhail Malygin
On ppc64le arch rxe_add command causes oops in kernel log: [ 92.495140] Oops: Kernel access of bad area, sig: 11 [#1] [ 92.499710] SMP NR_CPUS=2048 NUMA pSeries [ 92.499792] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) nf_conntrack_netlink(E) nfnetlink(E) xfrm_user(E) iptable _nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E) nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) af_packet(E) rpcrdma(E) ib_isert(E) iscsi_target_mod(E) i b_iser(E) libiscsi(E) ib_srpt(E) target_core_mod(E) ib_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) bochs_drm(E) tt m(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm(E) agpgart(E) virtio_rng(E) virtio_console(E) rtc_ generic(E) dm_ec(OEN) ttln_rdma(OEN) rdma_cm(E) configfs(E) iw_cm(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_core(E) ql a2xxx(E) [ 92.499832] scsi_transport_fc(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) ipmi_watchdog(E) ipmi_ssif(E) ipmi_poweroff(E) ipmi_powernv(EX) ipmi_devintf(E) ipmi_msghandler(E) dummy(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_service_time(E) scsi_transport_iscsi(E) sd_mod(E) sr_mod(E) cdrom(E) hid_generic(E) usbhid(E) virtio_blk(E) virtio_scsi(E) virtio_net(E) ibmvscsi(EX) scsi_transport_srp(E) xhci_pci(E) xhci_hcd(E) usbcore(E) usb_common(E) virtio_pci(E) virtio_ring(E) virtio(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E) [ 92.499834] Supported: No, Unsupported modules are loaded [ 92.499839] CPU: 3 PID: 5576 Comm: sh Tainted: G OE NX 4.4.120-ttln.17-default #1 [ 92.499841] task: c0000000afe8a490 ti: c0000000beba8000 task.ti: c0000000beba8000 [ 92.499842] NIP: c00000000008ba3c LR: c000000000027644 CTR: c00000000008ba10 [ 92.499844] REGS: c0000000bebab750 TRAP: 0300 Tainted: G OE NX (4.4.120-ttln.17-default) [ 92.499850] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28424428 XER: 20000000 [ 92.499871] CFAR: 0000000000002424 DAR: 0000000000000208 DSISR: 40000000 SOFTE: 1 GPR00: c000000000027644 c0000000bebab9d0 c000000000f09700 0000000000000000 GPR04: d0000000043d7192 0000000000000002 000000000000001a fffffffffffffffe GPR08: 000000000000009c c00000000008ba10 d0000000043e5848 d0000000043d3828 GPR12: c00000000008ba10 c000000007a02400 0000000010062e38 0000010020388860 GPR16: 0000000000000000 0000000000000000 00000100203885f0 00000000100f6c98 GPR20: c0000000b3f1fcc0 c0000000b3f1fc48 c0000000b3f1fbd0 c0000000b3f1fb58 GPR24: c0000000b3f1fae0 c0000000b3f1fa68 00000000000005dc c0000000b3f1f9f0 GPR28: d0000000043e5848 c0000000b3f1f900 c0000000b3f1f320 c0000000b3f1f000 [ 92.499881] NIP [c00000000008ba3c] dma_get_required_mask_pSeriesLP+0x2c/0x1a0 [ 92.499885] LR [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499886] Call Trace: [ 92.499891] [c0000000bebab9d0] [c0000000bebaba30] 0xc0000000bebaba30 (unreliable) [ 92.499894] [c0000000bebaba10] [c000000000027644] dma_get_required_mask+0x44/0xac [ 92.499904] [c0000000bebaba30] [d0000000043cb4b4] rxe_register_device+0xc4/0x430 [rdma_rxe] [ 92.499910] [c0000000bebabab0] [d0000000043c06c8] rxe_add+0x448/0x4e0 [rdma_rxe] [ 92.499915] [c0000000bebabb30] [d0000000043d28dc] rxe_net_add+0x4c/0xf0 [rdma_rxe] [ 92.499921] [c0000000bebabb60] [d0000000043d305c] rxe_param_set_add+0x6c/0x1ac [rdma_rxe] [ 92.499924] [c0000000bebabbf0] [c0000000000e78c0] param_attr_store+0xa0/0x180 [ 92.499927] [c0000000bebabc70] [c0000000000e6448] module_attr_store+0x48/0x70 [ 92.499932] [c0000000bebabc90] [c000000000391f60] sysfs_kf_write+0x70/0xb0 [ 92.499935] [c0000000bebabcb0] [c000000000390f1c] kernfs_fop_write+0x18c/0x1e0 [ 92.499939] [c0000000bebabd00] [c0000000002e22ac] __vfs_write+0x4c/0x1d0 [ 92.499942] [c0000000bebabd90] [c0000000002e2f94] vfs_write+0xc4/0x200 [ 92.499945] [c0000000bebabde0] [c0000000002e488c] SyS_write+0x6c/0x110 [ 92.499948] [c0000000bebabe30] [c000000000009384] system_call+0x38/0xe4 [ 92.499949] Instruction dump: [ 92.499954] 4e800020 3c4c00e8 3842dcf0 7c0802a6 f8010010 60000000 7c0802a6 fba1ffe8 [ 92.499958] fbc1fff0 fbe1fff8 f8010010 f821ffc1 <e9230208> 7c7e1b78 2fa90000 419e0078 [ 92.499962] ---[ end trace bed077e15eb420cf ]--- It fails in dma_get_required_mask, that has ppc-specific implementation, and fail if provided device argument is NULL Signed-off-by: Mikhail Malygin <mikhail@malygin.me> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/mlx5: Device memory mr registration supportAriel Levkovich
Adding mlx5_ib driver implementation for reg_dm_mr callback which allows registering device memory (DM) as an MR for local and remote access. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05net/mlx5: Mkey creation command adjustmentsAriel Levkovich
This change updates the mlx5 interface to create mkey on the device. The updates in the command mailbox include increasing the access mode type field to 5 bits in order to support additional types such as MLX5_MKC_ACCESS_MODE_MEMIC which represents device memory access type and will be used when registering MR on allocated device memory. All the places that use the old access mode format are adjusted as well. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/mlx5: Device memory support in mlx5_ibAriel Levkovich
This patch adds the mlx5_ib driver implementation for the device memory allocation API. It implements the ib_device callbacks for allocation and deallocation operations as well as a new mmap command support which allows mapping an allocated device memory to a VMA. The change also adds reporting of device memory maximum size and alignment parameters reported in device capabilities. The allocation/deallocation operations are using new firmware commands to allocate MEMIC memory on the device. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/uverbs: Add device memory registration ioctl supportAriel Levkovich
Adding new ioctl method for the MR object - REG_DM_MR. This command can be used by users to register an allocated device memory buffer as an MR and receive lkey and rkey to be used within work requests. It is added as a new method under the MR object and using a new ib_device callback - reg_dm_mr. The command creates a standard ib_mr object which represents the registered memory. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/uverbs: Add alloc/free dm uverbs ioctl supportAriel Levkovich
This change adds uverbs support for allocation/freeing of device memory commands. A new uverbs object is defined of type idr to represent and track the new resource type allocation per context. The API requires provider driver to implement 2 new ib_device callbacks - one for allocation and one for deallocation which return and accept (respectively) the ib_dm object which represents the allocated memory on the device. The support is added via the ioctl command infrastructure only. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/uverbs: Add device memory capabilities reportingAriel Levkovich
This change allows vendors to report device memory capability max_dm_size - to user via uverbs command. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05RDMA/qedr: Fix wmb usage in qedrKalderon, Michal
This patch comes as a result of Sinan Kaya's work and the decision that writel() must be a strong enough barrier for DMA. wmb usages in qedr driver have either been removed where they were there only to order DMA accesses, and replaced with smp_wmb and comments for the places that the barrier was there for SMP reasons. Fixes: 561e5d48968b ("RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05IB/rxe: Removed GID add/del dummy routinesParav Pandit
rxe driver's add_gid() and del_gid() callbacks are doing simple checks which are already done by the ib core before invoking these callback routines. Therefore, code is simplified to skip implementing add_gid() and del_gid() callback functions. They are only invoked by ib_core if they are implemented. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>