summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)Author
2017-12-27IB/mlx5: Serialize access to the VMA listMajd Dibbiny
User-space applications can do mmap and munmap directly at any time. Since the VMA list is not protected with a mutex, concurrent accesses to the VMA list from the mmap and munmap can cause data corruption. Add a mutex around the list. Cc: <stable@vger.kernel.org> # v4.7 Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22IB/hfi1: Change slid arg in ingress_pkey_table_fail to 32bitDon Hiatt
Change the slid arg to ingress_pkey_table_fail() to a full 32Bits and do not convert to 16Bits in caller. This is so we can keep everything 32bit in the kernel and only change to 16bit at the uapi boundary. Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Fix the connection ORD value for loopbackTatyana Nikolova
The accepting QP ORD value should be adjusted not to exceed the peer QP IRD value (RFC 6581). This is skipped for loopback. After the ORD is validated by i40iw_record_ird_ord(), adjust the ORD value of the loopback accepting QP to prevent overrunning the IRD space of the peer QP. Also move the ORD accounting for 0-byte RDMA read to i40iw_record_ird_ord(). Fixes: f27b4746f378 ("i40iw: add connection management code") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Validate correct IRD/ORD connection parametersTatyana Nikolova
Casting to u16 before validating IRD/ORD connection parameters could cause recording wrong IRD/ORD values in the cm_node. Validate the IRD/ORD parameters as they are passed by the application before recording them. Fixes: f27b4746f378 ("i40iw: add connection management code") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Ignore LLP_DOUBT_REACHABILITY AEShiraz Saleem
The LLP_DOUBT_REACHABILITY Asynchronous Event (AE) is an early warning of a connection issue. It is followed by LLP_TOO_MANY_RETRIES AE, if the retransmit threshold is reached and recovery is not possible for the connection. Currently we terminate the connection on receiving the LLP_DOUBT_REACHABILITY AE. Ignore this AE and terminate the connection only on LLP_TOO_MANY_RETRIES AE. This improves the user experience on cable disconnect/reconnect scenario while running iWARP traffic. On cable disconnect, the QP traffic is paused and the user has a larger and more reasonable timeout within which if the cable is reconnected, traffic can continue. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Fix sequence number for the first partial FPDUShiraz Saleem
Partial FPDU processing is broken as the sequence number for the first partial FPDU is wrong due to incorrect Q2 buffer offset. The offset should be 64 rather than 16. Fixes: 786c6adb3a94 ("i40iw: add puda code") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Selectively teardown QPs on IP addr change eventShiraz Saleem
On IP address change event, all connected QPs are torn down irrespective of whether IP address is involved in a connection. Only teardown connections those source or destination address matches the netdev interface IP address being changed, and if they are on the same VLAN as the netdev. Fixes: e5e74b61b165 ("i40iw: Add IP addr handling on netdev events") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Add notifier for network device eventsShiraz Saleem
Register a netdevice notifier for netdev UP/DOWN notification events and report the appropriate ib event. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Correct Q1/XF object count equationShiraz Saleem
Lower Inbound RDMA Read Queue (Q1) object count by a factor of 2 as it is incorrectly doubled. Also, round up Q1 and Transmit FIFO (XF) object count to power of 2 to satisfy hardware requirement. Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Use utility function roundup_pow_of_two()Shiraz Saleem
Consolidate all power of 2 round calculations to use kernel utility function roundup_pow_of_two(). Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22i40iw: Set MAX_IRD_SIZE to 64Shiraz Saleem
Increase I40IW_MAX_IRD_SIZE to 64 which is the device limit. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22nes: Change accelerated flag to boolShiraz Saleem
The accelerated flag only utilizes two values: 0 and 1. Modify accelerated flag in struct nes_cm_node to bool. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22IB/hfi: Only read capability registers if the capability existsMichael J. Ruhl
During driver init, various registers are saved to allow restoration after an FLR or gen3 bump. Some of these registers are not available in some circumstances (i.e. Virtual machines). This bug makes the driver unusable when the PCI device is passed into a VM, it fails during probe. Delete unnecessary register read/write, and only access register if the capability exists. Cc: <stable@vger.kernel.org> # 4.14.x Fixes: a618b7e40af2 ("IB/hfi1: Move saving PCI values to a separate function") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22drivers: infiniband: remove duplicate includesPravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22RDMA/hns: Add eq support of hip08Yixian Liu
This patch adds eq support for hip08. The eq table can be multi-hop addressed. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Reviewed-by: Lijun Ou <oulijun@huawei.com> Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22RDMA/hns: Refactor eq code for hip06Yixian Liu
Considering the compatibility of supporting hip08's eq process and possible changes of data structure, this patch refactors the eq code structure of hip06. We move all the eq process code for hip06 from hns_roce_eq.c into hns_roce_hw_v1.c, and also for hns_roce_eq.h. With these changes, it will be convenient to add the eq support for later hardware version. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Reviewed-by: Lijun Ou <oulijun@huawei.com> Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21IB/mlx5: Fix congestion counters in LAG modeMajd Dibbiny
Congestion counters are counted and queried per physical function. When working in LAG mode, CNP packets can be sent or received on both of the functions, thus congestion counters should be aggregated from the two physical functions. Fixes: e1f24a79f424 ("IB/mlx5: Support congestion related counters") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroyBryan Tan
The use of wait queues in vmw_pvrdma for handling concurrent access to a resource leaves a race condition which can cause a use after free bug. Fix this by using the pattern from other drivers, complete() protected by dec_and_test to ensure complete() is called only once. Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warningBryan Tan
refcount_dec generates a warning when the operation causes the refcount to hit zero. Avoid this by using refcount_dec_and_test. Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP pathBryan Tan
The QP cleanup did not previously call ib_umem_release, resulting in a user-triggerable kernel resource leak. Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21iw_cxgb4: when flushing, complete all wrs in a chainSteve Wise
If a wr chain was posted and needed to be flushed, only the first wr in the chain was completed with FLUSHED status. The rest were never completed. This caused isert to hang on shutdown due to the missing completions which left iscsi IO commands referenced, stalling the shutdown. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21iw_cxgb4: reflect the original WR opcode in drain cqesSteve Wise
The flush/drain logic was not retaining the original wr opcode in its completion. This can cause problems if the application uses the completion opcode to make decisions. Use bit 10 of the CQE header word to indicate the CQE is a special drain completion, and save the original WR opcode in the cqe header opcode field. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21iw_cxgb4: Only validate the MSN for successful completionsSteve Wise
If the RECV CQE is in error, ignore the MSN check. This was causing recvs that were flushed into the sw cq to be completed with the wrong status (BAD_MSN instead of FLUSHED). Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-20RDMA/ocrdma: Fix permissions for OCRDMA_RESET_STATSAnton Vasilyev
Debugfs file reset_stats is created with S_IRUSR permissions, but ocrdma_dbgfs_ops_read() doesn't support OCRDMA_RESET_STATS, whereas ocrdma_dbgfs_ops_write() supports only OCRDMA_RESET_STATS. The patch fixes misstype with permissions. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18IB/mlx4: Remove unused ibpd parameterErez Alfasi
Remove unused ibpd parameter from create_qp_rss() function. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13IB/ocrdma: Remove unneeded conversions to boolAndrew F. Davis
Found with scripts/coccinelle/misc/boolconv.cocci. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13IB/mlx4: Potential buffer overflow in _mlx4_set_path()Dan Carpenter
Smatch complains about this code: drivers/infiniband/hw/mlx4/qp.c:1827 _mlx4_set_path() error: buffer overflow 'dev->dev->caps.gid_table_len' 3 <= 255 The mlx4_ib_gid_index_to_real_index() does check that "port" is within bounds, but we don't check the return value for errors. It seems simple enough to add a check for that. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13RDMA/cxgb4: Add a sanity check in process_work()Dan Carpenter
The story is that Smatch marks skb->data as untrusted so it generates a warning message here: drivers/infiniband/hw/cxgb4/cm.c:4100 process_work() error: buffer overflow 'work_handlers' 241 <= 255 In other places which handle this such as t4_uld_rx_handler() there is some checking to make sure that the function pointer is not NULL. I have added bounds checking and a check for NULL here as well. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13iw_cxgb4: make pointer reg_workq staticColin Ian King
The pointer reg_workq is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/infiniband/hw/cxgb4/device.c:69:25: warning: symbol 'reg_workq' was not declared. Should it be static? Fixes: 1c8f1da5d851 ("iw_cxgb4: Fix possible circular dependency locking warning") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13infiniband: cxgb4: use ktime_get for timestampsArnd Bergmann
The debugfs file prints the difference between host timestamps as a seconds/nanoseconds tuple, along with a 64-bit nanoseconds hardware timestamp. The host time is read using getnstimeofday() which is deprecated because of the y2038 overflow, and it suffers from time jumps during settimeofday() and leap seconds. Converting to ktime_get_ts64() would solve those two, but I'm going a little further here by changing to ktime_get() and printing 64-bit nanoseconds on both host and hw timestamps. This simplifies the code further and makes the output easier to understand. The format of the debugfs file obviously changes here, but this should only be read by humans and not scripts, so I assume it's fine. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-13RDMA/bnxt_re: Remove redundant bnxt_qplib_disable_nq() callArvind Yadav
The bnxt_qplib_disable_nq() call is redundant as it occurs after 'goto fail' and hence it called twice. Remove it. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/mlx5: revisit -Wmaybe-uninitialized warningArnd Bergmann
A warning that I thought I had fixed before occasionally comes back in rare randconfig builds (I found 7 instances in the last 100000 builds, originally it was much more frequent): drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_ib_reg_user_mr': drivers/infiniband/hw/mlx5/mr.c:1229:5: error: 'order' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (order <= mr_cache_max_order(dev)) { ^ drivers/infiniband/hw/mlx5/mr.c:1247:8: error: 'ncont' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/infiniband/hw/mlx5/mr.c:1247:8: error: 'page_shift' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/infiniband/hw/mlx5/mr.c:1260:2: error: 'npages' may be used uninitialized in this function [-Werror=maybe-uninitialized] I've looked at all those findings again and noticed that they are all with CONFIG_INFINIBAND_USER_MEM=n, which means ib_umem_get() returns an error unconditionally and we never initialize or use those variables. This triggers a condition in gcc iff mr_umem_get() is partially but not entirely inlined, which in turn depends on the exact combination of optimization settings. This is a known problem with gcc, with no easy solution in the compiler, so this adds another workaround that should be more reliable than my previous attempt. Returning an error from mlx5_ib_reg_user_mr() earlier means that we can completely bypass the logic that caused the warning, the compiler can now see that the variable is never accessed. Fixes: 14ab8896f5d9 ("IB/mlx5: avoid bogus -Wmaybe-uninitialized warning") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11i40iw: Reinitialize add_sd_cntMustafa Ismail
add_sd_cnt in info structure passed to i40iw_create_hmc_obj_type must be 0 and since it is modified during the call, it must be reset in the loop. This avoids unnecessarily reprogramming the SDs multiple times with the same values. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11i40iw: Use sqsize to initialize cqp_requests elementsChien Tin Tung
Use sqsize instead of I40IW_CQP_SW_SQSIZE_2048 to initialize cqp_requests elements in the for-loop as sqsize is used to allocate memory for cqp_requests. Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11nes: remove unused 'timeval' struct memberArnd Bergmann
There is a stale entry in nes_cm_tcp_context that has apparently never been used in mainline linux. I'm trying to kill off all users of timeval as part of the y2038-safety work, so let's just remove this one. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11i40iw: remove unused 'timeval' struct memberArnd Bergmann
There is a stale entry in i40iw_cm_tcp_context that apparently was copied from the 'nes' driver but never used in i40iw. I'm trying to kill off all users of timeval as part of the y2038-safety work, so let's just remove this one. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11RDMA/vmw_pvrdma: Do not re-calculate npagesYuval Shaia
There is no need to re-calculate the number of pages since it is already done in ib_umem_get. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Adit Ranadive <aditr@vmware.com> Tested-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11i40w: Remove garbage at end of INFINIBAND_I40IW Kconfig sectionGeert Uytterhoeven
Remove leftover garbage (containing Kconfig dependencies for another symbol?) Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/qib: Cleanup qib_set_part_key() with direct returnsJoe Perches
Perhaps the function is better written without the empty bail: label and without setting ret and just using return. Combining the int/bool conversion of any and the direct returns makes the resulting code clearer. Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/qib: remove redundant setting of any in for-loopColin Ian King
The variable all is being set but is never read after this hence it can be removed from the for loop initialization. Cleans up clang warning: drivers/infiniband/hw/qib/qib_file_ops.c:640:7: warning: Value stored to 'any' is never read Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/qib: Fix comparison error with qperf compare/swap testMike Marciniszyn
This failure exists with qib: ver_rc_compare_swap: mismatch, sequence 2, expected 123456789abcdef, got 0 The request builder was using the incorrect inlines to build the request header resulting in incorrect data in the atomic header. Fix by using the appropriate inlines to create the request. Cc: <stable@vger.kernel.org> # 4.9.x+ Fixes: 261a4351844b ("IB/qib,IB/hfi: Use core common header file") Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/hfi1: Use 4096 for default active MTU in query_qpJan Sokolowski
Currently, if a port is queried that has an invalid Maximum Transmission Unit, driver reports default MTU of 2048. This in incorrect. Use default value of 4096 if invalid. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11IB/hfi1: Mask the path bits with the LMC for 16B RC AcksDon Hiatt
16B packets require that the path bits are masked with the LMC. This mask is done correctly in all 16B header creation but was left out for the RC Acknowledge. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11iw_cxgb4: only insert drain cqes if wq is flushedSteve Wise
Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-07iw_cxgb4: only clear the ARMED bit if a notification is neededSteve Wise
In __flush_qp(), the CQ ARMED bit was being cleared regardless of whether any notification is actually needed. This resulted in the iser termination logic getting stuck in ib_drain_sq() because the CQ was not marked ARMED and thus the drain CQE notification wasn't triggered. This new bug was exposed when this commit was merged: commit cbb40fadd31c ("iw_cxgb4: only call the cq comp_handler when the cq is armed") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-07IB/mlx4: Fix RSS hash fields restrictionsGuy Levi
Mistakenly the driver didn't allow RSS hash fields combinations which involve both IPv4 and IPv6 protocols. This bug caused to failures for user's use cases for RSS. Consequently, this patch fixes this bug and allows any combination that the HW can support. Additionally, the patch fixes the driver to return an error in case the user provides an unsupported mask for RSS hash fields. Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP") Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-12-05drivers/infiniband: Remove now-redundant smp_read_barrier_depends()Paul E. McKenney
The smp_read_barrier_depends() does nothing at all except on DEC Alpha, and no current DEC Alpha systems use Infiniband: lkml.kernel.org/r/20171023085921.jwbntptn6ictbnvj@tower This commit therefore makes Infiniband depend on !ALPHA and removes the now-ineffective invocations of smp_read_barrier_depends() from the InfiniBand driver. Please note that this patch should not be construed as my saying that InfiniBand's memory ordering is correct, but rather that this patch does not in any way affect InfiniBand's correctness. In other words, the result of applying this patch is bug-for-bug compatible with the original. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: <linux-rdma@vger.kernel.org> Cc: <linux-alpha@vger.kernel.org> [ paulmck: Removed drivers/dma/ioat/dma.c per Jason Gunthorpe's feedback. ] Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-01RDMA/hns: Get rid of page operation after dma_alloc_coherentWei Hu\(Xavier\)
In general, dma_alloc_coherent() returns a CPU virtual address and a DMA address, and we have no guarantee that the underlying memory even has an associated struct page at all. This patch gets rid of the page operation after dma_alloc_coherent, and records the VA returned form dma_alloc_coherent in the struct of hem in hns RoCE driver. Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-01RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherentWei Hu\(Xavier\)
In general dma_alloc_coherent() returns a CPU virtual address and a DMA address, and we have no guarantee that the virtual address is either in the linear map or vmalloc. It could be in some other special place. We have no guarantee that the underlying memory even has an associated struct page at all. In current code, there are incorrect usage as below: dma_alloc_coherent + virt_to_page + vmap. There will probably introduce coherency problem. This patch fixes it to get rid of virt_to_page and vmap calls at Leon's suggestion. The related link: https://lkml.org/lkml/2017/11/7/34 Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-01RDMA/hns: Fix the issue of IOVA not page continuous in hip08Wei Hu\(Xavier\)
If the smmu is enabled, the length of sg obtained from __iommu_map_sg_attrs is not 4kB. When the IOVA is set with the sg dma address, the IOVA will not be page continuous. so, the current code has MTPT configuration error that probably cause dma operation failure. In order to fix this issue, the IOVA should be calculated based on the sg length. Fixes: 3958cc5("RDMA/hns: Configure the MTPT in hip08") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>