summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2020-01-13RDMA/core: Simplify type usage for ib_uverbs_async_handler()Jason Gunthorpe
This function works on an ib_uverbs_async_file. Accept that as a parameter instead of the struct ib_uverbs_file. Consoldiate all the callers working from an ib_uevent_object to a single function and locate the async_file directly from the struct ib_uobject instead of using context_ptr. Link: https://lore.kernel.org/r/1578504126-9400-11-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_wq.uobjectJason Gunthorpe
This is a struct ib_uwq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-10-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_srq.uobjectJason Gunthorpe
This is a struct ib_usrq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-9-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_qp.uobjectJason Gunthorpe
This is a struct ib_uqp_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-8-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_cq.uobjectJason Gunthorpe
This is a struct ib_ucq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-7-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Make ib_ucq_object use ib_uevent_objectJason Gunthorpe
Any uobject that sends events into the async_event_file should be using ib_uevent_object so it can use the standard uevent based helper functions. CQ pushes events into both the async_event and the comp_channel in an open coded way. Move the async events related stuff to ib_uevent_object. Link: https://lore.kernel.org/r/1578504126-9400-6-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not allow alloc_commit to failJason Gunthorpe
This is a left over from an earlier version that creates a lot of complexity for error unwind, particularly for FD uobjects. The only reason this was done is so that anon_inode_get_file() could be called with the final fops and a fully setup uobject. Both need to be setup since unwinding anon_inode_get_file() via fput will call the driver's release(). Now that the driver does not provide release, we no longer need to worry about this complicated sequence, simply create the struct file at the start and allow the core code's release function to deal with the abort case. This allows all the confusing error paths around commit to be removed. Link: https://lore.kernel.org/r/1578504126-9400-5-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/mlx5: Simplify devx async commandsJason Gunthorpe
With the new FD structure the async commands do not need to hold any references while running. The existing mlx5_cmd_exec_cb() and mlx5_cmd_cleanup_async_ctx() provide enough synchronization to ensure that all outstanding commands are completed before the uobject can be destructed. Remove the now confusing get_file() and the type erasure of the devx_async_cmd_event_file. Link: https://lore.kernel.org/r/1578504126-9400-4-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Simplify destruction of FD uobjectsJason Gunthorpe
FD uobjects have a weird split between the struct file and uobject world. Simplify this to make them pure uobjects and use a generic release method for all struct file operations. This fixes the control flow so that mlx5_cmd_cleanup_async_ctx() is always called before erasing the linked list contents to make the concurrancy simpler to understand. For this to work the uobject destruction must fence anything that it is cleaning up - the design must not rely on struct file lifetime. Only deliver_event() relies on the struct file to when adding new events to the queue, add a is_destroyed check under lock to block it. Link: https://lore.kernel.org/r/1578504126-9400-3-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/mlx5: Use RCU and direct refcounts to keep memory aliveJason Gunthorpe
dispatch_event_fd() runs from a notifier with minimal locking, and relies on RCU and a file refcount to keep the uobject and eventfd alive. As the next patch wants to remove the file_operations release function from the drivers, re-organize things so that the devx_event_notifier() path uses the existing RCU to manage the lifetime of the uobject and eventfd. Move the refcount puts to a call_rcu so that the objects are guaranteed to exist and remove the indirect file refcount. Link: https://lore.kernel.org/r/1578504126-9400-2-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/uverbs: Remove needs_kfree_rcu from uverbs_obj_type_classJason Gunthorpe
After device disassociation the uapi_objects are destroyed and freed, however it is still possible that core code can be holding a kref on the uobject. When it finally goes to uverbs_uobject_free() via the kref_put() it can trigger a use-after-free on the uapi_object. Since needs_kfree_rcu is a micro optimization that only benefits file uobjects, just get rid of it. There is no harm in using kfree_rcu even if it isn't required, and the number of involved objects is small. Link: https://lore.kernel.org/r/20200113143306.GA28717@ziepe.ca Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-12IB/mlx5: Add mmap support for VARYishai Hadas
Add mmap support for VAR, it uses the 'offset' command mode with involvement of IB core APIs to find the previously allocated mmap entry. Link: https://lore.kernel.org/r/20191212110928.334995-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-12IB/mlx5: Introduce VAR object and its alloc/destroy methodsYishai Hadas
Introduce VAR object and its alloc/destroy KABI methods. The internal implementation uses the IB core API to manage mmap/munamp calls. Link: https://lore.kernel.org/r/20191212110928.334995-5-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-12IB/mlx5: Extend caps stage to handle VAR capabilitiesYishai Hadas
Extend caps stage to handle VAR capabilities. Link: https://lore.kernel.org/r/20191212110928.334995-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10Merge branch 'x86/mm' into efi/core, to pick up dependenciesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-10RDMA/core: Remove err in iw_query_portGuoqing Jiang
Since we can return device->ops.query_port directly, so no need to keep those lines. Link: https://lore.kernel.org/r/20200109134043.15568-1-guoqing.jiang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/hns: Add support for reporting wc as software modeXi Wang
When hardware is in resetting stage, we may can't poll back all the expected work completions as the hardware won't generate cqe anymore. This patch allows the driver to compose the expected wc instead of the hardware during resetting stage. Once the hardware finished resetting, we can poll cq from hardware again. Link: https://lore.kernel.org/r/1578572412-25756-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/hns: Bugfix for posting a wqe with sgeLijun Ou
Driver should first check whether the sge is valid, then fill the valid sge and the caculated total into hardware, otherwise invalid sges will cause an error. Fixes: 52e3b42a2f58 ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode") Fixes: 7bdee4158b37 ("RDMA/hns: Fill sq wqe context of ud type in hip08") Link: https://lore.kernel.org/r/1578571852-13704-1-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add RcvShortLengthErrCnt to hfi1statsMike Marciniszyn
This counter, RxShrErr, is required for error analysis and debug. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200106134235.119356.29123.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add software counter for ctxt0 seq dropMike Marciniszyn
All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Return void in packet receiving functionsGrzegorz Andrejczuk
Packet receiving functions returns int value, and yet the return values are not used at all. This patch converts the functions to return void. Link: https://lore.kernel.org/r/20200106134222.119356.84098.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Decouple IRQ name from typeGrzegorz Andrejczuk
IRQ name was connected to IRQ type, this is not sufficient and it would be better to use name as argument to msix_request_irq instead of assigning it to variables when function is called. Index argument was required to generate name and now it can be removed. To generate name correctly helpers function were added and updated. Link: https://lore.kernel.org/r/20200106134216.119356.44478.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Create API for auto activateMike Marciniszyn
Add an auto activate routine for use by the interrupt handler. Link: https://lore.kernel.org/r/20200106134210.119356.43079.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: IB/hfi1: Add an API to handle special case dropMike Marciniszyn
This patch pushes special case drop logic into an API to be shared by all interrupt handlers. Additionally, convert do_drop to a bool. Link: https://lore.kernel.org/r/20200106134203.119356.36962.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Move common receive IRQ code to functionGrzegorz Andrejczuk
Tracing interrupts, incrementing interrupt counter and ASPM are part that will be reused by HFI1 receive IRQ handlers. Create common function to have shared code in one place. Link: https://lore.kernel.org/r/20200106134157.119356.32656.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add fast and slow handlers for receive contextMike Marciniszyn
This patch eliminate special cases by adding a fast_handler member to the receive context and changes to the fast handler as specified in the new variable. Initialize the variable as soon as the setting for dma tail is known when the context is created. Setting fast path is called every time when any context has entered slow path. Add function to check if contexts is using fast path and do not set fast path when it is already done to improve RCD fastpath setting. Link: https://lore.kernel.org/r/20200106134150.119356.87558.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Move chip specific functions to chip.cMike Marciniszyn
Move routines and defines associated with hdrq size validation to a chip specific routine since the limits are specific to the device. Fix incorrect value for min size 2 -> 32 CSR writes should also be in chip.c. Create a chip routine to write the hdrq specific CSRs and call as appropriate. Link: https://lore.kernel.org/r/20200106134144.119356.74312.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/core: Fix locking in ib_uverbs_event_readJason Gunthorpe
This should not be using ib_dev to test for disassociation, during disassociation is_closed is set under lock and the waitq is triggered. Instead check is_closed and be sure to re-obtain the lock to test the value after the wait_event returns. Fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications") Link: https://lore.kernel.org/r/1578504126-9400-12-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: HÃ¥kon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-09IB/core: Fix build failure without hugepagesArnd Bergmann
HPAGE_SHIFT is only defined on architectures that support hugepages: drivers/infiniband/core/umem_odp.c: In function 'ib_umem_odp_get': drivers/infiniband/core/umem_odp.c:245:26: error: 'HPAGE_SHIFT' undeclared (first use in this function); did you mean 'PAGE_SHIFT'? Enclose this in an #ifdef. Fixes: 9ff1b6466a29 ("IB/core: Fix ODP with IB_ACCESS_HUGETLB handling") Link: https://lore.kernel.org/r/20200109084740.2872079-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Rename event_handler_lock to qp_open_list_lockParav Pandit
This lock is used to protect the qp->open_list linked list. As a side effect it seems to also globally serialize the qp event_handler, but it isn't clear if that is a deliberate design. Link: https://lore.kernel.org/r/20191212113024.336702-5-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Cut down single member ib_cache structureParav Pandit
Given that ib_cache structure has only single member now, merge the cache lock directly in the ib_device. Link: https://lore.kernel.org/r/20191212113024.336702-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Let IB core distribute cache update eventsParav Pandit
Currently when the low level driver notifies Pkey, GID, and port change events they are notified to the registered handlers in the order they are registered. IB core and other ULPs such as IPoIB are interested in GID, LID, Pkey change events. Since all GID queries done by ULPs are serviced by IB core, and the IB core deferes cache updates to a work queue, it is possible for other clients to see stale cache data when they handle their own events. For example, the below call tree shows how ipoib will call rdma_query_gid() concurrently with the update to the cache sitting in the WQ. mlx5_ib_handle_event() ib_dispatch_event() ib_cache_event() queue_work() -> slow cache update [..] ipoib_event() queue_work() [..] work handler ipoib_ib_dev_flush_light() __ipoib_ib_dev_flush() ipoib_dev_addr_changed_valid() rdma_query_gid() <- Returns old GID, cache not updated. Move all the event dispatch to a work queue so that the cache update is always done before any clients are notified. Fixes: f35faa4ba956 ("IB/core: Simplify ib_query_gid to always refer to cache") Link: https://lore.kernel.org/r/20191212113024.336702-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/mlx5: Do reverse sequence during device removalParav Pandit
When IB device profile initialization completes, device is marked as active. However, IB device is not marked inactive, during device removal flow. It should be the mirror of the add flow. Hence, mark it inactive during remove sequence. Link: https://lore.kernel.org/r/20191212113024.336702-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Fix coding style issuesLijun Ou
Fix some coding style issuses without changing logic of codes, most of the modification is unreasonable line breaks and alignments. Link: https://lore.kernel.org/r/1578313276-29080-8-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Replace custom macros HNS_ROCE_ALIGN_UPWenpeng Liang
HNS_ROCE_ALIGN_UP can be replaced by round_up() which is defined in kernel.h. Link: https://lore.kernel.org/r/1578313276-29080-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Remove redundant print informationYixing Liu
There are already necessary prints in outer function, prints in hns_roce_function_clear() may confuse users. So these prints is removed. Link: https://lore.kernel.org/r/1578313276-29080-6-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Delete unnessary parameters in hns_roce_v2_qp_modify()Lijun Ou
Current state and new state of qp won't be configured when modifying qp, so these two redundant parameters should be removed. Link: https://lore.kernel.org/r/1578313276-29080-5-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Update the value of qp typeLijun Ou
The values used to represent service type of RC and UD should be interchanged according to design of hardware. And it's better to define these types in enumeration than macros. Link: https://lore.kernel.org/r/1578313276-29080-4-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Remove unused function hns_roce_init_eq_table()Lijun Ou
hns_roce_init_eq_table() is an unused function that only retains its declaration in driver. Link: https://lore.kernel.org/r/1578313276-29080-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Avoid printing address of mtt pageWenpeng Liang
Address of a page shouldn't be printed in case of security issues. Link: https://lore.kernel.org/r/1578313276-29080-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/core: Add trace points to follow MR allocationChuck Lever
Track the lifetime of ib_mr objects. Here's sample output from a test run with NFS/RDMA: <...>-361 [009] 79238.772782: mr_alloc: pd.id=3 mr.id=11 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772812: mr_alloc: pd.id=3 mr.id=12 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772839: mr_alloc: pd.id=3 mr.id=13 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772866: mr_alloc: pd.id=3 mr.id=14 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772893: mr_alloc: pd.id=3 mr.id=15 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772921: mr_alloc: pd.id=3 mr.id=16 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772947: mr_alloc: pd.id=3 mr.id=17 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772974: mr_alloc: pd.id=3 mr.id=18 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773001: mr_alloc: pd.id=3 mr.id=19 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773028: mr_alloc: pd.id=3 mr.id=20 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773055: mr_alloc: pd.id=3 mr.id=21 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.270942: mr_alloc: pd.id=3 mr.id=22 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.270975: mr_alloc: pd.id=3 mr.id=23 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271007: mr_alloc: pd.id=3 mr.id=24 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271036: mr_alloc: pd.id=3 mr.id=25 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271067: mr_alloc: pd.id=3 mr.id=26 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271095: mr_alloc: pd.id=3 mr.id=27 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271121: mr_alloc: pd.id=3 mr.id=28 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271153: mr_alloc: pd.id=3 mr.id=29 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271181: mr_alloc: pd.id=3 mr.id=30 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271208: mr_alloc: pd.id=3 mr.id=31 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271236: mr_alloc: pd.id=3 mr.id=32 type=MEM_REG max_num_sg=30 rc=0 <...>-4351 [001] 79242.299400: mr_dereg: mr.id=32 <...>-4351 [001] 79242.299467: mr_dereg: mr.id=31 <...>-4351 [001] 79242.299554: mr_dereg: mr.id=30 <...>-4351 [001] 79242.299615: mr_dereg: mr.id=29 <...>-4351 [001] 79242.299684: mr_dereg: mr.id=28 <...>-4351 [001] 79242.299748: mr_dereg: mr.id=27 <...>-4351 [001] 79242.299812: mr_dereg: mr.id=26 <...>-4351 [001] 79242.299874: mr_dereg: mr.id=25 <...>-4351 [001] 79242.299944: mr_dereg: mr.id=24 <...>-4351 [001] 79242.300009: mr_dereg: mr.id=23 <...>-4351 [001] 79242.300190: mr_dereg: mr.id=22 <...>-4351 [001] 79242.300263: mr_dereg: mr.id=21 <...>-4351 [001] 79242.300326: mr_dereg: mr.id=20 <...>-4351 [001] 79242.300388: mr_dereg: mr.id=19 <...>-4351 [001] 79242.300450: mr_dereg: mr.id=18 <...>-4351 [001] 79242.300516: mr_dereg: mr.id=17 <...>-4351 [001] 79242.300629: mr_dereg: mr.id=16 <...>-4351 [001] 79242.300718: mr_dereg: mr.id=15 <...>-4351 [001] 79242.300784: mr_dereg: mr.id=14 <...>-4351 [001] 79242.300879: mr_dereg: mr.id=13 <...>-4351 [001] 79242.300945: mr_dereg: mr.id=12 <...>-4351 [001] 79242.301012: mr_dereg: mr.id=11 Some features of the output: - The lifetime and owner PD of each MR is clearly visible. - The type of MR is captured, as is the SGE array size. - Failing MR allocation can be recorded. Link: https://lore.kernel.org/r/20191218201820.30584.34636.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/core: Trace points for diagnosing completion queue issuesChuck Lever
Sample trace events: kworker/u29:0-300 [007] 120.042217: cq_alloc: cq.id=4 nr_cqe=161 comp_vector=2 poll_ctx=WORKQUEUE <idle>-0 [002] 120.056292: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.056402: cq_process: cq.id=4 wake-up took 109 [us] from interrupt kworker/2:1H-482 [002] 120.056407: cq_poll: cq.id=4 requested 16, returned 1 <idle>-0 [002] 120.067503: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.067537: cq_process: cq.id=4 wake-up took 34 [us] from interrupt kworker/2:1H-482 [002] 120.067541: cq_poll: cq.id=4 requested 16, returned 1 <idle>-0 [002] 120.067657: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.067672: cq_process: cq.id=4 wake-up took 15 [us] from interrupt kworker/2:1H-482 [002] 120.067674: cq_poll: cq.id=4 requested 16, returned 1 ... systemd-1 [002] 122.392653: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 122.392688: cq_process: cq.id=4 wake-up took 35 [us] from interrupt kworker/2:1H-482 [002] 122.392693: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.392836: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.392970: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.393083: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.393195: cq_poll: cq.id=4 requested 16, returned 3 Several features to note in this output: - The WCE count and context type are reported at allocation time - The CPU and kworker for each CQ is evident - The CQ's restracker ID is tagged on each trace event - CQ poll scheduling latency is measured - Details about how often single completions occur versus multiple completions are evident - The cost of the ULP's completion handler is recorded Link: https://lore.kernel.org/r/20191218201815.30584.3481.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/cma: Add trace points in RDMA Connection ManagerChuck Lever
Record state transitions as each connection is established. The IP address of both peers and the Type of Service is reported. These trace points are not in performance hot paths. Also, record each cm_event_handler call to ULPs. This eliminates the need for each ULP to add its own similar trace point in its CM event handler function. These new trace points appear in a new trace subsystem called "rdma_cma". Sample events: <...>-220 [004] 121.430733: cm_id_create: cm.id=0 <...>-472 [003] 121.430991: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ADDR_RESOLVED (0/0) <...>-472 [003] 121.430995: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-472 [003] 121.431172: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ROUTE_RESOLVED (2/0) <...>-472 [003] 121.431174: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 121.433480: cm_qp_create: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 pd.id=2 qp_type=RC send_wr=4091 recv_wr=256 qp_num=521 rc=0 <...>-220 [004] 121.433577: cm_send_req: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 kworker/1:2-973 [001] 121.436190: cm_send_mra: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436340: cm_send_rtu: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436359: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ESTABLISHED (9/0) kworker/1:2-973 [001] 121.436365: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-1975 [005] 123.161954: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-1975 [005] 123.161974: cm_sent_dreq: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-220 [004] 123.162102: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/0:1-13 [000] 123.162391: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 DISCONNECTED (10/0) kworker/0:1-13 [000] 123.162393: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 123.164456: cm_qp_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 <...>-220 [004] 123.165290: cm_id_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 Some features to note: - restracker ID of the rdma_cm_id is tagged on each trace event - The source and destination IP addresses and TOS are reported - CM event upcalls are shown with decoded event and status - CM state transitions are reported - rdma_cm_id lifetime events are captured - The latency of ULP CM event handlers is reported - Lifetime events of associated QPs are reported - Device removal and insertion is reported This patch is based on previous work by: Saeed Mahameed <saeedm@mellanox.com> Mukesh Kacker <mukesh.kacker@oracle.com> Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Aron Silverton <aron.silverton@oracle.com> Avinash Repaka <avinash.repaka@oracle.com> Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Link: https://lore.kernel.org/r/20191218201810.30584.3052.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07i40iw: Remove setting of VMA private data and use rdma_user_mmap_ioShiraz Saleem
vm_ops is now initialized in ib_uverbs_mmap() with the recent rdma mmap API changes. Earlier it was done in rdma_umap_priv_init() which would not be called unless a driver called rdma_user_mmap_io() in its mmap. i40iw does not use the rdma_user_mmap_io API but sets the vma's vm_private_data to a driver object. This now conflicts with the vm_op rdma_umap_close as priv pointer points to the i40iw driver object instead of the private data setup by core when rdma_user_mmap_io is called. This leads to a crash in rdma_umap_close with a mmap put being called when it should not have. Remove the redundant setting of the vma private_data in i40iw as it is not used. Also move i40iw over to use the rdma_user_mmap_io API. This gives the extra protection of having the mappings zapped when the context is detsroyed. BUG: unable to handle page fault for address: 0000000100000001 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 6 PID: 9528 Comm: rping Kdump: loaded Not tainted 5.5.0-rc4+ #117 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F7 01/17/2014 RIP: 0010:rdma_user_mmap_entry_put+0xa/0x30 [ib_core] RSP: 0018:ffffb340c04c7c38 EFLAGS: 00010202 RAX: 00000000ffffffff RBX: ffff9308e7be2a00 RCX: 000000000000cec0 RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000100000001 RBP: ffff9308dc7641f0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffffffff8d4414d8 R12: ffff93075182c780 R13: 0000000000000001 R14: ffff93075182d2a8 R15: ffff9308e2ddc840 FS: 0000000000000000(0000) GS:ffff9308fdc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000100000001 CR3: 00000002e0412004 CR4: 00000000001606e0 Call Trace: rdma_umap_close+0x40/0x90 [ib_uverbs] remove_vma+0x43/0x80 exit_mmap+0xfd/0x1b0 mmput+0x6e/0x130 do_exit+0x290/0xcc0 ? get_signal+0x152/0xc40 do_group_exit+0x46/0xc0 get_signal+0x1bd/0xc40 ? prepare_to_wait_event+0x97/0x190 do_signal+0x36/0x630 ? remove_wait_queue+0x60/0x60 ? __audit_syscall_exit+0x1d9/0x290 ? rcu_read_lock_sched_held+0x52/0x90 ? kfree+0x21c/0x2e0 exit_to_usermode_loop+0x4f/0xc3 do_syscall_64+0x1ed/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fae715a81fd Code: Bad RIP value. RSP: 002b:00007fae6e163cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: fffffffffffffe00 RBX: 00007fae6e163d30 RCX: 00007fae715a81fd RDX: 0000000000000010 RSI: 00007fae6e163cf0 RDI: 0000000000000003 RBP: 00000000013413a0 R08: 00007fae68000000 R09: 0000000000000017 R10: 0000000000000001 R11: 0000000000000293 R12: 00007fae680008c0 R13: 00007fae6e163cf0 R14: 00007fae717c9804 R15: 00007fae6e163ed0 CR2: 0000000100000001 ---[ end trace b33d58d3a06782cb ]--- RIP: 0010:rdma_user_mmap_entry_put+0xa/0x30 [ib_core] Fixes: b86deba977a9 ("RDMA/core: Move core content from ib_uverbs to ib_core") Link: https://lore.kernel.org/r/20200107162223.1745-1-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-06remove ioremap_nocache and devm_ioremap_nocacheChristoph Hellwig
ioremap has provided non-cached semantics by default since the Linux 2.6 days, so remove the additional ioremap_nocache interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03RDMA/cm: Delete unused CM ARP functionsLeon Romanovsky
Clean the code by deleting ARP functions, which are not called anyway. Link: https://lore.kernel.org/r/20191212093830.316934-46-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/cm: Delete unused CM LAP functionsLeon Romanovsky
Clean the code by deleting LAP functions, which are not called anyway. Link: https://lore.kernel.org/r/20191212093830.316934-43-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/i40iw: fix a potential NULL pointer dereferenceXiyu Yang
A NULL pointer can be returned by in_dev_get(). Thus add a corresponding check so that a NULL pointer dereference will be avoided at this place. Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status") Link: https://lore.kernel.org/r/1577672668-46499-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/rxe: Fix error type of mmap_offsetJiewei Ke
The type of mmap_offset should be u64 instead of int to match the type of mminfo.offset. If otherwise, after we create several thousands of CQs, it will run into overflow issues. Link: https://lore.kernel.org/r/20191227113613.5020-1-kejiewei.cn@gmail.com Signed-off-by: Jiewei Ke <kejiewei.cn@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/mlx5: use true,false for bool variablezhengbin
Fixes coccicheck warning: drivers/infiniband/hw/mlx5/mr.c:150:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/mr.c:1455:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/qp.c:1874:6-20: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-6-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>