summaryrefslogtreecommitdiff
path: root/include/uapi/rdma
AgeCommit message (Collapse)Author
2018-01-31RDMA/netlink: Hide unimplemented NLDEV commandsLeon Romanovsky
The nldev was implemented by following devlink implementation, including SET/DEL/NEW commands. However these commands were not implemented and hence don't need to be exposed. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-29RDMA/nldev: Provide detailed QP informationLeon Romanovsky
Implement RDMA nldev netlink interface to get detailed information on each QP in the system. This includes the owning process or kernel ULP and detailed information from the qp_attrs. Currently only the dumpit variant is implemented. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-29RDMA/nldev: Provide global resource utilizationLeon Romanovsky
Expose through the netlink interface the global per-device utilization of the supported object types. Provide both dumpit and doit callbacks. As an example of possible output from rdmatool for system with 5 mlx5 cards: $ rdma res 1: mlx5_0: qp 4 cq 5 pd 3 2: mlx5_1: qp 4 cq 5 pd 3 3: mlx5_2: qp 4 cq 5 pd 3 4: mlx5_3: qp 2 cq 3 pd 2 5: mlx5_4: qp 4 cq 5 pd 3 Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-29RDMA: Move enum ib_cq_creation_flags to uapi headersJason Gunthorpe
The flags field the enum is used with comes directly from the uapi so it belongs in the uapi headers for clarity and so userspace can use it. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-18IB/mlx5: Mmap the HCA's clock info to user-spaceFeras Daoud
This patch maps the new page to user space applications to allow converting a user space completion timestamp to system wall time at the lowest possible latency cost. By using a versioning scheme we allow compatibility between current and future userspace libraries. The change moves mlx5_ib_mmap_cmd enum from mlx5_ib.h to the abi header file mlx5-abi.h. Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Eitan Rabin <rabin@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18net/mlx5e: Add clock info page to mlx5 core devicesFeras Daoud
Adds a new page to mlx5 core containing clock info data that allows user level applications to translate between cqe timestamp to nanoseconds. The information stored into this page is represented through mlx5_ib_clock_info. In order to synchronize between kernel and user space a sequence number is incremented at the beginning and end of each update. An odd number means the data is being updated while an even means the access was already done. To guarantee that the data structure was accessed atomically user will: repeat: seq1 = <read sequence> goto <repeate> while odd <read data structure> seq2 = <read sequence> if seq1 != seq2 goto repeat Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Eitan Rabin <rabin@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18RDMA/bnxt_re: Add SRQ support for Broadcom adaptersDevesh Sharma
Shared receive queue (SRQ) is defined as a pool of receive buffers shared among multiple QPs which belong to same protection domain in a given process context. Use of SRQ reduces the memory foot print of IB applications. Broadcom adapters support SRQ, adding code-changes to enable shared receive queue. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-15RDMA: Mark imm_data as be32 in the verbs uapi headerJason Gunthorpe
This matches what the userspace copy of this header has been doing for a while. imm_data is an opaque 4 byte array carried over the network, and invalidate_rkey is in CPU byte order. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Add support for DC target QPMoni Shoua
A DC Target (DCT) QP is represented in the hardware as a unique object. This object is created by CREATE_DCT command and destroyed by DESTROY_DCT command. However, in the driver we describe it as a QP. The hardware command that creates a DCT needs parameters that the verb create_qp() does not provide. Those remaining parameters are provided with the call to the verb modify_qp(). Therefore we delay the actual creation of a DCT in the hardware until the stage of modify_qp() to RTR. A support for query_qp() was added as well. It uses QUERY_DCT command to retrieve the applicable fields. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Handle type IB_QPT_DRIVER when creating a QPMoni Shoua
The QP type IB_QPT_DRIVER doesn't describe the transport or the service that the QP provides but those are known only to the hardware driver. The actual type of the QP is stored in the hardware driver context (i.e. mlx5_qp) under the field qp_sub_type. Take the real QP type and any extra data that is required to create the QP from the driver channel and modify the QP initial attributes before continuing with create_qp(). Downstream patches from this series will add support for both DCI and DCT driver QPs. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-28IB/mlx5: Enable QP creation with a given blue flame indexYishai Hadas
This patch enables QP creation with a given BF index, this allows the user space driver to share same BF between few QPs or alternatively have a dedicated BF per QP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-28IB/mlx5: Extend UAR stuff to support dynamic allocationYishai Hadas
This patch extends the alloc context flow to be prepared for working with dynamic UAR allocations. Currently upon alloc context there is some fix size of UARs that are allocated (named 'static allocation') and there is no option to user application to ask for more or control which UAR will be used by which QP. In this patch the driver prepares its data structures to manage both the static and the dynamic allocations and let the user driver knows about the max value of dynamic blue-flame registers that are allowed. Downstream patches from this series will enable the dynamic allocation and the association as part of QP creation. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-28IB/mlx5: Report inner RSS capabilityMaor Gottlieb
Add missing inner RSS support capability as part of the RSS supported fields. In addition change MLX5_RX_HASH_INNER to 1UL << 31 in order to define it as unsigned. Fixes: 309fa3470fca ("IB/mlx5: Add support for RSS on the inner packet") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-28IB/mlx4: Add support to RSS hash for inner headersGuy Levi
Support RSS hash for inner headers according to a new flag, MLX4_IB_RX_HASH_INNER provided by the vendor channel. In case the flag is set, RSS hash will be done on the inner headers of VXLAN packets (which are encapsulated). Non-encapsulated packets will be hashed according to the outer headers. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27RDMA/vmw_pvrdma: Remove usage of BIT() from UAPI headerBryan Tan
BIT() should not be used in the UAPI header. Remove it. Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-27RDMA/vmw_pvrdma: Add UAR SRQ macros in ABI header fileBryan Tan
Support for SRQs were added in the vmw_pvrdma userlevel library before two necessary macros were added into the kernel ABI header file. Add the two UAR SRQ macros that are required by the userlevel library so that the library can rely on the kernel ABI header file for these SRQ macro definitions. Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-11-15Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is a fairly plain pull request. Lots of driver updates across the stack, a huge number of static analysis cleanups including a close to 50 patch series from Bart Van Assche, and a number of new features inside the stack such as general CQ moderation support. Nothing really stands out, but there might be a few conflicts as you take things in. In particular, the cleanups touched some of the same lines as the new timer_setup changes. Everything in this pull request has been through 0day and at least two days of linux-next (since Stephen doesn't necessarily flag new errors/warnings until day2). A few more items (about 30 patches) from Intel and Mellanox showed up on the list on Tuesday. I've excluded those from this pull request, and I'm sure some of them qualify as fixes suitable to send any time, but I still have to review them fully. If they contain mostly fixes and little or no new development, then I will probably send them through by the end of the week just to get them out of the way. There was a break in my acceptance of patches which coincides with the computer problems I had, and then when I got things mostly back under control I had a backlog of patches to process, which I did mostly last Friday and Monday. So there is a larger number of patches processed in that timeframe than I was striving for. Summary: - Add iWARP support to qedr driver - Lots of misc fixes across subsystem - Multiple update series to hns roce driver - Multiple update series to hfi1 driver - Updates to vnic driver - Add kref to wait struct in cxgb4 driver - Updates to i40iw driver - Mellanox shared pull request - timer_setup changes - massive cleanup series from Bart Van Assche - Two series of SRP/SRPT changes from Bart Van Assche - Core updates from Mellanox - i40iw updates - IPoIB updates - mlx5 updates - mlx4 updates - hns updates - bnxt_re fixes - PCI write padding support - Sparse/Smatch/warning cleanups/fixes - CQ moderation support - SRQ support in vmw_pvrdma" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (296 commits) RDMA/core: Rename kernel modify_cq to better describe its usage IB/mlx5: Add CQ moderation capability to query_device IB/mlx4: Add CQ moderation capability to query_device IB/uverbs: Add CQ moderation capability to query_device IB/mlx5: Exposing modify CQ callback to uverbs layer IB/mlx4: Exposing modify CQ callback to uverbs layer IB/uverbs: Allow CQ moderation with modify CQ iw_cxgb4: atomically flush the qp iw_cxgb4: only call the cq comp_handler when the cq is armed iw_cxgb4: Fix possible circular dependency locking warning RDMA/bnxt_re: report vlan_id and sl in qp1 recv completion IB/core: Only maintain real QPs in the security lists IB/ocrdma_hw: remove unnecessary code in ocrdma_mbx_dealloc_lkey RDMA/core: Make function rdma_copy_addr return void RDMA/vmw_pvrdma: Add shared receive queue support RDMA/core: avoid uninitialized variable warning in create_udata RDMA/bnxt_re: synchronize poll_cq and req_notify_cq verbs RDMA/bnxt_re: Flush CQ notification Work Queue before destroying QP RDMA/bnxt_re: Set QP state in case of response completion errors RDMA/bnxt_re: Add memory barriers when processing CQ/EQ entries ...
2017-11-13IB/uverbs: Add CQ moderation capability to query_deviceYonatan Cohen
The query_device function can now obtain the maximum values for cq_max_count and cq_period, needed for CQ moderation. cq_max_count is a 16 bits number that determines the number of CQEs to accumulate before generating an event. cq_period is a 16 bits number that determines the timeout in micro seconds from the last event generated, upon which a new event will be generated even if cq_max_count was not reached. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13IB/uverbs: Allow CQ moderation with modify CQYonatan Cohen
Uverbs support in modify_cq for CQ moderation only. Gives ability to change cq_max_count and cq_period. CQ moderation enhance performance by moderating the number of CQEs needed to create an event instead of application having to suffer from event per-CQE. To achieve CQ moderation the application needs to set cq_max_count and cq_period. cq_max_count - defines the number of CQEs needed to create an event. cq_period - defines the timeout (micro seconds) between last event and a new one that will occur even if cq_max_count was not satisfied Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13RDMA/vmw_pvrdma: Add shared receive queue supportBryan Tan
Add the required functions needed to support SRQs. Currently, kernel clients are not supported. SRQs will only be available in userspace. Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Reviewed-by: Nitish Bhat <bnitish@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-13IB/mlx5: Fix ABI alignment to 64 bitNoa Osherovich
Struct mlx5_ib_striding_rq_caps was not aligned to 64 bit as it should have been. Add a 32 bit reserved field. Fixes: b4f34597a5ce ('IB/mlx5: Expose multi-packet RQ capabilities') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-11-02License cleanup: add SPDX license identifier to uapi header files with a licenseGreg Kroah-Hartman
Many user space API headers have licensing information, which is either incomplete, badly formatted or just a shorthand for referring to the license under which the file is supposed to be. This makes it hard for compliance tools to determine the correct license. Update these files with an SPDX license identifier. The identifier was chosen based on the license information in the file. GPL/LGPL licensed headers get the matching GPL/LGPL SPDX license identifier with the added 'WITH Linux-syscall-note' exception, which is the officially assigned exception identifier for the kernel syscall exception: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work". This exception makes it possible to include GPL headers into non GPL code, without confusing license compliance tools. Headers which have either explicit dual licensing or are just licensed under a non GPL license are updated with the corresponding SPDX identifier and the GPLv2 with syscall exception identifier. The format is: ((GPL-2.0 WITH Linux-syscall-note) OR SPDX-ID-OF-OTHER-LICENSE) SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text. The update does not remove existing license information as this has to be done on a case by case basis and the copyright holders might have to be consulted. This will happen in a separate step. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX license identifier to uapi header files with no ↵Greg Kroah-Hartman
license Many user space API headers are missing licensing information, which makes it hard for compliance tools to determine the correct license. By default are files without license information under the default license of the kernel, which is GPLV2. Marking them GPLV2 would exclude them from being included in non GPLV2 code, which is obviously not intended. The user space API headers fall under the syscall exception which is in the kernels COPYING file: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work". otherwise syscall usage would not be possible. Update the files which contain no license information with an SPDX license identifier. The chosen identifier is 'GPL-2.0 WITH Linux-syscall-note' which is the officially assigned identifier for the Linux syscall exception. SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-25IB/mlx5: Add support for RSS on the inner packetMaor Gottlieb
Some user space application would like to do RSS on the inner packet fields instead on the outer. When MLX5_RX_HASH_INNER is set with one or more of the other hash fields, then the RSS will be done using the inner packet. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Add tunneling offloads supportMaor Gottlieb
The device can support receive Stateless Offloads for the inner packet's fields only when the packet is processed by TIR which is enabled to support tunneling. Otherwise, the device treats the packet as an ordinary non-tunneling packet and receive offloads can be done only for the outer packet's field. In order to enable receive Stateless Offloading support for incoming tunneling traffic the TIR should be created with tunneled_offload_en. Tunneling offloads is supported only be raw ethernet QP. This patch includes: * New QP creation flag for tunneling offloads. * Reports device capabilities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Support padded 128B CQE featureGuy Levi
In some benchmarks and some CPU architectures, writing the CQE on a full cache line size improves performance by saving memory access operations (read-modify-write) relative to partial cache line change. This patch lets the user to configure the device to pad the CQE up to 128B in case its content is less than 128B. Currently the driver supports only padding for a CQE size of 128B. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Support 128B CQE compression featureGuy Levi
In commit 1cbe6fc86ccf ("IB/mlx5: Add support for CQE compressing") the concept of CQE compression was introduced and added a support for 64B CQE size. This change update the code to support 128B CQE size as well. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Allow creation of a multi-packet RQNoa Osherovich
Allow creation of a multi-packet receive queue. In order to create a multi-packet RQ, the following fields in the mlx5_ib_rwq should be set: - log_num_strides: Log of number of strides per WQE - single_stride_log_num_of_bytes: Log of a single stride size - two_byte_shift_en: When enabled, hardware pads 2 bytes of zeros before writing the message to memory (e.g. for the IP alignment). Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Expose multi-packet RQ capabilitiesNoa Osherovich
This patch reports the device's striding RQ capabilities to the user-space: - min/max_single_stride_log_num_of_bytes: Log of min/max number of bytes in a single stride. - min/max_single_wqe_log_num_of_strides: Log of min/max number of strides in a single WQE. - supported_qpts: A bit mask to know which QP types support multi- packet RQ, for now only Raw Packet QPs. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-25IB/core: Fix typo in the name of the tag-matching cap structLeon Romanovsky
The tag matching functionality is implemented by mlx5 driver by extending XRQ, however this internal kernel information was exposed to user space applications with *xrq* name instead of *tm*. This patch renames *xrq* to *tm* to handle that. Fixes: 8d50505ada72 ("IB/uverbs: Expose XRQ capabilities") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31IB/core: Add completion queue (cq) object actionsMatan Barak
Adding CQ ioctl actions: 1. create_cq 2. destroy_cq This requires adding the following: 1. A specification describing the method a. Handler b. Attributes specification Each attribute is one of the following: a. PTR_IN - input data Note: This could be encoded inlined for data < 64bit b. PTR_OUT - response data c. IDR - idr based object d. FD - fd based object Blobs attributes (clauses a and b) contain their type, while objects specifications (clauses c and d) contains the expected object type (for example, the given id should be UVERBS_TYPE_PD) and the required access (READ, WRITE, NEW or DESTROY). If a NEW is required, the new object's id will be assigned to this attribute. All attributes could get UA_FLAGS attribute. Currently we support stating that an attribute is mandatory or that the specification size corresponds to a lower bound (and that this attribute could be extended). We currently add both default attributes and the two generic UHW_IN and UHW_OUT driver specific attributes. 2. Handler A handler gets a uverbs_attr_bundle. The handler developer uses uverbs_attr_get to fetch an attribute of a given id. Each of these attribute groups correspond to the specification group defined in the action (clauses 1.b and 1.c respectively). The indices of these arrays corresponds to the attribute ids declared in the specifications (clause 2). The handler is quite simple. It assumes the infrastructure fetched all objects and locked, created or destroyed them as required by the specification. Pointer (or blob) attributes were validated to match their required sizes. After the handler finished, the infrastructure commits or rollbacks the objects. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31IB/core: Add legacy driver's user-dataMatan Barak
In this phase, we don't want to change all the drivers to use flexible driver's specific attributes. Therefore, we add two default attributes: UHW_IN and UHW_OUT. These attributes are optional in some methods and they encode the driver specific command data. We add a function that extract this data and creates the legacy udata over it. Driver's data should start from UVERBS_UDATA_DRIVER_DATA_FLAG. This turns on the first bit of the namespace, indicating this attribute belongs to the driver's namespace. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31IB/core: Export ioctl enum types to user-spaceMatan Barak
Add a new ib_user_ioctl_verbs.h which exports all required ABI enums and structs to the user-space. Export the default types to user-space through this file. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31IB/core: Add new ioctl interfaceMatan Barak
In this ioctl interface, processing the command starts from properties of the command and fetching the appropriate user objects before calling the handler. Parsing and validation is done according to a specifier declared by the driver's code. In the driver, all supported objects are declared. These objects are separated to different object namepsaces. Dividing objects to namespaces is done at initialization by using the higher bits of the object ids. This initialization can mix objects declared in different places to one parsing tree using in this ioctl interface. For each object we list all supported methods. Similarly to objects, methods are separated to method namespaces too. Namespacing is done similarly to the objects case. This could be used in order to add methods to an existing object. Each method has a specific handler, which could be either a default handler or a driver specific handler. Along with the handler, a bunch of attributes are specified as well. Similarly to objects and method, attributes are namespaced and hashed by their ids at initialization too. All supported attributes are subject to automatic fetching and validation. These attributes include the command, response and the method's related objects' ids. When these entities (objects, methods and attributes) are used, the high bits of the entities ids are used in order to calculate the hash bucket index. Then, these high bits are masked out in order to have a zero based index. Since we use these high bits for both bucketing and namespacing, we get a compact representation and O(1) array access. This is mandatory for efficient dispatching. Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length. Attributes could be validated through some attributes, like: (*) Minimum size / Exact size (*) Fops for FD (*) Object type for IDR If an IDR/fd attribute is specified, the kernel also states the object type and the required access (NEW, WRITE, READ or DESTROY). All uobject/fd management is done automatically by the infrastructure, meaning - the infrastructure will fail concurrent commands that at least one of them requires concurrent access (WRITE/DESTROY), synchronize actions with device removals (dissociate context events) and take care of reference counting (increase/decrease) for concurrent actions invocation. The reference counts on the actual kernel objects shall be handled by the handlers. objects +--------+ | | | | methods +--------+ | | ns method method_spec +-----+ |len | +--------+ +------+[d]+-------+ +----------------+[d]+------------+ |attr1+-> |type | | object +> |method+-> | spec +-> + attr_buckets +-> |default_chain+--> +-----+ |idr_type| +--------+ +------+ |handler| | | +------------+ |attr2| |access | | | | | +-------+ +----------------+ |driver chain| +-----+ +--------+ | | | | +------------+ | | +------+ | | | | | | | | | | | | | | | | | | | | +--------+ [d] = Hash ids to groups using the high order bits The right types table is also chosen by using the high bits from the ids. Currently we have either default or driver specific groups. Once validation and object fetching (or creation) completed, we call the handler: int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile, struct uverbs_attr_bundle *ctx); ctx bundles attributes of different namespaces. Each element there is an array of attributes which corresponds to one namespaces of attributes. For example, in the usually used case: ctx core +----------------------------+ +------------+ | core: +---> | valid | +----------------------------+ | cmd_attr | | driver: | +------------+ |----------------------------+--+ | valid | | | cmd_attr | | +------------+ | | valid | | | obj_attr | | +------------+ | | drivers | +------------+ +> | valid | | cmd_attr | +------------+ | valid | | cmd_attr | +------------+ | valid | | obj_attr | +------------+ Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31RDMA/vmw_pvrdma: Report network header type in WCAditya Sarwade
We should report the network header type in the work completion so that the kernel can infer the right RoCE type headers. Reviewed-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Aditya Sarwade <asarwade@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-29IB/uverbs: Expose XRQ capabilitiesArtemy Kovalyov
Make XRQ capabilities available via ibv_query_device() verb. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-29IB/uverbs: Add XRQ creation parameter to UAPIArtemy Kovalyov
Add tm_list_size parameter to struct ib_uverbs_create_xsrq. If SRQ type is tag-matching this field defines maximum size of tag matching list. Otherwise, it is expected to be zero. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/mlx5: Report mlx5 enhanced multi packet WQE capabilityBodong Wang
Expose enhanced multi packet WQE capability to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/mlx5: Allow posting multi packet send WQEs if hardware supportsBodong Wang
Set the field to allow posting multi packet send WQEs if hardware supports this feature. This doesn't mean the send WQEs will be for multi packet unless the send WQE was prepared according to multi packet send WQE format. User space shall use flag MLX5_IB_ALLOW_MPW to check if hardware supports MPW and allows MPW in SQ context. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/mlx5: Expose software parsing for Raw Ethernet QPNoa Osherovich
Software parsing (SWP) is a feature that can be used to instruct the device to stop using its internal parser and to parse packets on the transmit path according to offsets set for each packets. Through this feature, the device allows the handling of checksum and LSO by the hardware according to the location of IP and TCP/UDP headers. Enable SW parsing on Raw Ethernet send queue by default if firmware supports it and report these capabilities to user space. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/mlx4: Remove redundant attribute in mlx4_ib_create_qp_rss structGuy Levi
rx_key_len is not in use and needs to be removed. Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP") Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24IB/mlx4: Fix struct mlx4_ib_create_wq alignmentGuy Levi
The mlx4 ABI defines to have structures with alignment of 64B. Fixes: 400b1ebcfe31 ("IB/mlx4: Add support for WQ related verbs") Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-24RDMA/mlx4: Fix create qp command alignmentMaor Gottlieb
Avoid extra padding by replacing the order of inl_recv_sz and reserved, otherwise 'mlx4_ib_create_qp' structure might be larger than legacy user input leading to copy of some garbage data from the user space buffer. Fixes: ea30b966f7dd ('IB/mlx4: Add inline-receive support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-10RDMA/netlink: Export node_typeLeon Romanovsky
Add ability to get node_type for RDAM netlink users. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netlink: Provide port state and physical link stateLeon Romanovsky
Add port state and physical link state to the users of RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netlink: Export LID mask control (LMC)Leon Romanovsky
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netink: Export lids and sm_lidsLeon Romanovsky
According to the IB specification, the LID and SM_LID are 16-bit wide, but to support OmniPath users, export it as 32-bit value from the beginning. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netlink: Advertise IB subnet prefixLeon Romanovsky
Add IB subnet prefix to the port properties exported by RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netlink: Export node_guid and sys_image_guidLeon Romanovsky
Add Node GUID and system image GUID to the device properties exported by RDMA netlink, to be used by RDMAtool. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-08-10RDMA/netlink: Export FW versionLeon Romanovsky
Add FW version to the device properties exported by RDMA netlink, to be used by RDMAtool. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>