summaryrefslogtreecommitdiff
path: root/include/linux/mlx5
AgeCommit message (Collapse)Author
2017-05-03Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "More exchaustive description of primary updates in this release: - Lots of driver fixes and misc fixes across the board. - I had to base on a net-next tree because the IPoIB Accelorator patches needed it. Unfortunately, it was known to Mellanox that there would need to be an IPoIB accelorator patch to the net tree (which left some functions turned off by an #ifdef construct to avoid warnings about defined but unused functions), then one to the RDMA tree, then a fixup that went back and re-enabled the functions in the net tree and enabled their use in the rdma tree Also, a sparse fix was sent to the net tree after I did my pull, and the fixup patch conflicts quite directly with that sparse fix, so I'm going to submit the fixup patch towards the end of the merge window by itself and based upon your master branch at the time. - Two separate rounds of hfi1 fixes, one that got dropped from last release because it came in just a day or two before the end of the merge window and then the one from this release cycle. Of note is that I now have a third series that just landed from Intel yesterday. It is not included in this pull request, but I may submit it by the end of the week. I'll talk to Intel about improving the timing of thier submissions for my workflow. - Changes to our idr usage in the RDMA subsystem that will tie into our cgroup management and also into the upcoming changes for the RDMA kernel<->userspace API. - Addition of support for a netdev to be tied to an RDMA device at the core level - Addition of the VNIC driver from Intel. While IPoIB provides IP over InfiniBand (and *only* IP, no lower layer protocol headers are allowed or supported), the VNIC driver presents a virtual Ethernet device with support for things like varying Ethertypes, VLANs, priorities and other features of Ethernet. The virtual devices are centrally managed by the OPA fabric manager, making this (for the time being) a strictly OPA specific feature. - Improvements to the On-Demand Paging support in the RDMA subsystem. - Addition of three significant OPA changes. While we added OPA support some time ago (via the hfi1 driver), the RDMA subsystem has so far glossed over the areas where OPA and InfiniBand differ. With this release we are starting to add support for the OPA extensions into the RDMA core in the following area: Extended port information for OPA is now supported, extended Address Handle attributes for OPA are now supported, and extended SA Queries to get OPA specific subnet information is now supported. Concise summary from the tag: - idr usage and locking changes - build fix for hns - ipoib debug path record file fix - hfi1 updates - core RDMA netdev addition - Intel VNIC driver addition - Enhanced accelerators for IPoIB addition - Debug cleanups in cxgb3/4 - Trivial cleanups from SF Markus Elfring - Misc rxe fixes from Mellanox - Misc ipoib fixes from Mellanox - Lots of mlx4/mlx5 changes from Mellanox - Misc fixes across the RDMA subsystem - ODP paging fixes and improvements - qedr updates - hfi1 updates - OPA port info patches - OPA AH patches - OPA SA Query patches" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (191 commits) infiniband: avoid dereferencing uninitialized dst on error path IB/SA: Add OPA addr header IB/mlx5: Add port_xmit_wait to counter registers read IB/ocrdma: fix out of bounds access to local buffer IB/mlx4: Fix incorrect order of formal and actual parameters IB/mlx4: Change flush logic so it adheres to the variable name mlx5: Fix mlx5_ib_map_mr_sg mr length IB/rxe: Don't clamp residual length to mtu IB/SA: Add support to query OPA path records IB/SA: Add OPA path record type IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields IB/SA: Introduce path record specific types IB/SA: Rename ib_sa_path_rec to sa_path_rec IB/CM: Add braces when using sizeof IB/core: Define 'opa' rdma_ah_attr type IB/core: Define 'ib' and 'roce' rdma_ah_attr types IB/core: Use rdma_ah_attr accessor functions IB/core: Add accessor functions for rdma_ah_attr fields IB/PVRDMA: Rename ib_ah_attr related functions IB/mthca: Rename to_ib_ah_attr to to_rdma_ah_attr ...
2017-05-01IB/mlx5: Add port_xmit_wait to counter registers readTim Wright
Add port_xmit_wait to the error counters read by mlx5_ib_process_mad to ensure sysfs port counter provides correct value for PortXmitWait. Otherwise the sysfs port_xmit_wait file always contains zero. The previous MAD_IFC implementation populated this counter, but it was removed during the migration to PPCNT for error counters (32-bit only). Signed-off-by: Tim Wright <tim@binbash.co.uk> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-30net/mlx5e: Update neighbour 'used' state using HW flow rules countersHadar Hen Zion
When IP tunnel encapsulation rules are offloaded, the kernel can't see the traffic of the offloaded flow. The neighbour for the IP tunnel destination of the offloaded flow can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To make sure that a neighbour which is used by the HW won't become STALE, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME period, when packets were matched and counted by the HW for one of the tunnel encap flows related to this neighbour. The periodic task that updates the used neighbours is scheduled when a tunnel encap rule is successfully offloaded into HW and keeps re-scheduling itself as long as the representor's neighbours list isn't empty. Add, remove, lookup and status change operations done over the representor's neighbours list or the neighbour hash entry encaps list are all serialized by RTNL lock. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-21IB/mlx5: Support congestion related countersParav Pandit
This patch adds support to query the congestion related hardware counters through new command and links them with other hw counters being available in hw_counters sysfs location. In order to reuse existing infrastructure it renames related q_counter data structures to more generic counters to reflect q_counters and congestion counters and maybe some other counters in the future. New hardware counters: * rp_cnp_handled - CNP packets handled by the reaction point * rp_cnp_ignored - CNP packets ignored by the reaction point * np_cnp_sent - CNP packets sent by notification point to respond to CE marked RoCE packets * np_ecn_marked_roce_packets - CE marked RoCE packets received by notification point It also avoids returning ENOSYS which is specific for invalid system call and produces the following checkpatch.pl warning. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + return -ENOSYS; Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21IB/mlx5: Use IP version matching to classify IP trafficAriel Levkovich
This change adds the ability for flow steering to classify IPv4/6 packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classifed steering rules. When user added a flow rule with IP classification, driver was implicitly adding ethertype matching to the created rule in order to distinguish between IPv4 and IPv6 protocols. Since IP packets with MPLS tag header have MPLS ethertype, they missed the rule and ended up hitting the default filters. Such behavior prevented from MPLS packets to undergo inbound traffic load balancing flows (if such were defined by configuring RSS) to achieve higher throughput - the way that non-MPLS IP packets performed. Since our device is able to look past the MPLS tag and identify the next protocol we introduce this solution which replaces Ethertype matching by the device's capability to perform IP version parsing and matching in order to distinguish between IPv4 and IPv6. Therefore, whenever a flow with IP spec is added and device support IP version matching, driver will implicitly add IP version matching to the rule (Based on the IP spec type) without Ethertype matching which will cause relevant MPLS tagged packets to hit this rule as well. Otherwise (device doesn't support IP version matching), we fall back to setting Ethertype matching. If the user's filters specify an L2 ethertype and an IP spec the rule will then match both the ethertype and the IP version. The device's support for IP version matching is reported by the device via dedicated capability bit in query_device_cap and named outer/inner_ip_version. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-17net/mlx5e: IPoIB, Xmit flowSaeed Mahameed
Implement mlx5e's IPoIB SKB transmit using the helper functions provided by mlx5e ethernet tx flow, the only difference in the code between mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill (UD datagram segment) in the TX descriptor (WQE) and it doesn't need to have any vlan handling. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net/mlx5: Refactor create flow table method to accept underlay QPErez Shitrit
IB flow tables need the underlay qp to perform flow steering. Here we change the API of the flow tables creation to accept the underlay QP number as a parameter in order to support IB (IPoIB) flow steering. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net/mlx5: Add IPoIB enhanced offloads bits to mlx5_ifcErez Shitrit
New capability bit: ipoib_enhanced_offloads, indicates new ability for UD QP to do RSS and enhanced IPoIB offloads and acceleration. Add underlay_qpn to the TIS and flow_table objects In order to support SET_ROOT command, to connect between IPoIB QPs and flow steering tables. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07net/mlx5e: Add support for RXFCS feature flagGuy Ergas
Add support for rx-fcs flag from ethtool. In case this flag is set, update all RQs to scatter the FCS data into the packet. Signed-off-by: Guy Ergas <guye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28net/mlx5: Introduce modify header structures, commands and steering action ↵Or Gerlitz
definitions Add the definitions related to creation/deletion of a modify header context and the modify header steering action which are used for HW packet header modify (re-write) as part of steering. Add as well the modify header id into two intermediate structs and set it to the FTE. Note that as the push/pop vlan steering actions are emulated by the ewitch management code, we're not breaking any compatibility while changing their values to make room for the modify header action which is not emulated and whose value is part of the FW API. The new bit values for the emulated actions are at the end of the possible range. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-28net/mlx5: Add helper to initialize a flow steering actions struct instanceOr Gerlitz
There are bunch of places in the code where the intermediate struct that keeps the elements related to flow actions is initialized with the same default values. Put that into a small DECLARE type helper. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-03-24net/mlx5e: Single bfreg (UAR) for all mlx5e SQs and netdevsSaeed Mahameed
One is sufficient since Blue Flame is not supported anymore. This will also come in handy for switchdev mode to save resources, since VF representors will use same single UAR as well for their own SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull Mellanox rdma updates from Doug Ledford: "Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits) IB/mlx5: Fix configuration of port capabilities IB/mlx4: Take source GID by index from HW GID table IB/mlx5: Fix blue flame buffer size calculation IB/mlx4: Remove unused variable from function declaration IB: Query ports via the core instead of direct into the driver IB: Add protocol for USNIC IB/mlx4: Support raw packet protocol IB/mlx5: Support raw packet protocol IB/core: Add raw packet protocol IB/mlx5: Add implicit MR support IB/mlx5: Expose MR cache for mlx5_ib IB/mlx5: Add null_mkey access IB/umem: Indicate that process is being terminated IB/umem: Update on demand page (ODP) support IB/core: Add implicit MR flag IB/mlx5: Support creation of a WQ with scatter FCS offload IB/mlx5: Enable QP creation with cvlan offload IB/mlx5: Enable WQ creation and modification with cvlan offload IB/mlx5: Expose vlan offloads capabilities IB/uverbs: Enable QP creation with cvlan offload ...
2017-02-14IB/mlx5: Add implicit MR supportArtemy Kovalyov
Add implicit MR, covering entire user address space. The MR is implemented as an indirect KSM MR consisting of 1GB direct MRs. Pages and direct MRs are added/removed to MR by ODP. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-14IB/mlx5: Expose MR cache for mlx5_ibArtemy Kovalyov
Allow other parts of mlx5_ib to use MR cache mechanism. * Add new functions mlx5_mr_cache_alloc and mlx5_mr_cache_free * Traditional MTT MKey buckets are limited by MAX_UMR_CACHE_ENTRY Additinal buckets may be added above. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-14IB/mlx5: Add port counter support for Receive WQsMajd Dibbiny
Counters weren't updated due to Receive WQs' traffic since the counter-id was not associated with the RQ. Added support for associating the q-counter-id with the Receive WQ. The attachment is done only when changing WQ's state from RESET to READY in modify-WQ command. FW support is required for the above, without this support Receive WQ counters will not count. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-14IB/mlx5: Add additional checks before processing MADsMaor Gottlieb
Check the has_smi bit in vport context and class version of MADs before allowing MADs processing to take place. MAD_IFC SMI commands can be executed only if smi bit is set. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Parvi Kaustubhi <parvik@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-06net/mlx5: TX WQE updateSaeed Mahameed
Add new TX WQE fields for Connect-X5 vlan insertion support, type and vlan_tci, when type = MLX5_ETH_WQE_INSERT_VLAN the HW will insert the vlan and prio fields (vlan_tci) to the packet. Those bits and the inline header fields are mutually exclusive, and valid only when: MLX5_CAP_ETH(mdev, wqe_inline_mode) == MLX5_CAP_INLINE_MODE_NOT_REQUIRED and MLX5_CAP_ETH(mdev, wqe_vlan_insert), who will be set in ConnectX-5 and later HW generations. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2017-02-06net/mlx5: Configure cache line size for start and end paddingDaniel Jurgens
There is a hardware feature that will pad the start or end of a DMA to be cache line aligned to avoid RMWs on the last cache line. The default cache line size setting for this feature is 64B. This change configures the hardware to use 128B alignment on systems with 128B cache lines. In addition we lower bound MPWRQ stride by HCA cacheline in mlx5e, MPWRQ stride should be at least the HCA cacheline, the current default is 64B and in case HCA_CAP.cach_line_128byte capability is set, MPWRQ RX stride will automatically be aligned to 128B. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-02-06net/mlx5: Fix static checker warningsOr Gerlitz
For some reason, sparse doesn't like using an expression of type (!x) with a bitwise | and &. In order to mitigate that, we use a local variable. This removes the following sparse complaints on the core driver (and similar ones on the IB driver too): drivers/net/ethernet/mellanox/mlx5/core/srq.c:83:9: warning: dubious: !x & y drivers/net/ethernet/mellanox/mlx5/core/srq.c:96:9: warning: dubious: !x & y drivers/net/ethernet/mellanox/mlx5/core/port.c:59:9: warning: dubious: !x & y drivers/net/ethernet/mellanox/mlx5/core/vport.c:561:9: warning: dubious: !x & y Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-24net/mlx5: Push min-inline mode resolution helper into the coreOr Gerlitz
So we can use that from the IB driver too in downstream patches. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-24net/mlx5: Add support for setting VF min rateMohamad Haj Yahia
Add support for SRIOV VF min rate guarantee by using the TSAR BW share weights mechanism. The TSAR BW share vport attribute represents the weight of that vport among the other vports weights which means that the actual vport BW percentage is the same vport weight percentage among the total vports weights sum. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Move cached hca caps to designated caps structGal Pressman
The caps structure consists of hca caps and port/management caps, all under one roof. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Add MPCNT register infrastructureGal Pressman
Add the needed infrastructure for future use of MPCNT register. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Add PPCNT physical layer statistical group infrastructureGal Pressman
Add the needed infrastructure for future use of PPCNT physical layer statistical group. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2017-01-19net/mlx5: Query and cache PCAM, MCAM registers on initializationGal Pressman
On load_one, we now cache our capabilities registers internally, similar to QUERY_HCA_CAP. Capabilities can later be queried using macros introduced in this patch. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Expose PCAM, MCAM registers infrastructureGal Pressman
PCAM: Ports capabilities mask register. MCAM: Management capabilities mask register. PCAM and MCAM registers will provide information regarding firmware support for different features, in order to avoid cases where new driver combined with old firmware results in syndromes (for ex. PCIe counters before this patchset). Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Add support to s-tag in mlx5 firmware interfaceMohamad Haj Yahia
Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry match param. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2017-01-19net/mlx5: Add MTPPS and MTPPSE registers infrastructureEugenia Emantayev
Implement query and set functionality for MTPPS and MTPPSE registers. MTPPS (Management Pulse Per Second) provides the device PPS capabilities, configures the PPS in and out modules and holds the PPS in time stamp. Query MTPPS is supported only when HCA_CAP.pps is set and modify is supported when HCA_CAP.pps_modify is set. MTPPSE (Management Pulse Per Second Event) configures the different event generation modes for PPS. Supported when HCA_CAP.pps is set. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09Merge tag 'mlx5-4kuar-for-4.11' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5 4K UAR The following series of patches optimizes the usage of the UAR area which is contained within the BAR 0-1. Previous versions of the firmware and the driver assumed each system page contains a single UAR. This patch set will query the firmware for a new capability that if published, means that the firmware can support UARs of fixed 4K regardless of system page size. In the case of powerpc, where page size equals 64KB, this means we can utilize 16 UARs per system page. Since user space processes by default consume eight UARs per context this means that with this change a process will need a single system page to fulfill that requirement and in fact make use of more UARs which is better in terms of performance. In addition to optimizing user-space processes, we introduce an allocator that can be used by kernel consumers to allocate blue flame registers (which are areas within a UAR that are used to write doorbells). This provides further optimization on using the UAR area since the Ethernet driver makes use of a single blue flame register per system page and now it will use two blue flame registers per 4K. The series also makes changes to naming conventions and now the terms used in the driver code match the terms used in the PRM (programmers reference manual). Thus, what used to be called UUAR (micro UAR) is now called BFREG (blue flame register). In order to support compatibility between different versions of library/driver/firmware, the library has now means to notify the kernel driver that it supports the new scheme and the kernel can notify the library if it supports this extension. So mixed versions of libraries can run concurrently without any issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09IB/mlx5: Support 4k UAR for libmlx5Eli Cohen
Add fields to structs to convey to kernel an indication whether the library supports multi UARs per page and return to the library the size of a UAR based on the queried value. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09IB/mlx5: Allow future extension of libmlx5 input dataEli Cohen
Current check requests that new fields in struct mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero. This was introduced so new libraries passing additional information to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 will be notified by old kernels that do not support their request by failing the operation. This schecme is problematic since it requires libmlx5 to issue the requests with descending input size for struct mlx5_ib_alloc_ucontext_req_v2. To avoid this, we require that new features that will obey the following rules: If the feature requires one or more fields in the response and the at least one of the fields can be encoded such that a zero value means the kernel ignored the request then this field will provide the indication to the library. If no response is required or if zero is a valid response, a new field should be added that indicates to the library whether its request was processed. Fixes: b368d7cb8ceb ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09IB/mlx5: Use blue flame register allocator in mlx5_ibEli Cohen
Make use of the blue flame registers allocator at mlx5_ib. Since blue flame was not really supported we remove all the code that is related to blue flame and we let all consumers to use the same blue flame register. Once blue flame is supported we will add the code. As part of this patch we also move the definition of struct mlx5_bf to mlx5_ib.h as it is only used by mlx5_ib. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09net/mlx5: Add interface to get reference to a UAREli Cohen
A reference to a UAR is required to generate CQ or EQ doorbells. Since CQ or EQ doorbells can all be generated using the same UAR area without any effect on performance, we are just getting a reference to any available UAR, If one is not available we allocate it but we don't waste the blue flame registers it can provide and we will use them for subsequent allocations. We get a reference to such UAR and put in mlx5_priv so any kernel consumer can make use of it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-08net/mlx5: Introduce blue flame register allocatorEli Cohen
Here is an implementation of an allocator that allocates blue flame registers. A blue flame register is used for generating send doorbells. A blue flame register can be used to generate either a regular doorbell or a blue flame doorbell where the data to be sent is written to the device's I/O memory hence saving the need to read the data from memory. For blue flame kind of doorbells to succeed, the blue flame register need to be mapped as write combining. The user can specify what kind of send doorbells she wishes to use. If she requested write combining mapping but that failed, the allocator will fall back to non write combining mapping and will indicate that to the user. Subsequent patches in this series will make use of this allocator. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-08mlx5: Fix naming convention with respect to UARsEli Cohen
This establishes a solid naming conventions for UARs. A UAR (User Access Region) can have size identical to a system page or can be fixed 4KB depending on a value queried by firmware. Each UAR always has 4 blue flame register which are used to post doorbell to send queue. In addition, a UAR has section used for posting doorbells to CQs or EQs. In this patch we change names to reflect this conventions. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-01-02IB/mlx5: Improve MR checkArtemy Kovalyov
Add "type" field to mlx5_core MKEY struct. Check whether page fault happens on MKEY corresponding to MR. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02IB/mlx5: Add ODP atomics supportArtemy Kovalyov
Handle ODP atomic operations. When initiator of RDMA atomic operation use ODP MR to provide source data handle pagefault properly. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02{net,IB}/mlx5: Refactor page fault handlingArtemy Kovalyov
* Update page fault event according to last specification. * Separate code path for page fault EQ, completion EQ and async EQ. * Move page fault handling work queue from mlx5_ib static variable into mlx5_core page fault EQ. * Allocate memory to store ODP event dynamically as the events arrive, since in atomic context - use mempool. * Make mlx5_ib page fault handler run in process context. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02net/mlx5: Update PAGE_FAULT_RESUME layoutArtemy Kovalyov
Update PAGE_FAULT_RESUME command layout. Three bit fields describing page fault: rdma, rdma_write, req_res gave 8 possible combinations, while only a few were legal. Now they are interpreted as three-bit type field, where former legal combinations turns into corresponding types and unused were added as new types. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02IB/mlx5: Add MR cache for large UMR regionsArtemy Kovalyov
In this change we turn mlx5_ib_update_mtt() into generic mlx5_ib_update_xlt() to perfrom HCA translation table modifiactions supporting both atomic and process contexts and not limited by number of modified entries. Using this function we increase preallocated MRs up to 16GB. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02IB/mlx5: Refactor UMR post send formatArtemy Kovalyov
* Update struct mlx5_wqe_umr_ctrl_seg. * Currenlty UMR send_flags aim only certain use cases: enabled/disable cached MR, modifying XLT for ODP. By making flags independent make UMR more flexible allowing arbitrary manipulations. * Since different UMR formats have different entry sizes UMR request should receive exact size of translation table update instead of number of entries. Rename field npages to xlt_size in struct mlx5_umr_wr and update relevant code accordingly. * Add support of length64 bit. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02net/mlx5: Support new MR featuresArtemy Kovalyov
This patch adds the following items to IFC file. 1. MLX5_MKC_ACCESS_MODE_KSM enum value for creating KSM memory keys. KSM access mode used when indirect MKey associated with fixed memory size entries. 2. null_mkey field that is used to indicate non-present KLM/KSM entries, where it causes the device to generate page fault event when trying to access it. 3. struct mlx5_ifc_cmd_hca_cap_bits capability bits indicating related value/field is supported: * fixed_buffer_size - MLX5_MKC_ACCESS_MODE_KSM * umr_extended_translation_offset - translation_offset_42_16 in UMR ctrl segment * null_mkey - null_mkey in QUERY_SPECIAL_CONTEXTS Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-02net/mlx5: Fix offset naming for reserved fields in hca_cap_bitsMax Gurtovoy
Fix offset for reserved fields. Fixes: 7486216b3a0b ("{net,IB}/mlx5: mlx5_ifc updates") Fixes: b4ff3a36d3e4 ("net/mlx5: Use offset based reserved field names in the IFC header file") Fixes: 7d5e14237a55 ("net/mlx5: Update mlx5_ifc hardware features") Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-28Revert "net/mlx5: Add MPCNT register infrastructure"Gal Pressman
This reverts commit 7f503169cabd70c1f13b9279c50eca7dfb9a7d51. Fixes: 7f503169cabd ("net/mlx5: Add MPCNT register infrastructure") Signed-off-by: Gal Pressman <galp@mellanox.com> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-15Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is the complete update for the rdma stack for this release cycle. Most of it is typical driver and core updates, but there is the entirely new VMWare pvrdma driver. You may have noticed that there were changes in DaveM's pull request to the bnxt Ethernet driver to support a RoCE RDMA driver. The bnxt_re driver was tentatively set to be pulled in this release cycle, but it simply wasn't ready in time and was dropped (a few review comments still to address, and some multi-arch build issues like prefetch() not working across all arches). Summary: - shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - debug cleanups - new connection rejection helpers - SRP updates - various misc fixes - new paravirt driver from vmware" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits) IB: Add vmw_pvrdma driver IB/mlx4: fix improper return value IB/ocrdma: fix bad initialization infiniband: nes: return value of skb_linearize should be handled MAINTAINERS: Update Intel RDMA RNIC driver maintainers MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers IB/core: fix unmap_sg argument qede: fix general protection fault may occur on probe IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc mlx5, calc_sq_size(): Make a debug message more informative mlx5: Remove a set-but-not-used variable mlx5: Use { } instead of { 0 } to init struct IB/srp: Make writing the add_target sysfs attr interruptible IB/srp: Make mapping failures easier to debug IB/srp: Make login failures easier to debug IB/srp: Introduce a local variable in srp_add_one() IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build IB/multicast: Check ib_find_pkey() return value IPoIB: Avoid reading an uninitialized member variable IB/mad: Fix an array index check ...
2016-12-13net/mlx5: Report multi packet WQE capabilitiesLeon Romanovsky
Multi packet WQE enables sending multiple fix sized packets using a single WQE. The exposed field reports such HW support. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-02net/mlx5e: Implement Fragmented Work Queue (WQ)Tariq Toukan
Add new type of struct mlx5_frag_buf which is used to allocate fragmented buffers rather than contiguous, and make the Completion Queues (CQs) use it as they are big (default of 2MB per CQ in Striding RQ). This fixes the failures of type: "mlx5e_open_locked: mlx5e_open_channels failed, -12" due to dma_zalloc_coherent insufficient contiguous coherent memory to satisfy the driver's request when the user tries to setup more or larger rings. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28net/mlx5: Add DCBX firmware commands supportHuy Nguyen
Add set/query commands for DCBX_PARAM register Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>