summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-07-25nvme-fc: revise TRADDR parsingJames Smart
The FC-NVME spec hasn't locked down on the format string for TRADDR. Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>" where the wwn's are hex values but not prefixed by 0x. Most implementations so far expect a string format of "nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport uses the match_u64 parser which requires a leading 0x prefix to set the base properly. If it's not there, a match will either fail or return a base 10 value. The resolution in T11 is pushing out. Therefore, to fix things now and to cover any eventuality and any implementations already in the field, this patch adds support for both formats. The change consists of replacing the token matching routine with a routine that validates the fixed string format, and then builds a local copy of the hex name with a 0x prefix before calling the system parser. Note: the same parser routine exists in both the initiator and target transports. Given this is about the only "shared" item, we chose to replicate rather than create an interdendency on some shared code. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-25nvme: fabrics commands should use the fctype field for data directionJon Derrick
Fabrics commands with opcode 0x7F use the fctype field to indicate data direction. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Sagi Grimberg <sai@grmberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
2017-07-25x86/asm: Make objtool unreachable macros independent from GCC versionJosh Poimboeuf
The ASM_UNREACHABLE macro isn't GCC version-specific, so move it outside the GCC 4.5+ check. Otherwise the 0-day robot will report objtool warnings for uses of ASM_UNREACHABLE with GCC 4.4. Also move the annotate_unreachable() macro so the related macros can stay together. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: aa5d1b81500e ("x86/asm: Add ASM_UNREACHABLE") Link: http://lkml.kernel.org/r/fb18337dbf230fd36450d9faf19a2b2533dbcba1.1500993873.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25KVM: arm/arm64: PMU: Fix overflow interrupt injectionAndrew Jones
kvm_pmu_overflow_set() is called from perf's interrupt handler, making the call of kvm_vgic_inject_irq() from it introduced with "KVM: arm/arm64: PMU: remove request-less vcpu kick" a really bad idea, as it's quite easy to try and retake a lock that the interrupted context is already holding. The fix is to use a vcpu kick, leaving the interrupt injection to kvm_pmu_sync_hwstate(), like it was doing before the refactoring. We don't just revert, though, because before the kick was request-less, leaving the vcpu exposed to the request-less vcpu kick race, and also because the kick was used unnecessarily from register access handlers. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-07-25x86/asm: Add ASM_UNREACHABLEKees Cook
This creates an unreachable annotation in asm for CONFIG_STACK_VALIDATION=y. While here, adjust earlier uses of \t\n into \n\t. Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1500921349-10803-3-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25sched/wait: Clean up some documentation warningsJonathan Corbet
A couple of kerneldoc comments in <linux/wait.h> had incorrect names for macro parameters, with this unsightly result: ./include/linux/wait.h:555: warning: No description found for parameter 'wq' ./include/linux/wait.h:555: warning: Excess function parameter 'wq_head' description in 'wait_event_interruptible_hrtimeout' ./include/linux/wait.h:759: warning: No description found for parameter 'wq_head' ./include/linux/wait.h:759: warning: Excess function parameter 'wq' description in 'wait_event_killable' Correct the comments and kill the warnings. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/20170724135800.769c4042@lwn.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24tcp: remove redundant argument from tcp_rcv_established()Matvejchikov Ilya
The last (4th) argument of tcp_rcv_established() is redundant as it always equals to skb->len and the skb itself is always passed as 2th agrument. There is no reason to have it. Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24Merge tag 'rxrpc-rewrite-20170721' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Rearrange headers Here's a pair of patches that rearrange some of the AF_RXRPC header files that are outside of the net/rxrpc/ directory: (1) The bits userspace need are moved to uapi/linux/rxrpc.h. [Should this be af_rxrpc.h instead, I wonder - but there doesn't seem to be precedent for that in the other net UAPI headers.] (2) For the most part, the contents of rxrpc/packet.h are no longer used outside of the AF_RXRPC module, so move them to net/rxrpc/protocol.h with the exception of the standard abort codes which are exposed to userspace when an abort occurs and the security index values which are needed when constructing keys. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24rcutorture: Place event-traced strings into trace bufferPaul E. McKenney
Strings used in event tracing need to be specially handled, for example, being copied to the trace buffer instead of being pointed to by the trace buffer. Although the TPS() macro can be used to "launder" pointed-to strings, this might not be all that effective within a loadable module. This commit therefore copies rcutorture's strings to the trace buffer. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2017-07-24rcutorture: Move SRCU status printing to SRCU implementationsPaul E. McKenney
This commit gets rid of some ugly #ifdefs in rcutorture.c by moving the SRCU status printing to the SRCU implementations. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-07-24srcu: Make process_srcu() be staticPaul E. McKenney
The function process_srcu() is not invoked outside of srcutree.c, so this commit makes it static and drops the EXPORT_SYMBOL_GPL(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-07-24srcu: Move rcu_scheduler_starting() from Tiny RCU to Tiny SRCUPaul E. McKenney
Other than lockdep support, Tiny RCU has no need for the scheduler status. However, Tiny SRCU will need this to control boot-time behavior independent of lockdep. Therefore, this commit moves rcu_scheduler_starting() from kernel/rcu/tiny_plugin.h to kernel/rcu/srcutiny.c. This in turn allows the complete removal of kernel/rcu/tiny_plugin.h. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-07-24init_task: Remove redundant INIT_TASK_RCU_TREE_PREEMPT() macroPaul E. McKenney
Back in the dim distant past, the task_struct structure's RCU-related fields optionally included those needed for CONFIG_RCU_BOOST, even in CONFIG_PREEMPT_RCU builds. The INIT_TASK_RCU_TREE_PREEMPT() macro was used to provide initializers for those optional CONFIG_RCU_BOOST fields. However, the CONFIG_RCU_BOOST fields are now included unconditionally in CONFIG_PREEMPT_RCU builds, so there is no longer any need fro the INIT_TASK_RCU_TREE_PREEMPT() macro. This commit therefore removes it in favor of initializing the ->rcu_blocked_node field directly in the INIT_TASK_RCU_PREEMPT() macro. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-07-24sctp: remove the typedef sctp_abort_chunk_tXin Long
This patch is to remove the typedef sctp_abort_chunk_t, and replace with struct sctp_abort_chunk in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_heartbeat_chunk_tXin Long
This patch is to remove the typedef sctp_heartbeat_chunk_t, and replace with struct sctp_heartbeat_chunk in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_heartbeathdr_tXin Long
This patch is to remove the typedef sctp_heartbeathdr_t, and replace with struct sctp_heartbeathdr in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_sack_chunk_tXin Long
This patch is to remove the typedef sctp_sack_chunk_t, and replace with struct sctp_sack_chunk in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_sackhdr_tXin Long
This patch is to remove the typedef sctp_sackhdr_t, and replace with struct sctp_sackhdr in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_sack_variable_tXin Long
This patch is to remove the typedef sctp_sack_variable_t, and replace with union sctp_sack_variable in the places where it's using this typedef. It is also to fix some indents in sctp_acked(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_dup_tsn_tXin Long
This patch is to remove the typedef sctp_dup_tsn_t, and replace with __be32 in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_gap_ack_block_tXin Long
This patch is to remove the typedef sctp_gap_ack_block_t, and replace with struct sctp_gap_ack_block in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_unrecognized_param_tXin Long
This patch is to remove the typedef sctp_unrecognized_param_t, and replace with struct sctp_unrecognized_param in the places where it's using this typedef. It is also to fix some indents in sctp_sf_do_unexpected_init() and sctp_sf_do_5_1B_init(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_cookie_param_tXin Long
This patch is to remove the typedef sctp_cookie_param_t, and replace with struct sctp_cookie_param in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24sctp: remove the typedef sctp_initack_chunk_tXin Long
This patch is to remove the typedef sctp_initack_chunk_t, and replace with struct sctp_initack_chunk in the places where it's using this typedef. It is also to use sizeof(variable) instead of sizeof(type). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()Rafael J. Wysocki
Put the device interrupts disabling and enabling as well as cpuidle_pause() and cpuidle_resume() called during the "noirq" stages of system suspend into separate functions to allow the core suspend-to-idle code to be optimized (later). The only functional difference this makes is that debug facilities and diagnostic tools will not include the above operations into the "noirq" device suspend/resume duration measurements. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24PM / Domains: Add time accounting to various genpd statesThara Gopinath
This patch adds support to calculate the time spent by the generic power domains in on and various idle states. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24net: add infrastructure to un-offload UDP tunnel portSabrina Dubroca
This adds a new NETDEV_UDP_TUNNEL_DROP_INFO event, similar to NETDEV_UDP_TUNNEL_PUSH_INFO, to signal to un-offload ports. This also adds udp_tunnel_drop_rx_port(), which calls ndo_udp_tunnel_del. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24net: add new netdevice feature for offload of RX port for UDP tunnelsSabrina Dubroca
This adds a new netdevice feature, so that the offloading of RX port for UDP tunnels can be disabled by the administrator on some netdevices, using the "rx-udp_tunnel-port-offload" feature in ethtool. This feature is set for all devices that provide ndo_udp_tunnel_add. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24ACPI / boot: Correct address space of __acpi_map_table()Andy Shevchenko
Sparse complains about wrong address space used in __acpi_map_table() and in __acpi_unmap_table(). arch/x86/kernel/acpi/boot.c:127:29: warning: incorrect type in return expression (different address spaces) arch/x86/kernel/acpi/boot.c:127:29: expected char * arch/x86/kernel/acpi/boot.c:127:29: got void [noderef] <asn:2>* arch/x86/kernel/acpi/boot.c:135:23: warning: incorrect type in argument 1 (different address spaces) arch/x86/kernel/acpi/boot.c:135:23: expected void [noderef] <asn:2>*addr arch/x86/kernel/acpi/boot.c:135:23: got char *map Correct address space to be in align of type of returned and passed parameter. Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24ACPI: NUMA: add missing include in acpi_numa.hRoss Zwisler
Right now if a file includes acpi_numa.h and they don't happen to include linux/numa.h before it, they get the following warning: ./include/acpi/acpi_numa.h:9:5: warning: "MAX_NUMNODES" is not defined [-Wundef] #if MAX_NUMNODES > 256 ^~~~~~~~~~~~ Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24uuid: remove uuid_beChristoph Hellwig
Everything uses uuid_t now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-07-24IB/mlx4: Add support for RSS QPGuy Levi
Add support to work with a RSS QP by using an indirection table object upon QP creation. Other related QP verbs (e.g. modify/destroy/query) were updated as well for that QP mode. Notes: - The RX hash properties are supplied as driver private data. - The RSS QP port is used on the associated WQs in its indirection table. Applying different ports during WQ life time is not allowed. - The expected RSS QP flow is: create, modify(RST->INIT), modify(RST->RTR), destroy. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add support for WQ indirection table related verbsGuy Levi
To enable RSS functionality the IB indirection table object (i.e. ib_rwq_ind_table) should be used. This patch implements the related verbs as of create and destroy an indirection table. In downstream patches the indirection table will be used as part of RSS QP creation. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add support for WQ related verbsGuy Levi
Support create/modify/destroy WQ related verbs. The base IB object to enable RSS functionality is a WQ (i.e. ib_wq). This patch implements the related WQ verbs as of create, modify and destroy. In downstream patches the WQ will be used as part of an indirection table (i.e. ib_rwq_ind_table) to enable RSS QP creation. Notes: ConnectX-3 hardware requires consecutive WQNs list as receive descriptor queues for the RSS QP. Hence, the driver manages consecutive ranges lists per context which the user must respect. Destroying the WQ does not return its WQN back to its range for reusing. However, destroying all WQs from the same range releases the range and in turn releases its WQNs for reusing. Since the WQ object is not a natural object in the hardware, the driver implements the WQ by the hardware QP. As such, the WQ inherits its port from its RSS QP parent upon its RST->INIT transition and by that time its state is applied to the hardware. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24(IB, net)/mlx4: Add resource utilization supportMoshe Shemesh
Adding visibility of resource usage of QPs, CQs and counters used by virtual functions. This feature will be used to give the PF administrator more data while debugging VF status. Usage info was added to ALLOC_RES command, to notify the PF if the resource which is being reserved or allocated for the VF will be used by kernel driver or by user verbs. Updated reservation and allocation functions of QP, CQ and counter with additional usage parameter. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx4: Add inline-receive supportMaor Gottlieb
When inline-receive is enabled, the HCA may write received data into the receive WQE. Inline-receive is enabled by setting its matching bit in the QP context and each single-packet message with payload not exceeding the receive WQE size will be delivered to the WQE. The completion report will indicate that the payload was placed to the WQE. It includes: 1) Return maximum supported size of inline-receive by the hardware in query_device vendor's data part. 2) Enable the feature when requested by the vendor data input. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx5: Expose extended error countersParav Pandit
This patch adds below requester and responder side error counters, which will be exposed by hardware counters interface and are supported as part of query Q counters command extension. +---------------------------+-------------------------------------+ | Name | Description | |---------------------------+-------------------------------------| |resp_local_length_error | Number of times responder detected | | | local length errors | |---------------------------+-------------------------------------| |resp_cqe_error | Number of CQEs completed with error | | | at responder | |---------------------------+-------------------------------------| |req_cqe_error | Number of CQEs completed with error | | | at requester | |---------------------------+-------------------------------------| |req_remote_invalid_request | Number of times requester detected | | | remote invalid request error | |---------------------------+-------------------------------------| |req_remote_access_error | Number of times requester detected | | | remote access error | |---------------------------+-------------------------------------| |resp_remote_access_error | Number of times responder detected | | | remote access error | |---------------------------+-------------------------------------| |resp_cqe_flush_error | Number of CQEs completed with | | | flushed with error at responder | |---------------------------+-------------------------------------| |req_cqe_flush_error | Number of CQEs completed with | | | flushed with error at requester | +---------------------------+-------------------------------------+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx5: Fix existence check for extended address vectorLeon Romanovsky
The extended address vector is the highest bit in be32 variable, but it was compared with the lowest. This patch fixes the endianness of that check and removes already declared define. Fixes: 17d2f88f92ce ("IB/mlx5: Add ODP atomics support") Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24net/mlx5: Report enhanced capabilities for IPoIBYishai Hadas
Report 'ipoib_enhanced_offloads' capabilities from the core layer, it will be used in the next patch from this series. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/uverbs: Enable QP creation with a given source QP numberYishai Hadas
Enable QP creation with a given source QP number, the created QP will use the source QPN as its wire QP number. To create such a QP, root privileges (i.e. CAP_NET_RAW) are required from the user application. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Enable QP creation with a given source QP numberYishai Hadas
Enable QP creation with a given source QP number. The created QP will use the source QPN as its wire QP number. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Set RoCEv2 MGID according to specNoa Osherovich
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding IPv4 address is encoded into the GID according to the following rule: GID= :ffff:<IPv4 address> Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it zeroed and change rdma_is_multicast_addr() to consider the new logic. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx5: Add support to dropless RQMaor Gottlieb
RQs that were configured for "delay drop" will prevent packet drops when their WQEs are depleted. Marking an RQ to be drop-less is done by setting delay_drop_en in RQ context using CREATE_RQ command. Since this feature is globally activated/deactivated by using the SET_DELAY_DROP command on all the marked RQs, we activated/deactivated it according to the number of RQs with 'delay_drop' enabled. When timeout is expired, then the feature is deactivated. Therefore the driver handles the delay drop timeout event and reactivate it. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24net/mlx5: Introduce general notification eventMaor Gottlieb
When delay drop timeout is expired, the firmware raises general notification event of DELAY_DROP_TIMEOUT subtype. In addition the feature is disable so the driver have to reactivate the timeout. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24net/mlx5: Introduce set delay drop commandMaor Gottlieb
Add support to SET_DELAY_DROP command. This command will be used in downstream patches for delay packet drop. The timeout value should be indicated by delay_drop_timeout field. Packet processing will be delayed till timeout value passed or until more WQEs are posted. Setting this value to 0 disables the feature. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Introduce delay drop for a WQMaor Gottlieb
Work queue which is created with IB_WQ_FLAGS_DELAY_DROP won't cause packet drops when there aren't receive WQEs, but will wait until posting of receive WQEs or for some period of time that the device was configured with. It includes: * Add a new creation flag to enable delay drop functionality in a WQ. * A new capability was introduced - IB_RAW_PACKET_CAP_DELAY_DROP, which is the device's ability to delay packet drops when there aren't receive WQEs. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx5: Restore IB guid/policy for virtual functionsBodong Wang
When a user sets port_guid, node_guid or policy of an IB virtual function, save this information in "struct mlx5_vf_context". This information will be restored later when pci_resume is called. To make sure this works, one can use aer-inject to generate PCI errors on mlx5 devices and verify if relevant fields are restored after PCI resume. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/mlx5: Add debug control parameters for congestion controlParav Pandit
This patch adds debug control parameters for congestion control which can be read or written through debugfs. They are for reaction point and notification point nodes. These control parameters are as below: +------------------------------+-----------------------------------------+ | Name | Description | |------------------------------+-----------------------------------------| |rp_clamp_tgt_rate | When set target rate is updated to | | | current rate | |------------------------------+-----------------------------------------| |rp_clamp_tgt_rate_ati | When set update target rate based on | | | timer as well | |------------------------------+-----------------------------------------| |rp_time_reset | time between rate increase if no | | | CNP is received unit in usec | |------------------------------+-----------------------------------------| |rp_byte_reset | Number of bytes between rate inease if | | | no CNP is received | |------------------------------+-----------------------------------------| |rp_threshold | Threshold for reaction point rate | | | control | |------------------------------+-----------------------------------------| |rp_ai_rate | Rate for target rate, unit in Mbps | |------------------------------+-----------------------------------------| |rp_hai_rate | Rate for hyper increase state | | | unit in Mbps | |------------------------------+-----------------------------------------| |rp_min_dec_fac | Minimum factor by which the current | | | transmit rate can be changed when | | | processing a CNP, unit is percerntage | |------------------------------+-----------------------------------------| |rp_min_rate | Minimum value for rate limit, | | | unit in Mbps | |------------------------------+-----------------------------------------| |rp_rate_to_set_on_first_cnp | Rate that is set when first CNP is | | | received, unit is Mbps | |------------------------------+-----------------------------------------| |rp_dce_tcp_g | Used to calculate alpha | |------------------------------+-----------------------------------------| |rp_dce_tcp_rtt | Time between updates of alpha value, | | | unit is usec | |------------------------------+-----------------------------------------| |rp_rate_reduce_monitor_period | Minimum time between consecutive rate | | | reductions | |------------------------------+-----------------------------------------| |rp_initial_alpha_value | Initial value of alpha | |------------------------------+-----------------------------------------| |rp_gd | When CNP is received, flow rate is | | | reduced based on gd, rp_gd is given as | | | log2(rp_gd) | |------------------------------+-----------------------------------------| |np_cnp_dscp | dscp code point for generated cnp | |------------------------------+-----------------------------------------| |np_cnp_prio_mode | 802.1p priority for generated cnp | |------------------------------+-----------------------------------------| |np_cnp_prio | cnp priority mode | +------------------------------+-----------------------------------------+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24net/mlx5: Add raw ethernet local loopback firmware commandHuy Nguyen
Add support for raw ethernet local loopback firmware command. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Add generic function to extract IB speed from netdevYuval Shaia
Logic of retrieving netdev speed from net_device and translating it to IB speed is implemented in rxe, in usnic and in bnxt drivers. Define new function which merges all. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Christian Benvenuti <benve@cisco.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>