Age | Commit message | Author
2018-10-16 | ipmr: Refactor mr_rtm_dumproute | David Ahern
Move per-table loops from mr_rtm_dumproute to mr_table_dump and export mr_table_dump for dumps by specific table id. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | net/mpls: Plumb support for filtering route dumps | David Ahern
Implement kernel side filtering of routes by egress device index and protocol. MPLS uses only a single table and route type. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | net/ipv6: Plumb support for filtering route dumps | David Ahern
Implement kernel side filtering of routes by table id, egress device index, protocol, and route type. If the table id is given in the filter, lookup the table and call fib6_dump_table directly for it. Move the existing route flags check for prefix only routes to the new filter. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | net/ipv4: Plumb support for filtering route dumps | David Ahern
Implement kernel side filtering of routes by table id, egress device index, protocol and route type. If the table id is given in the filter, lookup the table and call fib_table_dump directly for it. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | net: Add struct for fib dump filter | David Ahern
Add struct fib_dump_filter for options on limiting which routes are returned in a dump request. The current list is table id, protocol, route type, rtm_flags and nexthop device index. struct net is needed to lookup the net_device from the index. Declare the filter for each route dump handler and plumb the new arguments from dump handlers to ip_valid_fib_dump_req. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
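The filtering idea described above can be modeled in userspace as follows. This is only a sketch: `struct route`, `struct dump_filter`, `route_matches` and `count_matching` are hypothetical names for illustration, not the kernel's definitions, and the convention that a zero field means "do not filter on this" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a route entry; not a kernel type. */
struct route {
    uint32_t table_id;
    uint8_t  protocol;
    uint8_t  type;
    int      oif;        /* egress (nexthop) device index */
};

/* Hypothetical filter; a zero field is assumed to mean "match anything". */
struct dump_filter {
    uint32_t table_id;
    uint8_t  protocol;
    uint8_t  type;
    int      oif;
};

/* Return true when the route passes every field set in the filter. */
static bool route_matches(const struct route *rt, const struct dump_filter *f)
{
    if (f->table_id && rt->table_id != f->table_id)
        return false;
    if (f->protocol && rt->protocol != f->protocol)
        return false;
    if (f->type && rt->type != f->type)
        return false;
    if (f->oif && rt->oif != f->oif)
        return false;
    return true;
}

/* Count the routes in rts[0..n) that a filtered dump would return. */
static unsigned int count_matching(const struct route *rts, unsigned int n,
                                   const struct dump_filter *f)
{
    unsigned int i, hits = 0;

    for (i = 0; i < n; i++)
        if (route_matches(&rts[i], f))
            hits++;
    return hits;
}
```

Doing the comparison on the kernel side means userspace no longer has to receive and discard every non-matching route.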
2018-10-16 | netlink: Add answer_flags to netlink_callback | David Ahern
With dump filtering we need a way to ensure the NLM_F_DUMP_FILTERED flag is set on a message back to the user if the data returned is influenced by some input attributes. Normally this can be done as messages are added to the skb, but if the filter results in no data being returned, the user could be confused as to why. This patch adds answer_flags to the netlink_callback allowing dump handlers to set the NLM_F_DUMP_FILTERED at a minimum in the NLMSG_DONE message ensuring the flag gets back to the user. The netlink_callback space is initialized to 0 via a memset in __netlink_dump_start, so init of the new answer_flags is covered. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
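A minimal model of the mechanism, assuming the final "done" message simply ORs the callback's answer_flags into its own flags. The `DEMO_` constants and `struct demo_callback` are stand-ins invented for this sketch rather than the kernel's definitions.

```c
#include <stdint.h>

/* Stand-ins for NLM_F_MULTI and NLM_F_DUMP_FILTERED (illustrative values). */
#define DEMO_F_MULTI          0x02u
#define DEMO_F_DUMP_FILTERED  0x20u

/* Hypothetical stand-in for the per-dump callback state. */
struct demo_callback {
    uint16_t answer_flags;   /* zero-initialized when the dump starts */
};

/* A dump handler records that input attributes restricted the result. */
static void demo_mark_filtered(struct demo_callback *cb)
{
    cb->answer_flags |= DEMO_F_DUMP_FILTERED;
}

/* Flags placed on the terminating "done" message of the dump. */
static uint16_t demo_done_flags(const struct demo_callback *cb)
{
    return (uint16_t)(DEMO_F_MULTI | cb->answer_flags);
}
```

Because the flag travels on the terminating message, the user sees it even when the filter matched nothing and no data messages were sent.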
2018-10-16 | RDMA/mlx5: Add support for flow tag to raw create flow | Mark Bloch
A user can provide a hint which will be attached to the packet and written to the CQE on receive. This can be used as a way to offload operations into the HW, for example parsing a packet which is a tunneled packet, and if so, pass 0x1 as the hint. The software can use that hint to decapsulate the packet and parse only the inner headers thus saving CPU cycles. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/mlx5: Remove extraneous error check | Gal Pressman
Remove double error check from create user RQ error flow. Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Signed-off-by: Gal Pressman <pressmangal@gmail.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | locking/lockdep: Remove duplicated 'lock_class_ops' percpu array | Waiman Long
Remove the duplicated 'lock_class_ops' percpu array that is not used anywhere. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Fixes: 8ca2b56cd7da ("locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y") Link: http://lkml.kernel.org/r/1539380547-16726-1-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-15 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Convert BPF sockmap and kTLS to both use a new sk_msg API and enable sk_msg BPF integration for the latter, from Daniel and John.
2) Enable BPF syscall side to indicate for maps that they do not support a map lookup operation as opposed to just missing key, from Prashant.
3) Add bpftool map create command which after map creation pins the map into bpf fs for further processing, from Jakub.
4) Add bpftool support for attaching programs to maps allowing sock_map and sock_hash to be used from bpftool, from John.
5) Improve syscall BPF map update/delete path for map-in-map types to wait a RCU grace period for pending references to complete, from Daniel.
6) Couple of follow-up fixes for the BPF socket lookup to get it enabled also when IPv6 is compiled as a module, from Joe.
7) Fix a generic-XDP bug to handle the case when the Ethernet header was mangled and thus update skb's protocol and data, from Jesper.
8) Add a missing BTF header length check between header copies from user space, from Wenwen.
9) Minor fixups in libbpf to use __u32 instead of u32 types and include proper perf_event.h uapi header instead of perf internal one, from Yonghong.
10) Allow to pass user-defined flags through EXTRA_CFLAGS and EXTRA_LDFLAGS to bpftool's build, from Jiri.
11) BPF kselftest tweaks to add LWTUNNEL to config fragment and to install with_addr.sh script from flow dissector selftest, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | IB/mlx5: Verify DEVX object type | Yishai Hadas
Verify that the input DEVX object type matches the created object. As the obj_id in the firmware is not globally unique the object type must be considered upon checking for a valid object id. Once both the type and the id match we know that the lock was taken on the correct object by the uverbs layer. Fixes: e662e14d801b ("IB/mlx5: Add DEVX support for modify and query commands") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-15 | Merge branch 'nfp-fix-pedit-set-action-offloads' | David S. Miller
Jakub Kicinski says:

====================
nfp: fix pedit set action offloads

Pieter says:

This set fixes set actions when using multiple pedit actions with partial masks and with multiple keys per pedit action. Additionally it fixes set ipv6 pedit action offloads when using it in combination with other header keys.

The problem would only trigger if one combines multiple pedit actions of the same type with partial masks, e.g.:

$ tc filter add dev netdev protocol ip parent ffff: \
    flower indev netdev \
    ip_proto tcp \
    action pedit ex munge \
    ip src set 11.11.11.11 retain 65535 munge \
    ip src set 22.22.22.22 retain 4294901760 pipe \
    csum ip and tcp pipe \
    mirred egress redirect dev netdev
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | nfp: flower: use offsets provided by pedit instead of index for ipv6 | Pieter Jansen van Vuuren
Previously when populating the set ipv6 address action, we incorrectly made use of pedit's key index to determine which 32bit word should be set. We now calculate which word has been selected based on the offset provided by the pedit action. Fixes: 354b82bb320e ("nfp: add set ipv6 source and destination address") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
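The offset-based selection can be sketched as below, assuming byte offsets measured from the start of the IPv6 header (source address at byte 8, destination address at byte 24, as in the standard header layout). `ipv6_addr_word` is a hypothetical helper, not the driver's function.

```c
#include <stdint.h>

/* Byte offsets of the address fields within the fixed IPv6 header. */
#define IPV6_SRC_OFF 8u
#define IPV6_DST_OFF 24u

/*
 * Map a byte offset to the index (0..3) of the 32-bit word inside the
 * 16-byte address that starts at 'base'. Returns -1 when the offset is
 * outside that address or not 32-bit aligned.
 */
static int ipv6_addr_word(uint32_t off, uint32_t base)
{
    if (off < base || off >= base + 16u || (off % 4u) != 0u)
        return -1;
    return (int)((off - base) / 4u);
}
```

Deriving the word from the offset keeps the result independent of how many keys precede it in the action, which is what indexing by key position gets wrong.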
2018-10-15 | nfp: flower: fix multiple keys per pedit action | Pieter Jansen van Vuuren
Previously we only allowed a single header key per pedit action to change the header. This used to result in the last header key in the pedit action to overwrite previous headers. We now keep track of them and allow multiple header keys per pedit action. Fixes: c0b1bd9a8b8a ("nfp: add set ipv4 header action flower offload") Fixes: 354b82bb320e ("nfp: add set ipv6 source and destination address") Fixes: f8b7b0a6b113 ("nfp: add set tcp and udp header action flower offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | nfp: flower: fix pedit set actions for multiple partial masks | Pieter Jansen van Vuuren
Previously we did not correctly change headers when using multiple pedit actions with partial masks. We now take this into account and no longer just commit the last pedit action. Fixes: c0b1bd9a8b8a ("nfp: add set ipv4 header action flower offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
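One way to picture "taking every partial-mask action into account" is to accumulate a value and a mask per 32-bit word instead of keeping only the last action. The sketch below assumes each action supplies a value plus a mask of the bits it writes; how tc's `retain` argument maps onto that mask, and byte order, are deliberately not modeled. `struct set_word` and both helpers are hypothetical.

```c
#include <stdint.h>

/* Accumulated "set" state for one 32-bit header word (hypothetical). */
struct set_word {
    uint32_t value;   /* bits to write, valid where mask is set */
    uint32_t mask;    /* union of all bits written so far */
};

/* Fold one partial-mask action into the accumulated state. */
static void set_word_merge(struct set_word *acc, uint32_t value, uint32_t mask)
{
    acc->value = (acc->value & ~mask) | (value & mask);
    acc->mask |= mask;
}

/* Apply the accumulated state to an existing header word. */
static uint32_t set_word_apply(const struct set_word *acc, uint32_t field)
{
    return (field & ~acc->mask) | (acc->value & acc->mask);
}
```

With two actions writing 0x0b0b0b0b under mask 0x0000ffff and 0x16161616 under mask 0xffff0000, the merged state writes 0x16160b0b; committing only the last action would lose the low half.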
2018-10-16 | RDMA/hns: Add FRMR support for hip08 | Yixian Liu
This patch adds fast register physical memory region (FRMR) support for hip08. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-15 | rxrpc: Fix a missing rxrpc_put_peer() in the error_report handler | David Howells
Fix a missing call to rxrpc_put_peer() on the main path through the rxrpc_error_report() function. This manifests itself as a ref leak whenever an ICMP packet or other error comes in. In commit f334430316e7, the hand-off of the ref to a work item was removed and was not replaced with a put. Fixes: f334430316e7 ("rxrpc: Fix error distribution") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | net: phy: merge phy_start_aneg and phy_start_aneg_priv | Heiner Kallweit
After commit 9f2959b6b52d ("net: phy: improve handling delayed work") the sync parameter isn't needed any longer in phy_start_aneg_priv(). This allows merging phy_start_aneg() and phy_start_aneg_priv(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | MIPS: dts: Change upper case to lower case | Songjun Wu
All the upper case in unit-address and hex constants are changed to lower case according to the DT conventions. Signed-off-by: Songjun Wu <songjun.wu@linux.intel.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Reviewed-by: Rob Herring <robh@kernel.org> Patchwork: https://patchwork.linux-mips.org/patch/20768/ Cc: yixin.zhu@linux.intel.com Cc: chuanhua.lei@linux.intel.com Cc: hauke.mehrtens@intel.com Cc: devicetree@vger.kernel.org Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org>
2018-10-15 | MIPS: generic: Add Network, SPI and I2C to ocelot_defconfig | Alexandre Belloni
Add support for the integrated switch, and the SPI and I2C controller found on MSCC Ocelot. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20345/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Loongson-3: Fix BRIDGE irq delivery problem | Huacai Chen
After commit e509bd7da149dc349160 ("genirq: Allow migration of chained interrupts by installing default action") Loongson-3 fails here: setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction); This is because neither chained_action nor cascade_irqaction has the IRQF_SHARED flag. This causes Loongson-3 resume to fail because the HPET timer interrupt can't be delivered during S3. So we set the irqchip of the chained irq to loongson_irq_chip which doesn't disable the chained irq in CP0.Status. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20434/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com>
2018-10-15 | MIPS: Loongson-3: Fix CPU UART irq delivery problem | Huacai Chen
Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to other CPUs) may cause interrupts to be lost, especially in multi-package machines (Package-0's UART irq cannot be delivered to others). So make mask_loongson_irq() and unmask_loongson_irq() be no-ops. The original problem (UART IRQ may deliver to any core) is also because of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to remove all of the stuff. Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20433/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com>
2018-10-15 | MIPS: Remove unused PREF, PREFE & PREFX macros | Paul Burton
asm/asm.h provides PREF(), PREFE() & PREFX() macros which are now entirely unused. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20908/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: lib: Use kernel_pref & user_pref in memcpy() | Paul Burton
memcpy() is the only user of the PREF() & PREFE() macros from asm/asm.h. Switch to using the kernel_pref() & user_pref() macros from asm/asm-eva.h which fit more consistently with other abstractions of EVA vs non-EVA instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20907/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Remove unused CAT macro | Paul Burton
asm/asm.h provides a CAT macro which is unused throughout the tree, and if anyone wanted it the generic CONCATENATE macro in linux/kernel.h provides the same functionality. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20905/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Add kernel_pref & user_pref helpers | Paul Burton
Add kernel_pref & user_pref macros to asm/asm-eva.h, providing an abstraction around EVA & non-EVA pref instructions consistent with the existing macros we have for cache & load/store instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20906/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Remove unused TTABLE macro | Paul Burton
asm/asm.h contains a TTABLE macro to generate "text tables" which would appear to be arrays of pointers to strings. It is unused throughout the kernel tree, so delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20904/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Remove unused PIC macros | Paul Burton
asm/asm.h contains CPRESTORE, CPADD & CPLOAD macros that are intended for use with position independent code, but are not used anywhere in the kernel - along with a comment to that effect. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20903/ Cc: linux-mips@linux-mips.org
2018-10-15 | MIPS: Remove unused MOVN & MOVZ macros | Paul Burton
We have macros in asm/asm.h to allow for use of the MOVN & MOVZ instructions with compare-and-branch sequences providing compatibility for ISA versions which don't include those instructions. However the macros are unused, and appear to have always been unused. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20909/ Cc: linux-mips@linux-mips.org
2018-10-16 | RDMA/bnxt_re: Avoid resource leak in case the NQ registration fails | Selvin Xavier
In case the NQ alloc/enable fails, free up the already allocated/enabled NQ before reporting failure. Also, track the alloc/enable using proper state checking. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Wait for delayed work to finish before device removal | Selvin Xavier
The delayed work bnxt_re_worker may still be running even after cancel_delayed_work returns. This causes a crash as the driver proceeds with device removal. To make sure that the work is finished before returning, use cancel_delayed_work_sync. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Limit max_pkey to 16 bit value | Devesh Sharma
Some FW versions return pkey values greater than 0xFFFF. pkey_tbl_len of ib_port_attr is a 16-bit value, so restrict max_pkeys to 0xFFFF. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
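The restriction amounts to a saturating conversion from the firmware-reported count to a 16-bit field. A minimal sketch, with `clamp_max_pkeys` as a hypothetical helper name:

```c
#include <stdint.h>

/* Saturate a firmware-reported pkey count so it fits a 16-bit field. */
static uint16_t clamp_max_pkeys(uint32_t fw_max_pkeys)
{
    return fw_max_pkeys > 0xFFFFu ? (uint16_t)0xFFFFu : (uint16_t)fw_max_pkeys;
}
```

Without the clamp, a plain assignment would truncate, so a reported 0x10000 would be stored as 0.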
2018-10-16 | RDMA/bnxt_re: Fix qp async event reporting | Devesh Sharma
Reports affiliated async event on the qp-async event channel instead of global event channel. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Report out of sequence hw counters | Selvin Xavier
Expose out of sequence errors received from FW. This counter is a 32 bit counter and driver has to accumulate the counter. Stores the previous value for calculating the difference in the next query. Also, update the HW statistics structure with new fields. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
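Accumulating a 32-bit hardware counter into a wider software total is commonly done by adding the unsigned difference between successive readings. The sketch below illustrates that general technique, not necessarily the driver's exact code; `struct oos_counter` and `oos_update` are hypothetical names, and it assumes the hardware counter wraps at most once between queries.

```c
#include <stdint.h>

/* Software view of a 32-bit hardware counter (hypothetical). */
struct oos_counter {
    uint32_t prev;    /* last raw value read from firmware */
    uint64_t total;   /* accumulated 64-bit count */
};

/* Fold a new raw reading into the running total. */
static void oos_update(struct oos_counter *c, uint32_t hw_now)
{
    /* Unsigned subtraction is modulo 2^32, so one wrap is handled. */
    uint32_t delta = hw_now - c->prev;

    c->total += delta;
    c->prev = hw_now;
}
```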
2018-10-16 | RDMA/bnxt_re: Expose rx discards and drop counters | Selvin Xavier
Expose the RoCE discard and drop counters from the HW statistics context. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Prevent driver crash due to NULL pointer in error message print | Somnath Kotur
crsqe->resp would be NULL in case the host command timed out before getting a response from HW. Check for NULL pointer to avoid a potential crash while printing the error message. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Drop L2 async events silently | Devesh Sharma
In some FW versions, the RoCE driver also receives an async notification which was directed to the L2 driver. The RoCE driver does not handle this and prints a message to syslog. Drop these notifications silently. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Avoid accessing nq->bar_reg_iomem in failure case | Selvin Xavier
In the failure path, nq->bar_reg_iomem gets accessed without initializing. Avoid this by calling the bnxt_qplib_nq_stop_irq only if the initialization is complete. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Fixes: 6e04b1035689 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Avoid NULL check after accessing the pointer | Selvin Xavier
This is reported by a smatch check. rcfw->creq_bar_reg_iomem is accessed in bnxt_qplib_rcfw_stop_irq, so checking this variable afterwards doesn't make sense. Also, rcfw->creq_bar_reg_iomem will never be NULL, so remove this check. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 6e04b1035689 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Remove the unnecessary version macro definition | Selvin Xavier
Version macro is not required as the driver is not maintaining the version. Removing the references of this macro too. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Fix recursive lock warning in debug kernel | Selvin Xavier
Fix possible recursive lock warning. It's a false warning as the locks are part of two different HW Queue data structures - cmdq and creq. Debug kernel is throwing the following warning and stack trace.

[ 783.914967] ============================================
[ 783.914970] WARNING: possible recursive locking detected
[ 783.914973] 4.19.0-rc2+ #33 Not tainted
[ 783.914976] --------------------------------------------
[ 783.914979] swapper/2/0 is trying to acquire lock:
[ 783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[ 783.914999] but task is already holding lock:
[ 783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[ 783.915013] other info that might help us debug this:
[ 783.915016] Possible unsafe locking scenario:
[ 783.915019] CPU0
[ 783.915021] ----
[ 783.915034] lock(&(&hwq->lock)->rlock);
[ 783.915035] lock(&(&hwq->lock)->rlock);
[ 783.915037] *** DEADLOCK ***
[ 783.915038] May be due to missing lock nesting notation
[ 783.915039] 1 lock held by swapper/2/0:
[ 783.915040] #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[ 783.915044] stack backtrace:
[ 783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ #33
[ 783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 783.915048] Call Trace:
[ 783.915049] <IRQ>
[ 783.915054] dump_stack+0x90/0xe3
[ 783.915058] __lock_acquire+0x106c/0x1080
[ 783.915061] ? sched_clock+0x5/0x10
[ 783.915063] lock_acquire+0xbd/0x1a0
[ 783.915065] ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[ 783.915069] _raw_spin_lock_irqsave+0x4a/0x90
[ 783.915071] ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[ 783.915073] bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[ 783.915078] tasklet_action_common.isra.17+0x197/0x1b0
[ 783.915081] __do_softirq+0xcb/0x3a6
[ 783.915084] irq_exit+0xe9/0x100
[ 783.915085] do_IRQ+0x6a/0x120
[ 783.915087] common_interrupt+0xf/0xf
[ 783.915088] </IRQ>

Use nested notation for the spin_lock to avoid this warning. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/bnxt_re: Add missing spin lock initialization | Selvin Xavier
Add the missing initialization of the cq_lock and qplib.flush_lock. Fixes: 942c9b6ca8de ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | Merge branch 'for-rc' into rdma.git for-next | Jason Gunthorpe
From git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git This is required to resolve dependencies of the next series of RDMA patches. The code motion conflicts in drivers/infiniband/core/cache.c were resolved. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-15 | hv_netvsc: fix vf serial matching with pci slot info | Haiyang Zhang
The VF device's serial number is saved as a string in PCI slot's kobj name, not the slot->number. This patch corrects the netvsc driver, so the VF device can be successfully paired with synthetic NIC. Fixes: 00d7ddba1143 ("hv_netvsc: pair VF based on serial number") Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
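Since the serial number is stored as text in the slot's name, matching requires parsing that string to a number first. A userspace sketch of strict parsing follows; it assumes the name is a plain decimal string, and `parse_slot_serial` is a hypothetical helper rather than the driver's code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a slot name such as "12345" into a 32-bit serial number.
 * Returns 0 on success, -1 when the name is not a plain decimal number
 * or does not fit in 32 bits. Decimal encoding is an assumption here.
 */
static int parse_slot_serial(const char *name, uint32_t *serial)
{
    char *end;
    unsigned long v;

    /* Reject empty names, signs and leading whitespace up front. */
    if (!name || name[0] < '0' || name[0] > '9')
        return -1;

    errno = 0;
    v = strtoul(name, &end, 10);
    if (errno || *end != '\0' || v > UINT32_MAX)
        return -1;

    *serial = (uint32_t)v;
    return 0;
}
```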
2018-10-15 | Merge branch 'tcp-second-round-for-EDT-conversion' | David S. Miller
Eric Dumazet says:

====================
tcp: second round for EDT conversion

First round of EDT patches left TCP stack in a non optimal state.

- High speed flows suffered from loss of performance, addressed by the first patch of this series.
- Second patch brings pacing to the current state of networking, since we now reach ~100 Gbit on a single TCP flow.
- Third patch implements a mitigation for scheduling delays, like the one we did in sch_fq in the past.
- Fourth patch removes one special case in sch_fq for ACK packets.
- Fifth patch removes a serious performance cost for TCP internal pacing. We should setup the high resolution timer only if really needed.
- Sixth patch fixes a typo in BBR.
- Last patch is one minor change in cdg congestion control.

Neal Cardwell also has a patch series fixing BBR after EDT adoption.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | tcp: cdg: use tcp high resolution clock cache | Eric Dumazet
We store in tcp socket a cache of most recent high resolution clock, there is no need to call local_clock() again, since this cache is good enough. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | tcp_bbr: fix typo in bbr_pacing_margin_percent | Neal Cardwell
There was a typo in this parameter name. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | tcp: optimize tcp internal pacing | Eric Dumazet
When TCP implements its own pacing (when no fq packet scheduler is used), it arms a high resolution timer after a packet is sent. But in many cases (like TCP_RR kind of workloads), this high resolution timer expires before the application attempts to write the following packet. This overhead also happens when the flow is ACK clocked and cwnd limited instead of being limited by the pacing rate. This leads to extra overhead (high number of IRQ). Now that tcp_wstamp_ns is reserved for the pacing timer only (after commit "tcp: do not change tcp_wstamp_ns in tcp_mstamp_refresh"), we can setup the timer only when a packet is about to be sent, and if tcp_wstamp_ns is in the future. This leads to a ~10% performance increase in TCP_RR workloads. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | net_sched: sch_fq: no longer use skb_is_tcp_pure_ack() | Eric Dumazet
With the new EDT model, sch_fq no longer has to special case TCP pure acks, since their skb->tstamp will allow them being sent without pacing delay. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 | tcp: mitigate scheduling jitter in EDT pacing model | Eric Dumazet
In commit fefa569a9d4b ("net_sched: sch_fq: account for schedule/timers drifts") we added a mitigation for scheduling jitter in fq packet scheduler. This patch does the same in the TCP stack, now that it is using the EDT model. Note that this mitigation is valid for both external (fq packet scheduler) and internal TCP pacing. This uses the same strategy as the above commit, allowing a time credit of half the packet currently sent.

Consider the following case: An skb is sent, after an idle period of 300 usec. The air-time (skb->len/pacing_rate) is 500 usec. Instead of setting the pacing timer to now+500 usec, it will use now+min(500/2, 300) -> now+250usec. This is like having a token bucket with a depth of half an skb.

Tested:

tc qdisc replace dev eth0 root pfifo_fast

Before
netperf -P0 -H remote -- -q 1000000000 # 8000Mbit
540000 262144 262144 10.00 7710.43

After :
netperf -P0 -H remote -- -q 1000000000 # 8000 Mbit
540000 262144 262144 10.00 7999.75 # Much closer to 8000Mbit target

Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
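The credit arithmetic can be sketched in userspace as follows. This is one interpretation of the description: the "time credit" is read as an amount subtracted from the packet's air-time, capped both at half the air-time and at the idle period. `edt_next_departure` and its parameters are hypothetical names, with all times in nanoseconds.

```c
#include <stdint.h>

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Compute the earliest departure time for the packet after the one
 * being sent now.
 *   now_ns      - current time
 *   prev_edt_ns - departure time previously scheduled for this packet
 *   air_ns      - air-time of this packet (length / pacing rate)
 */
static uint64_t edt_next_departure(uint64_t now_ns, uint64_t prev_edt_ns,
                                   uint64_t air_ns)
{
    /* The packet leaves no earlier than its scheduled time. */
    uint64_t base = now_ns > prev_edt_ns ? now_ns : prev_edt_ns;
    /* Idle time beyond the schedule; zero when the flow is not idle. */
    uint64_t idle = base - prev_edt_ns;
    /* Credit is capped at half the air-time, like a half-skb bucket. */
    uint64_t credit = min_u64(air_ns / 2, idle);

    return base + air_ns - credit;
}
```

With a 500 usec air-time and a 300 usec idle period the credit is 250 usec, so the next departure lands at now+250 usec, matching the case worked through in the commit message.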