summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx4
AgeCommit message (Collapse)Author
2019-04-01drivers: mellanox: use netdev_xmit_more() helperFlorian Westphal
skb->xmit_more hint is now always 0. This switches the mellanox drivers to the netdev_xmit_more() helper. Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Boris Pismenny <borisp@mellanox.com> Cc: Ilya Lesokhin <ilyal@mellanox.com> Cc: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-24net: devlink: select NET_DEVLINK from driversJiri Pirko
Some drivers are becoming more dependent on NET_DEVLINK being selected in configuration. With upcoming compat functions, the behavior would be wrong in case devlink was not compiled in. So make the drivers select NET_DEVLINK and rely on the functions being there, not just stubs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix qp mtt size calculationJack Morgenstein
Calculation of qp mtt size (in function mlx4_RST2INIT_wrapper) ultimately depends on function roundup_pow_of_two. If the amount of memory required by the QP is less than one page, roundup_pow_of_two is called with argument zero. In this case, the roundup_pow_of_two result is undefined. Calling roundup_pow_of_two with a zero argument resulted in the following stack trace: UBSAN: Undefined behaviour in ./include/linux/log2.h:61:13 shift exponent 64 is too large for 64-bit type 'long unsigned int' CPU: 4 PID: 26939 Comm: rping Tainted: G OE 4.19.0-rc1 Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015 Call Trace: dump_stack+0x9a/0xeb ubsan_epilogue+0x9/0x7c __ubsan_handle_shift_out_of_bounds+0x254/0x29d ? __ubsan_handle_load_invalid_value+0x180/0x180 ? debug_show_all_locks+0x310/0x310 ? sched_clock+0x5/0x10 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x260 ? find_held_lock+0x35/0x1e0 ? mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core] mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core] Fix this by explicitly testing for zero, and returning one if the argument is zero (assuming that the next higher power of 2 in this case should be one). Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix locking in SRIOV mode when switching between events and ↵Jack Morgenstein
polling In procedures mlx4_cmd_use_events() and mlx4_cmd_use_polling(), we need to guarantee that there are no FW commands in progress on the comm channel (for VFs) or wrapped FW commands (on the PF) when SRIOV is active. We do this by also taking the slave_cmd_mutex when SRIOV is active. This is especially important when switching from event to polling, since we free the command-context array during the switch. If there are FW commands in progress (e.g., waiting for a completion event), the completion event handler will access freed memory. Since the decision to use comm_wait or comm_poll is taken before grabbing the event_sem/poll_sem in mlx4_comm_cmd_wait/poll, we must take the slave_cmd_mutex as well (to guarantee that the decision to use events or polling and the call to the appropriate cmd function are atomic). Fixes: a7e1f04905e5 ("net/mlx4_core: Fix deadlock when switching between polling and event fw commands") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-12net/mlx4_core: Fix reset flow when in command polling modeJack Morgenstein
As part of unloading a device, the driver switches from FW command event mode to FW command polling mode. Part of switching over to polling mode is freeing the command context array memory (unfortunately, currently, without NULLing the command context array pointer). The reset flow calls "complete" to complete all outstanding fw commands (if we are in event mode). The check for event vs. polling mode here is to test if the command context array pointer is NULL. If the reset flow is activated after the switch to polling mode, it will attempt (incorrectly) to complete all the commands in the context array -- because the pointer was not NULLed when the driver switched over to polling mode. As a result, we have a use-after-free situation, which results in a kernel crash. For example: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90 PGD 0 Oops: 0000 [#1] SMP Modules linked in: netconsole nfsv3 nfs_acl nfs lockd grace ... CPU: 2 PID: 940 Comm: kworker/2:3 Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016 Workqueue: events hv_eject_device_work [pci_hyperv] task: ffff8d1734ca0fd0 ti: ffff8d17354bc000 task.ti: ffff8d17354bc000 RIP: 0010:[<ffffffff876c4a8e>] [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90 RSP: 0018:ffff8d17354bfa38 EFLAGS: 00010082 RAX: 0000000000000000 RBX: ffff8d17362d42c8 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8d17362d42c8 RBP: ffff8d17354bfa70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000298 R11: ffff8d173610e000 R12: ffff8d17362d42d0 R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8d1802680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000f16d8000 CR4: 00000000001406e0 Call Trace: [<ffffffff876c7adc>] complete+0x3c/0x50 [<ffffffffc04242f0>] mlx4_cmd_wake_completions+0x70/0x90 [mlx4_core] [<ffffffffc041e7b1>] mlx4_enter_error_state+0xe1/0x380 [mlx4_core] [<ffffffffc041fa4b>] mlx4_comm_cmd+0x29b/0x360 [mlx4_core] [<ffffffffc041ff51>] __mlx4_cmd+0x441/0x920 [mlx4_core] [<ffffffff877f62b1>] ? __slab_free+0x81/0x2f0 [<ffffffff87951384>] ? __radix_tree_lookup+0x84/0xf0 [<ffffffffc043a8eb>] mlx4_free_mtt_range+0x5b/0xb0 [mlx4_core] [<ffffffffc043a957>] mlx4_mtt_cleanup+0x17/0x20 [mlx4_core] [<ffffffffc04272c7>] mlx4_free_eq+0xa7/0x1c0 [mlx4_core] [<ffffffffc042803e>] mlx4_cleanup_eq_table+0xde/0x130 [mlx4_core] [<ffffffffc0433e08>] mlx4_unload_one+0x118/0x300 [mlx4_core] [<ffffffffc0434191>] mlx4_remove_one+0x91/0x1f0 [mlx4_core] The fix is to set the command context array pointer to NULL after freeing the array. Fixes: f5aef5aa3506 ("net/mlx4_core: Activate reset flow upon fatal command cases") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04mellanox: Switch to bitmap_zalloc()Andy Shevchenko
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that it returns pointer of bitmap type instead of opaque void *. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26net: devlink: turn devlink into a built-inJakub Kicinski
Being able to build devlink as a module causes growing pains. First all drivers had to add a meta dependency to make sure they are not built in when devlink is built as a module. Now we are struggling to invoke ethtool compat code reliably. Make devlink code built-in, users can still not build it at all but the dynamically loadable module option is removed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two easily resolvable overlapping change conflicts, one in TCP and one in the eBPF verifier. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-18net/mlx4_en: fix spelling mistake: "quiting" -> "quitting"Colin Ian King
There is a spelling mistake in a en_err error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net/mlx4_en: Force CHECKSUM_NONE for short ethernet framesSaeed Mahameed
When an ethernet frame is padded to meet the minimum ethernet frame size, the padding octets are not covered by the hardware checksum. Fortunately the padding octets are usually zero's, which don't affect checksum. However, it is not guaranteed. For example, switches might choose to make other use of these octets. This repeatedly causes kernel hardware checksum fault. Prior to the cited commit below, skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, we need to keep skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP headers, it does not worth the effort as the packets are so small that CHECKSUM_COMPLETE has no significant advantage. Future work: when reporting checksum complete is not an option for IP non-TCP/UDP packets, we can actually fallback to report checksum unnecessary, by looking at cqe IPOK bit. Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net/mlx4_en: Force CHECKSUM_NONE for short ethernet framesSaeed Mahameed
When an ethernet frame is padded to meet the minimum ethernet frame size, the padding octets are not covered by the hardware checksum. Fortunately the padding octets are usually zero's, which don't affect checksum. However, it is not guaranteed. For example, switches might choose to make other use of these octets. This repeatedly causes kernel hardware checksum fault. Prior to the cited commit below, skb checksum was forced to be CHECKSUM_NONE when padding is detected. After it, we need to keep skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP headers, it does not worth the effort as the packets are so small that CHECKSUM_COMPLETE has no significant advantage. Future work: when reporting checksum complete is not an option for IP non-TCP/UDP packets, we can actually fallback to report checksum unnecessary, by looking at cqe IPOK bit. Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08devlink: publish params only after driver init is doneJiri Pirko
Currently, user can do dump or get of param values right after the devlink params are registered. However the driver may not be initialized which is an issue. The same problem happens during notification upon param registration. Allow driver to publish devlink params whenever it is ready to handle get() ops. Note that this cannot be resolved by init reordering, as the "driverinit" params have to be available before the driver is initialized (it needs the param values there). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2019-01-24net/mlx4_core: A write memory barrier is sufficient in EQ ci updateTariq Toukan
Soften the memory barrier call of mb() by a sufficient wmb() in the consumer index update of the event queues. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24net/mlx4_core: Fix error handling when initializing CQ bufs in the driverJack Morgenstein
Procedure mlx4_init_user_cqes() handles returns by copy_to_user incorrectly. copy_to_user() returns the number of bytes not copied. Thus, a non-zero return should be treated as a -EFAULT error (as is done elsewhere in the kernel). However, mlx4_init_user_cqes() error handling simply returns the number of bytes not copied (instead of -EFAULT). Note, though, that this is a harmless bug: procedure mlx4_alloc_cq() (which is the only caller of mlx4_init_user_cqes()) treats any non-zero return as an error, but that returned error value is processed internally, and not passed further up the call stack. In addition, fixes the following sparse warning: warning: incorrect type in argument 1 (different address spaces) expected void [noderef] <asn:1>*to got void *buf Fixes: e45678973dcb ("{net, IB}/mlx4: Initialize CQ buffers in the driver when possible") Reported by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24net/mlx4_core: Add masking for a few queries on HCA capsAya Levin
Driver reads the query HCA capabilities without the corresponding masks. Without the correct masks, the base addresses of the queues are unaligned. In addition some reserved bits were wrongly read. Using the correct masks, ensures alignment of the base addresses and allows future firmware versions safe use of the reserved bits. Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet") Fixes: 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23net/mlx4: Mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/ethernet/mellanox/mlx4/eq.c: In function ‘mlx4_eq_int’: drivers/net/ethernet/mellanox/mlx4/mlx4.h:219:5: warning: this statement may fall through [-Wimplicit-fallthrough=] if (mlx4_debug_level) \ ^ drivers/net/ethernet/mellanox/mlx4/eq.c:558:4: note: in expansion of macro ‘mlx4_dbg’ mlx4_dbg(dev, "%s: MLX4_EVENT_TYPE_SRQ_LIMIT. srq_no=0x%x, eq 0x%x\n", ^~~~~~~~ drivers/net/ethernet/mellanox/mlx4/eq.c:561:3: note: here case MLX4_EVENT_TYPE_SRQ_CATAS_ERROR: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 This patch is part of the ongoing efforts to enabling -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-17net/mlx4: remove unneeded semicolonYueHaibing
Remove unneeded semicolon. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix regression in multi-SKB responses to RTM_GETADDR, from Arthur Gautier. 2) Fix ipv6 frag parsing in openvswitch, from Yi-Hung Wei. 3) Unbounded recursion in ipv4 and ipv6 GUE tunnels, from Stefano Brivio. 4) Use after free in hns driver, from Yonglong Liu. 5) icmp6_send() needs to handle the case of NULL skb, from Eric Dumazet. 6) Missing rcu read lock in __inet6_bind() when operating on mapped addresses, from David Ahern. 7) Memory leak in tipc-nl_compat_publ_dump(), from Gustavo A. R. Silva. 8) Fix PHY vs r8169 module loading ordering issues, from Heiner Kallweit. 9) Fix bridge vlan memory leak, from Ido Schimmel. 10) Dev refcount leak in AF_PACKET, from Jason Gunthorpe. 11) Infoleak in ipv6_local_error(), flow label isn't completely initialized. From Eric Dumazet. 12) Handle mv88e6390 errata, from Andrew Lunn. 13) Making vhost/vsock CID hashing consistent, from Zha Bin. 14) Fix lack of UMH cleanup when it unexpectedly exits, from Taehee Yoo. 15) Bridge forwarding must clear skb->tstamp, from Paolo Abeni. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) bnxt_en: Fix context memory allocation. bnxt_en: Fix ring checking logic on 57500 chips. mISDN: hfcsusb: Use struct_size() in kzalloc() net: clear skb->tstamp in bridge forwarding path net: bpfilter: disallow to remove bpfilter module while being used net: bpfilter: restart bpfilter_umh when error occurred net: bpfilter: use cleanup callback to release umh_info umh: add exit routine for UMH process isdn: i4l: isdn_tty: Fix some concurrency double-free bugs vhost/vsock: fix vhost vsock cid hashing inconsistent net: stmmac: Prevent RX starvation in stmmac_napi_poll() net: stmmac: Fix the logic of checking if RX Watchdog must be enabled net: stmmac: Check if CBS is supported before configuring net: stmmac: dwxgmac2: Only clear interrupts that are active net: stmmac: Fix PCI module removal leak tools/bpf: fix bpftool map dump with bitfields tools/bpf: test btf bitfield with >=256 struct member offset bpf: fix bpffs bitfield pretty print net: ethernet: mediatek: fix warning in phy_start_aneg tcp: change txhash on SYN-data timeout ...
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-07net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sgStephen Warren
pci_{,un}map_sg are deprecated and replaced by dma_{,un}map_sg. This is especially relevant since the rest of the driver uses the DMA API. Fix the driver to use the replacement APIs. Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-07net/mlx4: Get rid of page operation after dma_alloc_coherentStephen Warren
This patch solves a crash at the time of mlx4 driver unload or system shutdown. The crash occurs because dma_alloc_coherent() returns one value in mlx4_alloc_icm_coherent(), but a different value is passed to dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because when allocated, that pointer is passed to sg_set_buf() to record it, then when freed it is re-calculated by calling lowmem_page_address(sg_page()) which returns a different value. Solve this by recording the value that dma_alloc_coherent() returns, and passing this to dma_free_coherent(). This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get rid of page operation after dma_alloc_coherent"). Based-on-code-from: Christoph Hellwig <hch@lst.de> Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-24net/mlx4_core: drop useless LIST_HEADJulia Lawall
Drop LIST_HEAD where the variable it declares has never been used. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; @@ - LIST_HEAD(x); ... when != x // </smpl> Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17net/mlx4_en: remove fallback after kzalloc_node()Eric Dumazet
kzalloc_node(..., GFP_KERNEL, node) will attempt to allocate memory as close as possible to the node. There is no need to fallback to kzalloc() if this has failed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several conflicts, seemingly all over the place. I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him. The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while chaning the initial argument in the function call in the moved code. The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location. cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classifiction. __set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-) Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next. The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08net/mlx4_core: Correctly set PFC param if global pause is turned off.Tarick Bedeir
rx_ppp and tx_ppp can be set between 0 and 255, so don't clamp to 1. Fixes: 6e8814ceb7e8 ("net/mlx4_en: Fix mixed PFC and Global pause user control requests") Signed-off-by: Tarick Bedeir <tarick@google.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-03net/mlx4_core: Fix several coding style errorsErez Alfasi
Fix 3 checkpatch errors within mlx4/main.c: - Unnecessary mlx4_debug_level global variable initialization to 0. - Prohibited space before comma. - Whitespaces instead of tab. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-03net/mlx4_core: Fix return codes of unsupported operationsErez Alfasi
Functions __set_port_type and mlx4_check_port_params returned -EINVAL while the proper return code is -EOPNOTSUPP as a result of an unsupported operation. All drivers should generate this and all users should check for it when detecting an unsupported functionality. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-03net/mlx4_en: Fix build break when CONFIG_INET is offSaeed Mahameed
MLX4_EN depends on NETDEVICES, ETHERNET and INET Kconfigs. Make sure they are listed in MLX4_EN Kconfig dependencies. This fixes the following build break: drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: ‘struct iphdr’ declared inside parameter list [enabled by default] struct iphdr *iph) ^ drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function ‘get_fixed_ipv4_csum’: drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing pointer to incomplete type _u8 ipproto = iph->protocol; Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-03net/mlx4_en: Change min MTU size to ETH_MIN_MTUEran Ben Elisha
NIC driver minimal MTU size shall be set to ETH_MIN_MTU, as defined in the RFC791 and in the network stack. Remove old mlx4_en only define for it, which was set to wrong value. Fixes: b80f71f5816f ("ethernet/mellanox: use core min/max MTU checking") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-22{net, IB}/mlx4: Initialize CQ buffers in the driver when possibleDaniel Jurgens
Perform CQ initialization in the driver when the capability is supported by the FW. When passing the CQ to HW indicate that the CQ buffer has been pre-initialized. Doing so decreases CQ creation time. Testing on P8 showed a single 2048 entry CQ creation time was reduced from ~395us to ~170us, which is 2.3x faster. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-11-15net/mlx4: Fix UBSAN warning of signed integer overflowAya Levin
UBSAN: Undefined behavior in drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:626:29 signed integer overflow: 1802201963 + 1802201963 cannot be represented in type 'int' The union of res_reserved and res_port_rsvd[MLX4_MAX_PORTS] monitors granting of reserved resources. The grant operation is calculated and protected, thus both members of the union cannot be negative. Changed type of res_reserved and of res_port_rsvd[MLX4_MAX_PORTS] from signed int to unsigned int, allowing large value. Fixes: 5a0d0a6161ae ("mlx4: Structures and init/teardown for VF resource quotas") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15net/mlx4_core: Fix uninitialized variable compilation warningTariq Toukan
Initialize the uid variable to zero to avoid the compilation warning. Fixes: 7a89399ffad7 ("net/mlx4: Add mlx4_bitmap zone allocator") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15net/mlx4_core: Zero out lkey field in SW2HW_MPT fw commandJack Morgenstein
When re-registering a user mr, the mpt information for the existing mr when running SRIOV is obtained via the QUERY_MPT fw command. The returned information includes the mpt's lkey. This retrieved mpt information is used to move the mpt back to hardware ownership in the rereg flow (via the SW2HW_MPT fw command when running SRIOV). The fw API spec states that for SW2HW_MPT, the lkey field must be zero. Any ConnectX-3 PF driver which checks for strict spec adherence will return failure for SW2HW_MPT if the lkey field is not zero (although the fw in practice ignores this field for SW2HW_MPT). Thus, in order to conform to the fw API spec, set the lkey field to zero before invoking SW2HW_MPT when running SRIOV. Fixes: e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08mlx4: use __vlan_hwaccel helpersMichał Mirosław
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03net/mlx4_en: use __netdev_tx_sent_queue()Eric Dumazet
doorbell only depends on xmit_more and netif_tx_queue_stopped() Using __netdev_tx_sent_queue() avoids messing with BQL stop flag, and is more generic. This patch increases performance on GSO workload by keeping doorbells to the minimum required. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-30net/mlx4_en: add a missing <net/ip.h> includeEric Dumazet
Abdul Haleem reported a build error on ppc : drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: `struct iphdr` declared inside parameter list [enabled by default] struct iphdr *iph) ^ drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function get_fixed_ipv4_csum: drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing pointer to incomplete type __u8 ipproto = iph->protocol; ^ Fixes: 55469bc6b577 ("drivers: net: remove <net/busy_poll.h> inclusion when not needed") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-25drivers: net: remove <net/busy_poll.h> inclusion when not neededEric Dumazet
Drivers using generic NAPI interface no longer need to include <net/busy_poll.h>, since busy polling was moved to core networking stack long ago. See commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()") for reference. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts were easy to resolve using immediate context mostly, except the cls_u32.c one where I simply too the entire HEAD chunk. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11net/mlx4_core: Fix warnings during boot on driverinit param set failuresMoshe Shemesh
During boot, mlx4_core sets the driverinit configuration parameters and updates the devlink module on the initial values calling devlink_param_driverinit_value_set(). If devlink_param_driverinit_value_set() returns an error mlx4_core reports kernel module warning. This caused false alarm during boot in case kernel was compiled with CONFIG_NET_DEVLINK off. Fix by removing warning reported in case devlink_param_driverinit_value_set() fails. This actually makes the function mlx4_devlink_set_init_value() redundant to using directly devlink_param_driverinit_value_set() and so removed. It fixes the following kernel trace: mlx4_core 0000:00:06.0: devlink set parameter 0 value failed (err = -95) mlx4_core 0000:00:06.0: devlink set parameter 1 value failed (err = -95) mlx4_core 0000:00:06.0: devlink set parameter 4 value failed (err = -95) mlx4_core 0000:00:06.0: devlink set parameter 5 value failed (err = -95) mlx4_core 0000:00:06.0: devlink set parameter 3 value failed (err = -95) Fixes: bd1b51dc66df ("mlx4: Add mlx4 initial parameters table and register it") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09net/mlx4_en: Use minimal rx and tx ring sizes on kdump kernelAlaa Hleihel
When memory is limited (on kdump kernel), reduce size of rx and tx rings. Also reduce the number of rx rings. Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Version bump conflict in batman-adv, take what's in net-next. iavf conflict, adjustment of netdev_ops in net-next conflicting with poll controller method removal in net. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23mlx4: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. mlx4 uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21net/mlx4: Use cpumask_available for eq->affinity_maskNathan Chancellor
Clang warns that the address of a pointer will always evaluated as true in a boolean context: drivers/net/ethernet/mellanox/mlx4/eq.c:243:11: warning: address of array 'eq->affinity_mask' will always evaluate to 'true' [-Wpointer-bool-conversion] if (!eq->affinity_mask || cpumask_empty(eq->affinity_mask)) ~~~~~^~~~~~~~~~~~~ 1 warning generated. Use cpumask_available, introduced in commit f7e30f01a9e2 ("cpumask: Add helper cpumask_available()"), which does the proper checking and avoids this warning. Link: https://github.com/ClangBuiltLinux/linux/issues/86 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12net: ethernet: Use DIV_ROUND_UP instead of reimplementing its functionzhong jiang
DIV_ROUND_UP has implemented the code-opened function. Therefore, just replace the implementation with DIV_ROUND_UP. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07net/mlx4/en_rx: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 114794 ("Missing break in switch") Addresses-Coverity-ID: 114795 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07net/mlx4/mcg: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 114792 ("Missing break in switch") Addresses-Coverity-ID: 114793 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-25net/mlx4_core: Allow MTTs starting at any indexTariq Toukan
Allow obtaining MTTs starting at any index, thus give a better cache utilization. For this, allow setting log_mtts_per_seg to 0, and use this in default. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Anaty Rahamim Bar Kat <anaty@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>