summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2020-07-10net/mlx5e: Move devlink port register and unregister callsVladyslav Tarasiuk
Register devlink ports upon NIC init. TX and RX health reporters handle errors which may occur early on at driver initialization. And because these reporters are to be moved to port context, they require devlink ports to be already registered. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix call to pm_runtime in the suspend/resume functionsNicolas Ferre
The calls to pm_runtime_force_suspend/resume() functions are only relevant if the device is not configured to act as a WoL wakeup source. Add the device_may_wakeup() test before calling them. Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Sergio Prado <sergio.prado@e-labworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix macb_suspend() by removing call to netif_carrier_off()Nicolas Ferre
As we now use the phylink call to phylink_stop() in the non-WoL path, there is no need for this call to netif_carrier_off() anymore. It can disturb the underlying phylink FSM. Fixes: 7897b071ac3b ("net: macb: convert to phylink") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix macb_get/set_wol() when moving to phylinkNicolas Ferre
Keep previous function goals and integrate phylink actions to them. phylink_ethtool_get_wol() is not enough to figure out if Ethernet driver supports Wake-on-Lan. Initialization of "supported" and "wolopts" members is done in phylink function, no need to keep them in calling function. phylink_ethtool_set_wol() return value is considered and determines if the MAC has to handle WoL or not. The case where the PHY doesn't implement WoL leads to the MAC configuring it to provide this feature. Fixes: 7897b071ac3b ("net: macb: convert to phylink") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Antoine Tenart <antoine.tenart@bootlin.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: mark device wake capable when "magic-packet" property presentNicolas Ferre
Change the way the "magic-packet" DT property is handled in the macb_probe() function, matching DT binding documentation. Now we mark the device as "wakeup capable" instead of calling the device_init_wakeup() function that would enable the wakeup source. For Ethernet WoL, enabling the wakeup_source is done by using ethtool and associated macb_set_wol() function that already calls device_set_wakeup_enable() for this purpose. That would reduce power consumption by cutting more clocks if "magic-packet" property is set but WoL is not configured by ethtool. Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Sergio Prado <sergio.prado@e-labworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix wakeup test in runtime suspend/resume routinesNicolas Ferre
Use the proper struct device pointer to check if the wakeup flag and wakeup source are positioned. Use the one passed by function call which is equivalent to &bp->dev->dev.parent. It's preventing the trigger of a spurious interrupt in case the Wake-on-Lan feature is used. Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10bnxt_en: fix NULL dereference in case SR-IOV configuration failsDavide Caratti
we need to set 'active_vfs' back to 0, if something goes wrong during the allocation of SR-IOV resources: otherwise, further VF configurations will wrongly assume that bp->pf.vf[x] are valid memory locations, and commands like the ones in the following sequence: # echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs # ip link set dev ens1f0np0 up # ip link set dev ens1f0np0 vf 0 trust on will cause a kernel crash similar to this: bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV BUG: kernel NULL pointer dereference, address: 0000000000000014 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 43 PID: 2059 Comm: ip Tainted: G I 5.8.0-rc2.upstream+ #871 Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019 RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en] Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89 RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900 RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008 R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000 R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0 FS: 00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: do_setlink+0x994/0xfe0 __rtnl_newlink+0x544/0x8d0 rtnl_newlink+0x47/0x70 rtnetlink_rcv_msg+0x29f/0x350 netlink_rcv_skb+0x4a/0x110 netlink_unicast+0x21d/0x300 netlink_sendmsg+0x329/0x450 sock_sendmsg+0x5b/0x60 ____sys_sendmsg+0x204/0x280 ___sys_sendmsg+0x88/0xd0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x47/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.") Reported-by: Fei Liu <feliu@redhat.com> CC: Jonathan Toppins <jtoppins@redhat.com> CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'mlx5-updates-2020-07-09' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-07-09 This series provides updates to mlx5 CT (connection tracking) offloads For more information please see tag log below. Please pull and let me know if there is any problem. The following conflict is expected when net is merged into net-next: to resolve just use the hunks from net-next. <<<<<<< HEAD (net-next) mlx5_tc_ct_del_ft_entry(ct_priv, entry); kfree(entry); ======= (net) mlx5_tc_ct_entry_del_rules(ct_priv, entry); kfree(entry); >>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af mlx5 connection tracking offloads updates: 1) Restore CT state from lookup in zone instead of tupleid On a miss, Use this zone + 5 tuple taken from the skb, to lookup the CT entry and restore it, instead of the driver allocated tuple id. This improves flow insertion rate by avoiding the allocation of a header rewrite context to maintain the tupleid. 2) Re-use modify header HW objects for identical modify actions. 3) Expand tunnel register mappings Reg_c1 is 32 bits wide. Before this patchset, 24 bit were allocated for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel options mappings. Restoring the ct state from zone lookup instead of tuple id requires reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel mappings. Expand tunnel and tunnel options register mappings to 12 bit each. 4) Trivial cleanup and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10mlx4: convert to new udp_tunnel_nic infraJakub Kicinski
Convert to new infra, make use of the ability to sleep in the callback. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10bnxt: convert to new udp_tunnel_nic infraJakub Kicinski
Convert to new infra, taking advantage of sleeping in callbacks. v2: - use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication that the offload is active. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: convert to new udp_tunnel_nic infraJakub Kicinski
Make use of new common udp_tunnel_nic infra. ixgbe supports IPv4 only, and only single VxLAN and Geneve ports (one each). v2: - split out the RXCSUM feature handling to separate change; - declare structs separately; - use ti.type instead of assuming table 0 is VxLAN; - move setting netdev->udp_tunnel_nic_info to its own switch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: don't clear UDP tunnel ports when RXCSUM is disabledJakub Kicinski
It appears the clearing of UDP tunnel ports when RXCSUM is disabled is unnecessary. Driver will not pay attention to checksum bits if RXCSUM is not set, so we can let the hardware parse the packets. Note that the UDP tunnel port NDO handlers don't pay attention to the state of RXCSUM, so the ports could had been re-programmed, anyway. This cleanup simplifies later conversion patch. v2: - break this out of the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10netdevsim: add UDP tunnel port offload supportJakub Kicinski
Add UDP tunnel port handlers to our fake driver so we can test the core infra. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10udp_tunnel: add central NIC RX port offload infrastructureJakub Kicinski
Cater to devices which: (a) may want to sleep in the callbacks; (b) only have IPv4 support; (c) need all the programming to happen while the netdev is up. Drivers attach UDP tunnel offload info struct to their netdevs, where they declare how many UDP ports of various tunnel types they support. Core takes care of tracking which ports to offload. Use a fixed-size array since this matches what almost all drivers do, and avoids a complexity and uncertainty around memory allocations in an atomic context. Make sure that tunnel drivers don't try to replay the ports when new NIC netdev is registered. Automatic replays would mess up reference counting, and will be removed completely once all drivers are converted. v4: - use a #define NULL to avoid build issues with CONFIG_INET=n. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09net/mlx5e: CT: Fix releasing ft entriesRoi Dayan
Before this commit, on ft flush, ft entries were not removed from the ct_tuple hashtables. Fix it. Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Remove unused function paramSaeed Mahameed
"flow" parameter is not used in __mlx5_tc_ct_flow_offload_clear(), remove it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09net/mlx5e: CT: Return err_ptr from internal functionsSaeed Mahameed
Instead of having to deal with converting between int and ERR_PTR for return values in mlx5_tc_ct_flow_offload(), make the internal helper functions return a ptr to mlx5_flow_handle instead of passing it as output param, this will also avoid gcc confusion and false alarms, thus we remove the redundant ERR_PTR rule initialization. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09net/mlx5e: CT: Expand tunnel register mappingsPaul Blakey
Reg_c1 is 32 bits wide. Originally, 24 bit were allocated for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel options mappings. Restoring the ct state from zone lookup instead of tuple id requires reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel mappings. Expand tunnel and tunnel options register mappings to 12 bit each. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Use mapping for zone restore registerPaul Blakey
Use a single byte mapping for zone restore register (zone matching remains 16 bit). This makes room for using the freed 8 bits on register C1 for mapping more tunnels and tunnel options. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Re-use tuple modify headers for identical modify actionsPaul Blakey
After removing the tupleid register which changed per tuple, tuple modify headers set the ct_state, zone, mark, and label registers. For non-natted tuples going through the same tc rules path, their values will be the same, and all their modify headers will be the same. Re-use tuple modify header when possible, by adding each new modify header to an hahstable, and looking up identical ones before creating a new one. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Export sharing of mod headers to a new filePaul Blakey
Refactor sharing of mod headers to new file and while there, remove spin lock and flows list, as this is only used for warn on. Use the generic API in the next patch to re-use tuple modify headers for identical modify actions, Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Restore ct state from lookup in zone instead of tupleidPaul Blakey
Remove tupleid, and replace it with zone_restore, which is the zone an established tuple sets after match. On miss, Use this zone + tuple taken from the skb, to lookup the ct entry and restore it. This improves flow insertion rate by avoiding the allocation of a header rewrite context. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Don't offload tuple rewrites for established tuplesPaul Blakey
Next patches will remove the tupleid registers that is used to restore the ct state on miss, and instead use the tuple on the missed packet to lookup which state to restore. Disable tuple rewrites after connection tracking. For tuple rewrites, inject a ct_state=-trk match so it won't change the tuple for established flows (+trk) that passed connection tracking, and instead miss to software. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Use netdev_info instead of pr_infoOz Shlomo
The next patch will pass the mlx5e_priv struct to the modify_header_match_supported method. Use this opportunity to refactor the existing pr_info call to a netdev_info call. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Allow header rewrite of 5-tuple and ct clear actionPaul Blakey
With ct clear we don't jump to the ct tables, so header rewrite of 5-tuple can be done in place (and not moved to after the CT action). Check for ct clear action, and if so, allow 5-tuple header rewrite. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Save ct entries tuples in hashtablesPaul Blakey
Save original tuple and natted tuple in two new hashtables. This is a pre-step for restoring ct state after hw miss by performing a 5-tuple lookup on the hash tables. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5: E-switch, When eswitch is unsupported, return -EOPNOTSUPPParav Pandit
When eswitch is unsupported, currently -EPERM error code is returned instead of -EOPNOTSUPP. Due to this VF device's devlink virtual port is not enumerated because port_function_get() callback returned -EPERM instead of -EOPNOTSUPP. Hence, return the error code -EOPNOTSUPP when eswitch is unsupported. Fixes: bd93975353d5 ("net/mlx5: E-switch, Introduce and use eswitch support check helper") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Fix memory leak in cleanupEli Britstein
CT entries are deleted via a workqueue from netfilter. If removing the module before that, the rules are cleaned by the driver itself, but the memory entries for them are not freed. Fix that. Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix port buffers cell size valueEran Ben Elisha
Device unit for port buffers size, xoff_threshold and xon_threshold is cells. Fix a bug in driver where cell unit size was hard-coded to 128 bytes. This hard-coded value is buggy, as it is wrong for some hardware versions. Driver to read cell size from SBCAM register and translate bytes to cell units accordingly. In order to fix the bug, this patch exposes SBCAM (Shared buffer capabilities mask) layout and defines. If SBCAM.cap_cell_size is valid, use it for all bytes to cells calculations. If not valid, fallback to 128. Cell size do not change on the fly per device. Instead of issuing SBCAM access reg command every time such translation is needed, cache it in mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz as a param to every function that needs bytes to cells translation. While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to en_dcbnl.c, as it is only used by that file. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix 50G per lane indicationAya Levin
Some released FW versions mistakenly don't set the capability that 50G per lane link-modes are supported for VFs (ptys_extended_ethernet capability bit). When the capability is unset, read PTYS.ext_eth_proto_capability (always reliable). If PTYS.ext_eth_proto_capability is valid (has a non-zero value) conclude that the HCA supports 50G per lane. Otherwise, conclude that the HCA doesn't support 50G per lane. Fixes: a08b4ed1373d ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crashAya Levin
After function reload, CPU mapping used by aRFS RX is broken, leading to a kernel panic. Fix by moving initialization of rx_cpu_rmap from netdev_init to netdev_attach. IRQ table is re-allocated on mlx5_load, but netdev is not re-initialize. Trace of the panic: [ 22.055672] general protection fault, probably for non-canonical address 0x785634120000ff1c: 0000 [#1] SMP PTI [ 22.065010] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.7.0-rc2-for-upstream-perf-2020-04-21_16-34-03-31 #1 [ 22.067967] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 22.071174] RIP: 0010:get_rps_cpu+0x267/0x300 [ 22.075692] RSP: 0018:ffffc90000244d60 EFLAGS: 00010202 [ 22.076888] RAX: ffff888459b0e400 RBX: 0000000000000000 RCX:0000000000000007 [ 22.078364] RDX: 0000000000008884 RSI: ffff888467cb5b00 RDI:0000000000000000 [ 22.079815] RBP: 00000000ff342b27 R08: 0000000000000007 R09:0000000000000003 [ 22.081289] R10: ffffffffffffffff R11: 00000000000070cc R12:ffff888454900000 [ 22.082767] R13: ffffc90000e5a950 R14: ffffc90000244dc0 R15:0000000000000007 [ 22.084190] FS: 0000000000000000(0000) GS:ffff88846fc80000(0000)knlGS:0000000000000000 [ 22.086161] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.087427] CR2: ffffffffffffffff CR3: 0000000464426003 CR4:0000000000760ee0 [ 22.088888] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000 [ 22.090336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400 [ 22.091764] PKRU: 55555554 [ 22.092618] Call Trace: [ 22.093442] <IRQ> [ 22.094211] ? kvm_clock_get_cycles+0xd/0x10 [ 22.095272] netif_receive_skb_list_internal+0x258/0x2a0 [ 22.096460] gro_normal_list.part.137+0x19/0x40 [ 22.097547] napi_complete_done+0xc6/0x110 [ 22.098685] mlx5e_napi_poll+0x190/0x670 [mlx5_core] [ 22.099859] net_rx_action+0x2a0/0x400 [ 22.100848] __do_softirq+0xd8/0x2a8 [ 22.101829] irq_exit+0xa5/0xb0 [ 22.102750] do_IRQ+0x52/0xd0 [ 22.103654] common_interrupt+0xf/0xf [ 22.104641] </IRQ> Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix VXLAN configuration restore after function reloadAya Levin
When detaching netdev, remove vxlan port configuration using udp_tunnel_drop_rx_info. During function reload, configuration will be restored using udp_tunnel_get_rx_info. This ensures sync between firmware and driver. Use udp_tunnel_get_rx_info even if its physical interface is down. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix usage of rcu-protected pointerVlad Buslov
In mlx5e_configure_flower() flow pointer is protected by rcu read lock. However, after cited commit the pointer is being used outside of rcu read block. Extend the block to protect all pointer accesses. Fixes: 553f9328385d ("net/mlx5e: Support tc block sharing for representors") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mxl5e: Verify that rpriv is not NULLVlad Buslov
In helper function is_flow_rule_duplicate_allowed() verify that rpviv pointer is not NULL before dereferencing it. This can happen when device is in NIC mode and leads to following crash: [90444.046419] BUG: kernel NULL pointer dereference, address: 0000000000000000 [90444.048149] #PF: supervisor read access in kernel mode [90444.049781] #PF: error_code(0x0000) - not-present page [90444.051386] PGD 80000003d35a4067 P4D 80000003d35a4067 PUD 3d35a3067 PMD 0 [90444.053051] Oops: 0000 [#1] SMP PTI [90444.054683] CPU: 16 PID: 31736 Comm: tc Not tainted 5.8.0-rc1+ #1157 [90444.056340] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [90444.058079] RIP: 0010:mlx5e_configure_flower+0x3aa/0x9b0 [mlx5_core] [90444.059753] Code: 24 50 49 8b 95 08 02 00 00 48 b8 00 08 00 00 04 00 00 00 48 21 c2 48 39 c2 74 0a 41 f6 85 0d 02 00 00 20 74 16 48 8b 44 24 20 <48> 8b 00 66 83 78 20 ff 74 07 4d 89 aa e0 00 00 00 48 83 7d 28 00 [90444.063232] RSP: 0018:ffffabe9c61ff768 EFLAGS: 00010246 [90444.065014] RAX: 0000000000000000 RBX: ffff9b13c4c91e80 RCX: 00000000000093fa [90444.066784] RDX: 0000000400000800 RSI: 0000000000000000 RDI: 000000000002d5e0 [90444.068533] RBP: ffff9b174d308468 R08: 0000000000000000 R09: ffff9b17d63003f0 [90444.070285] R10: ffff9b17ea288600 R11: 0000000000000000 R12: ffffabe9c61ff878 [90444.072032] R13: ffff9b174d300000 R14: ffffabe9c61ffbb8 R15: ffff9b174d300880 [90444.073760] FS: 00007f3c23775480(0000) GS:ffff9b13efc80000(0000) knlGS:0000000000000000 [90444.075492] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [90444.077266] CR2: 0000000000000000 CR3: 00000003e2a60002 CR4: 00000000001606e0 [90444.079024] Call Trace: [90444.080753] tc_setup_cb_add+0xca/0x1e0 [90444.082415] fl_hw_replace_filter+0x15f/0x1f0 [cls_flower] [90444.084119] fl_change+0xa59/0x13dc [cls_flower] [90444.085772] ? wait_for_completion+0xa8/0xf0 [90444.087364] tc_new_tfilter+0x3f5/0xa60 [90444.088960] rtnetlink_rcv_msg+0xeb/0x360 [90444.090514] ? __d_lookup_done+0x76/0xe0 [90444.092034] ? proc_alloc_inode+0x16/0x70 [90444.093560] ? prep_new_page+0x8c/0xf0 [90444.095048] ? _cond_resched+0x15/0x30 [90444.096483] ? rtnl_calcit.isra.0+0x110/0x110 [90444.097907] netlink_rcv_skb+0x49/0x110 [90444.099289] netlink_unicast+0x191/0x230 [90444.100629] netlink_sendmsg+0x243/0x480 [90444.101984] sock_sendmsg+0x5e/0x60 [90444.103305] ____sys_sendmsg+0x1f3/0x260 [90444.104597] ? copy_msghdr_from_user+0x5c/0x90 [90444.105916] ? __mod_lruvec_state+0x3c/0xe0 [90444.107210] ___sys_sendmsg+0x81/0xc0 [90444.108484] ? do_filp_open+0xa5/0x100 [90444.109732] ? handle_mm_fault+0x117b/0x1e00 [90444.110970] ? __check_object_size+0x46/0x147 [90444.112205] ? __check_object_size+0x136/0x147 [90444.113402] __sys_sendmsg+0x59/0xa0 [90444.114587] do_syscall_64+0x4d/0x90 [90444.115782] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [90444.116953] RIP: 0033:0x7f3c2393b7b8 [90444.118101] Code: Bad RIP value. [90444.119240] RSP: 002b:00007ffc6ad8e6c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [90444.120408] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3c2393b7b8 [90444.121583] RDX: 0000000000000000 RSI: 00007ffc6ad8e740 RDI: 0000000000000003 [90444.122750] RBP: 000000005eea0c3a R08: 0000000000000001 R09: 00007ffc6ad8e68c [90444.123928] R10: 0000000000404fa8 R11: 0000000000000246 R12: 0000000000000001 [90444.125073] R13: 0000000000000000 R14: 00007ffc6ad92a00 R15: 00000000004866a0 [90444.126221] Modules linked in: act_skbedit act_tunnel_key act_mirred bonding vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_r apl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mlxfw kvm act_ct nf_flow_table nf_nat nf_conntrack irqbypass crct10dif_pclmul nf_defrag_ipv6 igb ipmi_ssif libcrc32c crc32_pclmul crc32c_intel ipmi_si nf_defrag_ipv4 ptp ghash_clmulni_intel mei_me ses iTCO_wdt i2c_i801 pps_core ioatdma iTCO_vendor_support joydev mei enclosure intel_cstate i2c_smbus wmi dca ipmi_devintf intel_uncore lpc_ich ipmi_msghandler pcspkr acpi_pad acpi_power_meter ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas [90444.136253] CR2: 0000000000000000 [90444.137621] ---[ end trace 924af62aa2b151bd ]--- Fixes: 553f9328385d ("net/mlx5e: Support tc block sharing for representors") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5: E-Switch, Fix vlan or qos setting in legacy modeVu Pham
Refactoring eswitch ingress acl codes accidentally inserts extra memset zero that removes vlan and/or qos setting in legacy mode. Fixes: 07bab9502641 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes") Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5: Fix eeprom support for SFP moduleEran Ben Elisha
Fix eeprom SFP query support by setting i2c_addr, offset and page number correctly. Unlike QSFP modules, SFP eeprom params are as follow: - i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511. - Page number is always zero. - Page offset is always relative to zero. As part of eeprom query, query the module ID (SFP / QSFP*) via helper function to set the params accordingly. In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid unnecessary casting. Fixes: a708fb7b1f8d ("net/mlx5e: ethtool, Add support for EEPROM high pages query") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09PCI: Move PCI_VENDOR_ID_REDHAT definition to pci_ids.hHuacai Chen
Instead of duplicating the PCI_VENDOR_ID_REDHAT definition everywhere, move it to include/linux/pci_ids.h. [bhelgaas: also update MDPY_PCI_VENDOR_ID] Link: https://lore.kernel.org/r/1594195170-11119-1-git-send-email-chenhc@lemote.com Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2020-07-09devlink: Move input checks from driver to devlinkDanielle Ratson
Currently, all the input checks are done in driver. After adding the split capability to devlink port, move the checks to devlink. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09devlink: Add a new devlink port split ability attribute and pass to netlinkDanielle Ratson
Add a new attribute that indicates the split ability of devlink port. Drivers are expected to set it via devlink_port_attrs_set(), before registering the port. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09mlxsw: Set port split ability attribute in driverDanielle Ratson
Currently, port attributes like flavour, port number and whether the port was split are set when initializing a port. Set the split ability of the port as well, based on port_mapping->width field and split attribute of devlink port in spectrum, so that it could be easily passed to devlink in the next patch. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09devlink: Add a new devlink port lanes attribute and pass to netlinkDanielle Ratson
Add a new devlink port attribute that indicates the port's number of lanes. Drivers are expected to set it via devlink_port_attrs_set(), before registering the port. The attribute is not passed to user space in case the number of lanes is invalid (0). Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09mlxsw: Set number of port lanes attribute in driverDanielle Ratson
Currently, port attributes like flavour, port number and whether the port was split are set when initializing a port. Set the number of lanes of the port as well so that it could be easily passed to devlink in the next patch. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09devlink: Replace devlink_port_attrs_set parameters with a structDanielle Ratson
Currently, devlink_port_attrs_set accepts a long list of parameters, that most of them are devlink port's attributes. Use the devlink_port_attrs struct to replace the relevant parameters. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09net: phy: mscc: fix ptr_ret.cocci warningskernel test robot
drivers/net/phy/mscc/mscc_ptp.c:1496:1-3: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci Fixes: 7d272e63e097 ("net: phy: mscc: timestamping and PHC support") CC: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09net: systemport: fix double shift of a vlan_tci by VLAN_PRIO_SHIFTColin Ian King
Currently the u16 skb->vlan_tci is being right shifted twice by VLAN_PRIO_SHIFT, once in the macro skb_vlan_tag_get_pri and explicitly by VLAN_PRIO_SHIFT afterwards. The combined shift amount is larger than the u16 so the end result is always zero. Remove the second explicit shift as this is extraneous. Fixes: 6e9fdb60d362 ("net: systemport: Add support for VLAN transmit acceleration") Addresses-Coverity: ("Operands don't affect result") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09net: enetc: use eth_broadcast_addr() to assign broadcastXu Wang
This patch is to use eth_broadcast_addr() to assign broadcast address insetad of memset(). Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09qed: Populate nvm-file attributes while reading nvm config partition.Sudarsana Reddy Kalluru
NVM config file address will be modified when the MBI image is upgraded. Driver would return stale config values if user reads the nvm-config (via ethtool -d) in this state. The fix is to re-populate nvm attribute info while reading the nvm config values/partition. Changes from previous version: ------------------------------- v3: Corrected the formatting in 'Fixes' tag. v2: Added 'Fixes' tag. Fixes: 1ac4329a1cff ("qed: Add configuration information to register dump and debug data") Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08bonding: don't need RTNL for ipsec helpersJarod Wilson
The bond_ipsec_* helpers don't need RTNL, and can potentially get called without it being held, so switch from rtnl_dereference() to rcu_dereference() to access bond struct data. Lightly tested with xfrm bonding, no problems found, should address the syzkaller bug referenced below. Reported-by: syzbot+582c98032903dcc04816@syzkaller.appspotmail.com CC: Huy Nguyen <huyn@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08cxgb4: fix all-mask IP address comparisonRahul Lakkireddy
Convert all-mask IP address to Big Endian, instead, for comparison. Fixes: f286dd8eaad5 ("cxgb4: use correct type for all-mask IP address comparison") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08bonding: deal with xfrm state in all modes and add more error-checkingJarod Wilson
It's possible that device removal happens when the bond is in non-AB mode, and addition happens in AB mode, so bond_ipsec_del_sa() never gets called, which leaves security associations in an odd state if bond_ipsec_add_sa() then gets called after switching the bond into AB. Just call add and delete universally for all modes to keep things consistent. However, it's also possible that this code gets called when the system is shutting down, and the xfrm subsystem has already been disconnected from the bond device, so we need to do some error-checking and bail, lest we hit a null ptr deref. Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load") CC: Huy Nguyen <huyn@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>