From 95ce158b6c93b28842b54b42ad1cb221b9844062 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Thu, 13 Jul 2023 00:34:05 +0200 Subject: dsa: mv88e6xxx: Do a final check before timing out I get sporadic timeouts from the driver when using the MV88E6352. Reading the status again after the loop fixes the problem: the operation is successful but goes undetected. Some added prints show things like this: [ 58.356209] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1b reg 0b, mask 8000, val 0000, data c000 [ 58.367487] mv88e6085 mdio_mux-0.1:00: Timeout waiting for ATU op 4000, fid 0001 (...) [ 61.826293] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1c reg 18, mask 8000, val 0000, data 9860 [ 61.837560] mv88e6085 mdio_mux-0.1:00: Timeout waiting for PHY command 1860 to complete The reason is probably not the commands: I think those are mostly fine with the 50+50ms timeout, but the problem appears when OpenWrt brings up several interfaces in parallel on a system with 7 populated ports: if one of them take more than 50 ms and waits one or more of the others can get stuck on the mutex for the switch and then this can easily multiply. As we sleep and wait, the function loop needs a final check after exiting the loop if we were successful. Suggested-by: Andrew Lunn Cc: Tobias Waldekranz Fixes: 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance of busy bit polling") Signed-off-by: Linus Walleij Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/20230712223405.861899-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski --- drivers/net/dsa/mv88e6xxx/chip.c | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'drivers/net') diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 8b51756bd805..c7d51a539451 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -109,6 +109,13 @@ int mv88e6xxx_wait_mask(struct mv88e6xxx_chip *chip, int addr, int reg, usleep_range(1000, 2000); } + err = mv88e6xxx_read(chip, addr, reg, &data); + if (err) + return err; + + if ((data & mask) == val) + return 0; + dev_err(chip->dev, "Timeout while waiting for switch\n"); return -ETIMEDOUT; } -- cgit From 5e1627cb43ddf1b24b92eb26f8d958a3f5676ccb Mon Sep 17 00:00:00 2001 From: Alan Stern Date: Wed, 12 Jul 2023 10:15:10 -0400 Subject: net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb The syzbot fuzzer identified a problem in the usbnet driver: usb 1-1: BOGUS urb xfer, pipe 3 != type 1 WARNING: CPU: 0 PID: 754 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Modules linked in: CPU: 0 PID: 754 Comm: kworker/0:2 Not tainted 6.4.0-rc7-syzkaller-00014-g692b7dc87ca6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Workqueue: mld mld_ifc_work RIP: 0010:usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Code: 7c 24 18 e8 2c b4 5b fb 48 8b 7c 24 18 e8 42 07 f0 fe 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 c9 fc 8a e8 5a 6f 23 fb <0f> 0b e9 58 f8 ff ff e8 fe b3 5b fb 48 81 c5 c0 05 00 00 e9 84 f7 RSP: 0018:ffffc9000463f568 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88801eb28000 RSI: ffffffff814c03b7 RDI: 0000000000000001 RBP: ffff8881443b7190 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003 R13: ffff88802a77cb18 R14: 0000000000000003 R15: ffff888018262500 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556a99c15a18 CR3: 0000000028c71000 CR4: 0000000000350ef0 Call Trace: usbnet_start_xmit+0xfe5/0x2190 drivers/net/usb/usbnet.c:1453 __netdev_start_xmit include/linux/netdevice.h:4918 [inline] netdev_start_xmit include/linux/netdevice.h:4932 [inline] xmit_one net/core/dev.c:3578 [inline] dev_hard_start_xmit+0x187/0x700 net/core/dev.c:3594 ... This bug is caused by the fact that usbnet trusts the bulk endpoint addresses its probe routine receives in the driver_info structure, and it does not check to see that these endpoints actually exist and have the expected type and directions. The fix is simply to add such a check. Reported-and-tested-by: syzbot+63ee658b9a100ffadbe2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000a56e9105d0cec021@google.com/ Signed-off-by: Alan Stern CC: Oliver Neukum Link: https://lore.kernel.org/r/ea152b6d-44df-4f8a-95c6-4db51143dcc1@rowland.harvard.edu Signed-off-by: Jakub Kicinski --- drivers/net/usb/usbnet.c | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'drivers/net') diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 283ffddda821..2d14b0d78541 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -1775,6 +1775,10 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) } else if (!info->in || !info->out) status = usbnet_get_endpoints (dev, udev); else { + u8 ep_addrs[3] = { + info->in + USB_DIR_IN, info->out + USB_DIR_OUT, 0 + }; + dev->in = usb_rcvbulkpipe (xdev, info->in); dev->out = usb_sndbulkpipe (xdev, info->out); if (!(info->flags & FLAG_NO_SETINT)) @@ -1784,6 +1788,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) else status = 0; + if (status == 0 && !usb_check_bulk_endpoints(udev, ep_addrs)) + status = -EINVAL; } if (status >= 0 && dev->status) status = init_status (dev, udev); -- cgit From 9845217d60d01d151b45842ef2017a65e8f39f5a Mon Sep 17 00:00:00 2001 From: Mark Brown Date: Wed, 12 Jul 2023 12:16:16 +0100 Subject: net: dsa: ar9331: Use explict flags for regmap single read/write The at9331 is only able to read or write a single register at once. The driver has a custom regmap bus and chooses to tell the regmap core about this by reporting the maximum transfer sizes rather than the explicit flags that exist at the regmap level. Since there are a number of problems with the raw transfer limits and the regmap level flags are better integrated anyway convert the driver to use the flags. No functional change. Signed-off-by: Mark Brown Reviewed-by: Simon Horman Signed-off-by: David S. Miller --- drivers/net/dsa/qca/ar9331.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/dsa/qca/ar9331.c b/drivers/net/dsa/qca/ar9331.c index b2bf78ac485e..3b0937031499 100644 --- a/drivers/net/dsa/qca/ar9331.c +++ b/drivers/net/dsa/qca/ar9331.c @@ -1002,6 +1002,8 @@ static const struct regmap_config ar9331_mdio_regmap_config = { .val_bits = 32, .reg_stride = 4, .max_register = AR9331_SW_REG_PAGE, + .use_single_read = true, + .use_single_write = true, .ranges = ar9331_regmap_range, .num_ranges = ARRAY_SIZE(ar9331_regmap_range), @@ -1018,8 +1020,6 @@ static struct regmap_bus ar9331_sw_bus = { .val_format_endian_default = REGMAP_ENDIAN_NATIVE, .read = ar9331_mdio_read, .write = ar9331_sw_bus_write, - .max_raw_read = 4, - .max_raw_write = 4, }; static int ar9331_sw_probe(struct mdio_device *mdiodev) -- cgit From b685f1a58956fa36cc01123f253351b25bfacfda Mon Sep 17 00:00:00 2001 From: Tanmay Patil Date: Wed, 12 Jul 2023 16:36:57 +0530 Subject: net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field() CPSW ALE has 75 bit ALE entries which are stored within three 32 bit words. The cpsw_ale_get_field() and cpsw_ale_set_field() functions assume that the field will be strictly contained within one word. However, this is not guaranteed to be the case and it is possible for ALE field entries to span across up to two words at the most. Fix the methods to handle getting/setting fields spanning up to two words. Fixes: db82173f23c5 ("netdev: driver: ethernet: add cpsw address lookup engine support") Signed-off-by: Tanmay Patil [s-vadapalli@ti.com: rephrased commit message and added Fixes tag] Signed-off-by: Siddharth Vadapalli Signed-off-by: David S. Miller --- drivers/net/ethernet/ti/cpsw_ale.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 0c5e783e574c..64bf22cd860c 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -106,23 +106,37 @@ struct cpsw_ale_dev_id { static inline int cpsw_ale_get_field(u32 *ale_entry, u32 start, u32 bits) { - int idx; + int idx, idx2; + u32 hi_val = 0; idx = start / 32; + idx2 = (start + bits - 1) / 32; + /* Check if bits to be fetched exceed a word */ + if (idx != idx2) { + idx2 = 2 - idx2; /* flip */ + hi_val = ale_entry[idx2] << ((idx2 * 32) - start); + } start -= idx * 32; idx = 2 - idx; /* flip */ - return (ale_entry[idx] >> start) & BITMASK(bits); + return (hi_val + (ale_entry[idx] >> start)) & BITMASK(bits); } static inline void cpsw_ale_set_field(u32 *ale_entry, u32 start, u32 bits, u32 value) { - int idx; + int idx, idx2; value &= BITMASK(bits); - idx = start / 32; + idx = start / 32; + idx2 = (start + bits - 1) / 32; + /* Check if bits to be set exceed a word */ + if (idx != idx2) { + idx2 = 2 - idx2; /* flip */ + ale_entry[idx2] &= ~(BITMASK(bits + start - (idx2 * 32))); + ale_entry[idx2] |= (value >> ((idx2 * 32) - start)); + } start -= idx * 32; - idx = 2 - idx; /* flip */ + idx = 2 - idx; /* flip */ ale_entry[idx] &= ~(BITMASK(bits) << start); ale_entry[idx] |= (value << start); } -- cgit From 1d6d537dc55d1f42d16290f00157ac387985b95b Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Thu, 13 Jul 2023 03:42:29 +0100 Subject: net: ethernet: mtk_eth_soc: handle probe deferral Move the call to of_get_ethdev_address to mtk_add_mac which is part of the probe function and can hence itself return -EPROBE_DEFER should of_get_ethdev_address return -EPROBE_DEFER. This allows us to entirely get rid of the mtk_init function. The problem of of_get_ethdev_address returning -EPROBE_DEFER surfaced in situations in which the NVMEM provider holding the MAC address has not yet be loaded at the time mtk_eth_soc is initially probed. In this case probing of mtk_eth_soc should be deferred instead of falling back to use a random MAC address, so once the NVMEM provider becomes available probing can be repeated. Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet") Signed-off-by: Daniel Golle Signed-off-by: David S. Miller --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 834c644b67db..2d15342c260a 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -3846,23 +3846,6 @@ static int mtk_hw_deinit(struct mtk_eth *eth) return 0; } -static int __init mtk_init(struct net_device *dev) -{ - struct mtk_mac *mac = netdev_priv(dev); - struct mtk_eth *eth = mac->hw; - int ret; - - ret = of_get_ethdev_address(mac->of_node, dev); - if (ret) { - /* If the mac address is invalid, use random mac address */ - eth_hw_addr_random(dev); - dev_err(eth->dev, "generated random MAC address %pM\n", - dev->dev_addr); - } - - return 0; -} - static void mtk_uninit(struct net_device *dev) { struct mtk_mac *mac = netdev_priv(dev); @@ -4278,7 +4261,6 @@ static const struct ethtool_ops mtk_ethtool_ops = { }; static const struct net_device_ops mtk_netdev_ops = { - .ndo_init = mtk_init, .ndo_uninit = mtk_uninit, .ndo_open = mtk_open, .ndo_stop = mtk_stop, @@ -4340,6 +4322,17 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np) mac->hw = eth; mac->of_node = np; + err = of_get_ethdev_address(mac->of_node, eth->netdev[id]); + if (err == -EPROBE_DEFER) + return err; + + if (err) { + /* If the mac address is invalid, use random mac address */ + eth_hw_addr_random(eth->netdev[id]); + dev_err(eth->dev, "generated random MAC address %pM\n", + eth->netdev[id]->dev_addr); + } + memset(mac->hwlro_ip, 0, sizeof(mac->hwlro_ip)); mac->hwlro_ip_cnt = 0; -- cgit From 4ad23d2368ccce6da74edc74e69300a4d75a2379 Mon Sep 17 00:00:00 2001 From: Wang Ming Date: Thu, 13 Jul 2023 17:51:06 +0800 Subject: bna: Remove error checking for debugfs_create_dir() It is expected that most callers should _ignore_ the errors return by debugfs_create_dir() in bnad_debugfs_init(). Signed-off-by: Wang Ming Signed-off-by: David S. Miller --- drivers/net/ethernet/brocade/bna/bnad_debugfs.c | 5 ----- 1 file changed, 5 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/brocade/bna/bnad_debugfs.c b/drivers/net/ethernet/brocade/bna/bnad_debugfs.c index 04ad0f2b9677..7246e13dd559 100644 --- a/drivers/net/ethernet/brocade/bna/bnad_debugfs.c +++ b/drivers/net/ethernet/brocade/bna/bnad_debugfs.c @@ -512,11 +512,6 @@ bnad_debugfs_init(struct bnad *bnad) if (!bnad->port_debugfs_root) { bnad->port_debugfs_root = debugfs_create_dir(name, bna_debugfs_root); - if (!bnad->port_debugfs_root) { - netdev_warn(bnad->netdev, - "debugfs root dir creation failed\n"); - return; - } atomic_inc(&bna_debugfs_port_count); -- cgit From a822551c51f0dc063305ca16b9dd0dec7fe1f259 Mon Sep 17 00:00:00 2001 From: Wang Ming Date: Thu, 13 Jul 2023 20:16:03 +0800 Subject: net: ethernet: Remove repeating expression Identify issues that arise by using the tests/doublebitand.cocci semantic patch. Need to remove duplicate expression in if statement. Signed-off-by: Wang Ming Reviewed-by: Jiawen Wu Signed-off-by: David S. Miller --- drivers/net/ethernet/wangxun/libwx/wx_hw.c | 1 - 1 file changed, 1 deletion(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c index 39a9aeee7aab..6321178fc814 100644 --- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c +++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c @@ -1511,7 +1511,6 @@ static void wx_configure_rx(struct wx *wx) psrtype = WX_RDB_PL_CFG_L4HDR | WX_RDB_PL_CFG_L3HDR | WX_RDB_PL_CFG_L2HDR | - WX_RDB_PL_CFG_TUN_TUNHDR | WX_RDB_PL_CFG_TUN_TUNHDR; wr32(wx, WX_RDB_PL_CFG(0), psrtype); -- cgit From 24a3298ac9e6bd8de838ab79f7868207170d556d Mon Sep 17 00:00:00 2001 From: Petr Oros Date: Mon, 19 Jun 2023 12:58:13 +0200 Subject: ice: Unregister netdev and devlink_port only once Since commit 6624e780a577fc ("ice: split ice_vsi_setup into smaller functions") ice_vsi_release does things twice. There is unregister netdev which is unregistered in ice_deinit_eth also. It also unregisters the devlink_port twice which is also unregistered in ice_deinit_eth(). This double deregistration is hidden because devl_port_unregister ignores the return value of xa_erase. [ 68.642167] Call Trace: [ 68.650385] ice_devlink_destroy_pf_port+0xe/0x20 [ice] [ 68.655656] ice_vsi_release+0x445/0x690 [ice] [ 68.660147] ice_deinit+0x99/0x280 [ice] [ 68.664117] ice_remove+0x1b6/0x5c0 [ice] [ 171.103841] Call Trace: [ 171.109607] ice_devlink_destroy_pf_port+0xf/0x20 [ice] [ 171.114841] ice_remove+0x158/0x270 [ice] [ 171.118854] pci_device_remove+0x3b/0xc0 [ 171.122779] device_release_driver_internal+0xc7/0x170 [ 171.127912] driver_detach+0x54/0x8c [ 171.131491] bus_remove_driver+0x77/0xd1 [ 171.135406] pci_unregister_driver+0x2d/0xb0 [ 171.139670] ice_module_exit+0xc/0x55f [ice] Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Signed-off-by: Petr Oros Reviewed-by: Maciej Fijalkowski Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_lib.c | 27 --------------------------- 1 file changed, 27 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 00e3afd507a4..0054d7e64ec3 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2972,39 +2972,12 @@ int ice_vsi_release(struct ice_vsi *vsi) return -ENODEV; pf = vsi->back; - /* do not unregister while driver is in the reset recovery pending - * state. Since reset/rebuild happens through PF service task workqueue, - * it's not a good idea to unregister netdev that is associated to the - * PF that is running the work queue items currently. This is done to - * avoid check_flush_dependency() warning on this wq - */ - if (vsi->netdev && !ice_is_reset_in_progress(pf->state) && - (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state))) { - unregister_netdev(vsi->netdev); - clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state); - } - - if (vsi->type == ICE_VSI_PF) - ice_devlink_destroy_pf_port(pf); - if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) ice_rss_clean(vsi); ice_vsi_close(vsi); ice_vsi_decfg(vsi); - if (vsi->netdev) { - if (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state)) { - unregister_netdev(vsi->netdev); - clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state); - } - if (test_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state)) { - free_netdev(vsi->netdev); - vsi->netdev = NULL; - clear_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state); - } - } - /* retain SW VSI data structure since it is needed to unregister and * free VSI netdev when PF is not in reset recovery pending state,\ * for ex: during rmmod. -- cgit From b3e7b3a6ee92ab927f750a6b19615ce88ece808f Mon Sep 17 00:00:00 2001 From: Michal Swiatkowski Date: Thu, 6 Jul 2023 08:25:51 +0200 Subject: ice: prevent NULL pointer deref during reload Calling ethtool during reload can lead to call trace, because VSI isn't configured for some time, but netdev is alive. To fix it add rtnl lock for VSI deconfig and config. Set ::num_q_vectors to 0 after freeing and add a check for ::tx/rx_rings in ring related ethtool ops. Add proper unroll of filters in ice_start_eth(). Reproduction: $watch -n 0.1 -d 'ethtool -g enp24s0f0np0' $devlink dev reload pci/0000:18:00.0 action driver_reinit Call trace before fix: [66303.926205] BUG: kernel NULL pointer dereference, address: 0000000000000000 [66303.926259] #PF: supervisor read access in kernel mode [66303.926286] #PF: error_code(0x0000) - not-present page [66303.926311] PGD 0 P4D 0 [66303.926332] Oops: 0000 [#1] PREEMPT SMP PTI [66303.926358] CPU: 4 PID: 933821 Comm: ethtool Kdump: loaded Tainted: G OE 6.4.0-rc5+ #1 [66303.926400] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018 [66303.926446] RIP: 0010:ice_get_ringparam+0x22/0x50 [ice] [66303.926649] Code: 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 c0 09 00 00 c7 46 04 e0 1f 00 00 c7 46 10 e0 1f 00 00 48 8b 50 20 <48> 8b 12 0f b7 52 3a 89 56 14 48 8b 40 28 48 8b 00 0f b7 40 58 48 [66303.926722] RSP: 0018:ffffad40472f39c8 EFLAGS: 00010246 [66303.926749] RAX: ffff98a8ada05828 RBX: ffff98a8c46dd060 RCX: ffffad40472f3b48 [66303.926781] RDX: 0000000000000000 RSI: ffff98a8c46dd068 RDI: ffff98a8b23c4000 [66303.926811] RBP: ffffad40472f3b48 R08: 00000000000337b0 R09: 0000000000000000 [66303.926843] R10: 0000000000000001 R11: 0000000000000100 R12: ffff98a8b23c4000 [66303.926874] R13: ffff98a8c46dd060 R14: 000000000000000f R15: ffffad40472f3a50 [66303.926906] FS: 00007f6397966740(0000) GS:ffff98b390900000(0000) knlGS:0000000000000000 [66303.926941] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [66303.926967] CR2: 0000000000000000 CR3: 000000011ac20002 CR4: 00000000007706e0 [66303.926999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [66303.927029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [66303.927060] PKRU: 55555554 [66303.927075] Call Trace: [66303.927094] [66303.927111] ? __die+0x23/0x70 [66303.927140] ? page_fault_oops+0x171/0x4e0 [66303.927176] ? exc_page_fault+0x7f/0x180 [66303.927209] ? asm_exc_page_fault+0x26/0x30 [66303.927244] ? ice_get_ringparam+0x22/0x50 [ice] [66303.927433] rings_prepare_data+0x62/0x80 [66303.927469] ethnl_default_doit+0xe2/0x350 [66303.927501] genl_family_rcv_msg_doit.isra.0+0xe3/0x140 [66303.927538] genl_rcv_msg+0x1b1/0x2c0 [66303.927561] ? __pfx_ethnl_default_doit+0x10/0x10 [66303.927590] ? __pfx_genl_rcv_msg+0x10/0x10 [66303.927615] netlink_rcv_skb+0x58/0x110 [66303.927644] genl_rcv+0x28/0x40 [66303.927665] netlink_unicast+0x19e/0x290 [66303.927691] netlink_sendmsg+0x254/0x4d0 [66303.927717] sock_sendmsg+0x93/0xa0 [66303.927743] __sys_sendto+0x126/0x170 [66303.927780] __x64_sys_sendto+0x24/0x30 [66303.928593] do_syscall_64+0x5d/0x90 [66303.929370] ? __count_memcg_events+0x60/0xa0 [66303.930146] ? count_memcg_events.constprop.0+0x1a/0x30 [66303.930920] ? handle_mm_fault+0x9e/0x350 [66303.931688] ? do_user_addr_fault+0x258/0x740 [66303.932452] ? exc_page_fault+0x7f/0x180 [66303.933193] entry_SYSCALL_64_after_hwframe+0x72/0xdc Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Reviewed-by: Przemek Kitszel Signed-off-by: Michal Swiatkowski Reviewed-by: Simon Horman Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_base.c | 2 ++ drivers/net/ethernet/intel/ice/ice_ethtool.c | 13 +++++++++++-- drivers/net/ethernet/intel/ice/ice_main.c | 10 ++++++++-- 3 files changed, 21 insertions(+), 4 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index 4a12316f7b46..b678bdf96f3a 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -800,6 +800,8 @@ void ice_vsi_free_q_vectors(struct ice_vsi *vsi) ice_for_each_q_vector(vsi, v_idx) ice_free_q_vector(vsi, v_idx); + + vsi->num_q_vectors = 0; } /** diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index 8d5cbbd0b3d5..ad4d4702129f 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -2681,8 +2681,13 @@ ice_get_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring, ring->rx_max_pending = ICE_MAX_NUM_DESC; ring->tx_max_pending = ICE_MAX_NUM_DESC; - ring->rx_pending = vsi->rx_rings[0]->count; - ring->tx_pending = vsi->tx_rings[0]->count; + if (vsi->tx_rings && vsi->rx_rings) { + ring->rx_pending = vsi->rx_rings[0]->count; + ring->tx_pending = vsi->tx_rings[0]->count; + } else { + ring->rx_pending = 0; + ring->tx_pending = 0; + } /* Rx mini and jumbo rings are not supported */ ring->rx_mini_max_pending = 0; @@ -2716,6 +2721,10 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring, return -EINVAL; } + /* Return if there is no rings (device is reloading) */ + if (!vsi->tx_rings || !vsi->rx_rings) + return -EBUSY; + new_tx_cnt = ALIGN(ring->tx_pending, ICE_REQ_DESC_MULTIPLE); if (new_tx_cnt != ring->tx_pending) netdev_info(netdev, "Requested Tx descriptor count rounded up to %d\n", diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 19a5e7f3a075..f02d44455772 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -4430,9 +4430,9 @@ static int ice_start_eth(struct ice_vsi *vsi) if (err) return err; - rtnl_lock(); err = ice_vsi_open(vsi); - rtnl_unlock(); + if (err) + ice_fltr_remove_all(vsi); return err; } @@ -4895,6 +4895,7 @@ int ice_load(struct ice_pf *pf) params = ice_vsi_to_params(vsi); params.flags = ICE_VSI_FLAG_INIT; + rtnl_lock(); err = ice_vsi_cfg(vsi, ¶ms); if (err) goto err_vsi_cfg; @@ -4902,6 +4903,7 @@ int ice_load(struct ice_pf *pf) err = ice_start_eth(ice_get_main_vsi(pf)); if (err) goto err_start_eth; + rtnl_unlock(); err = ice_init_rdma(pf); if (err) @@ -4916,9 +4918,11 @@ int ice_load(struct ice_pf *pf) err_init_rdma: ice_vsi_close(ice_get_main_vsi(pf)); + rtnl_lock(); err_start_eth: ice_vsi_decfg(ice_get_main_vsi(pf)); err_vsi_cfg: + rtnl_unlock(); ice_deinit_dev(pf); return err; } @@ -4931,8 +4935,10 @@ void ice_unload(struct ice_pf *pf) { ice_deinit_features(pf); ice_deinit_rdma(pf); + rtnl_lock(); ice_stop_eth(ice_get_main_vsi(pf)); ice_vsi_decfg(ice_get_main_vsi(pf)); + rtnl_unlock(); ice_deinit_dev(pf); } -- cgit From 4bdf79d686b49ac49373b36466acfb93972c7d7c Mon Sep 17 00:00:00 2001 From: Tristram Ha Date: Thu, 13 Jul 2023 17:46:22 -0700 Subject: net: dsa: microchip: correct KSZ8795 static MAC table access The KSZ8795 driver code was modified to use on KSZ8863/73, which has different register definitions. Some of the new KSZ8795 register information are wrong compared to previous code. KSZ8795 also behaves differently in that the STATIC_MAC_TABLE_USE_FID and STATIC_MAC_TABLE_FID bits are off by 1 when doing MAC table reading than writing. To compensate that a special code was added to shift the register value by 1 before applying those bits. This is wrong when the code is running on KSZ8863, so this special code is only executed when KSZ8795 is detected. Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Tristram Ha Reviewed-by: Horatiu Vultur Reviewed-by: Simon Horman Signed-off-by: David S. Miller --- drivers/net/dsa/microchip/ksz8795.c | 8 +++++++- drivers/net/dsa/microchip/ksz_common.c | 8 ++++---- drivers/net/dsa/microchip/ksz_common.h | 7 +++++++ 3 files changed, 18 insertions(+), 5 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c index 84d502589f8e..91aba470fb2f 100644 --- a/drivers/net/dsa/microchip/ksz8795.c +++ b/drivers/net/dsa/microchip/ksz8795.c @@ -506,7 +506,13 @@ static int ksz8_r_sta_mac_table(struct ksz_device *dev, u16 addr, (data_hi & masks[STATIC_MAC_TABLE_FWD_PORTS]) >> shifts[STATIC_MAC_FWD_PORTS]; alu->is_override = (data_hi & masks[STATIC_MAC_TABLE_OVERRIDE]) ? 1 : 0; - data_hi >>= 1; + + /* KSZ8795 family switches have STATIC_MAC_TABLE_USE_FID and + * STATIC_MAC_TABLE_FID definitions off by 1 when doing read on the + * static MAC table compared to doing write. + */ + if (ksz_is_ksz87xx(dev)) + data_hi >>= 1; alu->is_static = true; alu->is_use_fid = (data_hi & masks[STATIC_MAC_TABLE_USE_FID]) ? 1 : 0; alu->fid = (data_hi & masks[STATIC_MAC_TABLE_FID]) >> diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 813b91a816bb..b18cd170ec06 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -331,13 +331,13 @@ static const u32 ksz8795_masks[] = { [STATIC_MAC_TABLE_VALID] = BIT(21), [STATIC_MAC_TABLE_USE_FID] = BIT(23), [STATIC_MAC_TABLE_FID] = GENMASK(30, 24), - [STATIC_MAC_TABLE_OVERRIDE] = BIT(26), - [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(24, 20), + [STATIC_MAC_TABLE_OVERRIDE] = BIT(22), + [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(20, 16), [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(6, 0), - [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(8), + [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7), [DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7), [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 29), - [DYNAMIC_MAC_TABLE_FID] = GENMASK(26, 20), + [DYNAMIC_MAC_TABLE_FID] = GENMASK(22, 16), [DYNAMIC_MAC_TABLE_SRC_PORT] = GENMASK(26, 24), [DYNAMIC_MAC_TABLE_TIMESTAMP] = GENMASK(28, 27), [P_MII_TX_FLOW_CTRL] = BIT(5), diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h index 28444e5924f9..a4de58847dea 100644 --- a/drivers/net/dsa/microchip/ksz_common.h +++ b/drivers/net/dsa/microchip/ksz_common.h @@ -601,6 +601,13 @@ static inline void ksz_regmap_unlock(void *__mtx) mutex_unlock(mtx); } +static inline bool ksz_is_ksz87xx(struct ksz_device *dev) +{ + return dev->chip_id == KSZ8795_CHIP_ID || + dev->chip_id == KSZ8794_CHIP_ID || + dev->chip_id == KSZ8765_CHIP_ID; +} + static inline bool ksz_is_ksz88x3(struct ksz_device *dev) { return dev->chip_id == KSZ8830_CHIP_ID; -- cgit From 162d626f3013215b82b6514ca14f20932c7ccce5 Mon Sep 17 00:00:00 2001 From: Heiner Kallweit Date: Fri, 14 Jul 2023 07:39:36 +0200 Subject: r8169: fix ASPM-related problem for chip version 42 and 43 Referenced commit missed that for chip versions 42 and 43 ASPM remained disabled in the respective rtl_hw_start_...() routines. This resulted in problems as described in the referenced bug ticket. Therefore re-instantiate the previous logic. Fixes: 5fc3f6c90cca ("r8169: consolidate disabling ASPM before EPHY access") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217635 Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller --- drivers/net/ethernet/realtek/r8169_main.c | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 9445f04f8d48..fce4a2b908c2 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2747,6 +2747,13 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable) return; if (enable) { + /* On these chip versions ASPM can even harm + * bus communication of other PCI devices. + */ + if (tp->mac_version == RTL_GIGA_MAC_VER_42 || + tp->mac_version == RTL_GIGA_MAC_VER_43) + return; + rtl_mod_config5(tp, 0, ASPM_en); rtl_mod_config2(tp, 0, ClkReqEn); -- cgit From 2603be9e8167ddc7bea95dcfab9ffc33414215aa Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Fri, 7 Jul 2023 13:43:10 +0200 Subject: can: gs_usb: gs_can_open(): improve error handling The gs_usb driver handles USB devices with more than 1 CAN channel. The RX path for all channels share the same bulk endpoint (the transmitted bulk data encodes the channel number). These per-device resources are allocated and submitted by the first opened channel. During this allocation, the resources are either released immediately in case of a failure or the URBs are anchored. All anchored URBs are finally killed with gs_usb_disconnect(). Currently, gs_can_open() returns with an error if the allocation of a URB or a buffer fails. However, if usb_submit_urb() fails, the driver continues with the URBs submitted so far, even if no URBs were successfully submitted. Treat every error as fatal and free all allocated resources immediately. Switch to goto-style error handling, to prepare the driver for more per-device resource allocation. Cc: stable@vger.kernel.org Cc: John Whittington Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-1-9017cefcd9d5@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/gs_usb.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c index d476c2884008..85b7b59c8426 100644 --- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -833,6 +833,7 @@ static int gs_can_open(struct net_device *netdev) .mode = cpu_to_le32(GS_CAN_MODE_START), }; struct gs_host_frame *hf; + struct urb *urb = NULL; u32 ctrlmode; u32 flags = 0; int rc, i; @@ -856,13 +857,14 @@ static int gs_can_open(struct net_device *netdev) if (!parent->active_channels) { for (i = 0; i < GS_MAX_RX_URBS; i++) { - struct urb *urb; u8 *buf; /* alloc rx urb */ urb = usb_alloc_urb(0, GFP_KERNEL); - if (!urb) - return -ENOMEM; + if (!urb) { + rc = -ENOMEM; + goto out_usb_kill_anchored_urbs; + } /* alloc rx buffer */ buf = kmalloc(dev->parent->hf_size_rx, @@ -870,8 +872,8 @@ static int gs_can_open(struct net_device *netdev) if (!buf) { netdev_err(netdev, "No memory left for USB buffer\n"); - usb_free_urb(urb); - return -ENOMEM; + rc = -ENOMEM; + goto out_usb_free_urb; } /* fill, anchor, and submit rx urb */ @@ -894,9 +896,7 @@ static int gs_can_open(struct net_device *netdev) netdev_err(netdev, "usb_submit failed (err=%d)\n", rc); - usb_unanchor_urb(urb); - usb_free_urb(urb); - break; + goto out_usb_unanchor_urb; } /* Drop reference, @@ -945,7 +945,8 @@ static int gs_can_open(struct net_device *netdev) if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) gs_usb_timestamp_stop(dev); dev->can.state = CAN_STATE_STOPPED; - return rc; + + goto out_usb_kill_anchored_urbs; } parent->active_channels++; @@ -953,6 +954,18 @@ static int gs_can_open(struct net_device *netdev) netif_start_queue(netdev); return 0; + +out_usb_unanchor_urb: + usb_unanchor_urb(urb); +out_usb_free_urb: + usb_free_urb(urb); +out_usb_kill_anchored_urbs: + if (!parent->active_channels) + usb_kill_anchored_urbs(&dev->tx_submitted); + + close_candev(netdev); + + return rc; } static int gs_usb_get_state(const struct net_device *netdev, -- cgit From 5886e4d5ecec3e22844efed90b2dd383ef804b3a Mon Sep 17 00:00:00 2001 From: Marc Kleine-Budde Date: Fri, 7 Jul 2023 18:44:23 +0200 Subject: can: gs_usb: fix time stamp counter initialization If the gs_usb device driver is unloaded (or unbound) before the interface is shut down, the USB stack first calls the struct usb_driver::disconnect and then the struct net_device_ops::ndo_stop callback. In gs_usb_disconnect() all pending bulk URBs are killed, i.e. no more RX'ed CAN frames are send from the USB device to the host. Later in gs_can_close() a reset control message is send to each CAN channel to remove the controller from the CAN bus. In this race window the USB device can still receive CAN frames from the bus and internally queue them to be send to the host. At least in the current version of the candlelight firmware, the queue of received CAN frames is not emptied during the reset command. After loading (or binding) the gs_usb driver, new URBs are submitted during the struct net_device_ops::ndo_open callback and the candlelight firmware starts sending its already queued CAN frames to the host. However, this scenario was not considered when implementing the hardware timestamp function. The cycle counter/time counter infrastructure is set up (gs_usb_timestamp_init()) after the USBs are submitted, resulting in a NULL pointer dereference if timecounter_cyc2time() (via the call chain: gs_usb_receive_bulk_callback() -> gs_usb_set_timestamp() -> gs_usb_skb_set_timestamp()) is called too early. Move the gs_usb_timestamp_init() function before the URBs are submitted to fix this problem. For a comprehensive solution, we need to consider gs_usb devices with more than 1 channel. The cycle counter/time counter infrastructure is setup per channel, but the RX URBs are per device. Once gs_can_open() of _a_ channel has been called, and URBs have been submitted, the gs_usb_receive_bulk_callback() can be called for _all_ available channels, even for channels that are not running, yet. As cycle counter/time counter has not set up, this will again lead to a NULL pointer dereference. Convert the cycle counter/time counter from a "per channel" to a "per device" functionality. Also set it up, before submitting any URBs to the device. Further in gs_usb_receive_bulk_callback(), don't process any URBs for not started CAN channels, only resubmit the URB. Fixes: 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") Closes: https://github.com/candle-usb/candleLight_fw/issues/137#issuecomment-1623532076 Cc: stable@vger.kernel.org Cc: John Whittington Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-2-9017cefcd9d5@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/usb/gs_usb.c | 101 +++++++++++++++++++++++-------------------- 1 file changed, 53 insertions(+), 48 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c index 85b7b59c8426..f418066569fc 100644 --- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -303,12 +303,6 @@ struct gs_can { struct can_bittiming_const bt_const, data_bt_const; unsigned int channel; /* channel number */ - /* time counter for hardware timestamps */ - struct cyclecounter cc; - struct timecounter tc; - spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */ - struct delayed_work timestamp; - u32 feature; unsigned int hf_size_tx; @@ -325,6 +319,13 @@ struct gs_usb { struct gs_can *canch[GS_MAX_INTF]; struct usb_anchor rx_submitted; struct usb_device *udev; + + /* time counter for hardware timestamps */ + struct cyclecounter cc; + struct timecounter tc; + spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */ + struct delayed_work timestamp; + unsigned int hf_size_rx; u8 active_channels; }; @@ -388,15 +389,15 @@ static int gs_cmd_reset(struct gs_can *dev) GFP_KERNEL); } -static inline int gs_usb_get_timestamp(const struct gs_can *dev, +static inline int gs_usb_get_timestamp(const struct gs_usb *parent, u32 *timestamp_p) { __le32 timestamp; int rc; - rc = usb_control_msg_recv(dev->udev, 0, GS_USB_BREQ_TIMESTAMP, + rc = usb_control_msg_recv(parent->udev, 0, GS_USB_BREQ_TIMESTAMP, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_INTERFACE, - dev->channel, 0, + 0, 0, ×tamp, sizeof(timestamp), USB_CTRL_GET_TIMEOUT, GFP_KERNEL); @@ -410,20 +411,20 @@ static inline int gs_usb_get_timestamp(const struct gs_can *dev, static u64 gs_usb_timestamp_read(const struct cyclecounter *cc) __must_hold(&dev->tc_lock) { - struct gs_can *dev = container_of(cc, struct gs_can, cc); + struct gs_usb *parent = container_of(cc, struct gs_usb, cc); u32 timestamp = 0; int err; - lockdep_assert_held(&dev->tc_lock); + lockdep_assert_held(&parent->tc_lock); /* drop lock for synchronous USB transfer */ - spin_unlock_bh(&dev->tc_lock); - err = gs_usb_get_timestamp(dev, ×tamp); - spin_lock_bh(&dev->tc_lock); + spin_unlock_bh(&parent->tc_lock); + err = gs_usb_get_timestamp(parent, ×tamp); + spin_lock_bh(&parent->tc_lock); if (err) - netdev_err(dev->netdev, - "Error %d while reading timestamp. HW timestamps may be inaccurate.", - err); + dev_err(&parent->udev->dev, + "Error %d while reading timestamp. HW timestamps may be inaccurate.", + err); return timestamp; } @@ -431,14 +432,14 @@ static u64 gs_usb_timestamp_read(const struct cyclecounter *cc) __must_hold(&dev static void gs_usb_timestamp_work(struct work_struct *work) { struct delayed_work *delayed_work = to_delayed_work(work); - struct gs_can *dev; + struct gs_usb *parent; - dev = container_of(delayed_work, struct gs_can, timestamp); - spin_lock_bh(&dev->tc_lock); - timecounter_read(&dev->tc); - spin_unlock_bh(&dev->tc_lock); + parent = container_of(delayed_work, struct gs_usb, timestamp); + spin_lock_bh(&parent->tc_lock); + timecounter_read(&parent->tc); + spin_unlock_bh(&parent->tc_lock); - schedule_delayed_work(&dev->timestamp, + schedule_delayed_work(&parent->timestamp, GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ); } @@ -446,37 +447,38 @@ static void gs_usb_skb_set_timestamp(struct gs_can *dev, struct sk_buff *skb, u32 timestamp) { struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb); + struct gs_usb *parent = dev->parent; u64 ns; - spin_lock_bh(&dev->tc_lock); - ns = timecounter_cyc2time(&dev->tc, timestamp); - spin_unlock_bh(&dev->tc_lock); + spin_lock_bh(&parent->tc_lock); + ns = timecounter_cyc2time(&parent->tc, timestamp); + spin_unlock_bh(&parent->tc_lock); hwtstamps->hwtstamp = ns_to_ktime(ns); } -static void gs_usb_timestamp_init(struct gs_can *dev) +static void gs_usb_timestamp_init(struct gs_usb *parent) { - struct cyclecounter *cc = &dev->cc; + struct cyclecounter *cc = &parent->cc; cc->read = gs_usb_timestamp_read; cc->mask = CYCLECOUNTER_MASK(32); cc->shift = 32 - bits_per(NSEC_PER_SEC / GS_USB_TIMESTAMP_TIMER_HZ); cc->mult = clocksource_hz2mult(GS_USB_TIMESTAMP_TIMER_HZ, cc->shift); - spin_lock_init(&dev->tc_lock); - spin_lock_bh(&dev->tc_lock); - timecounter_init(&dev->tc, &dev->cc, ktime_get_real_ns()); - spin_unlock_bh(&dev->tc_lock); + spin_lock_init(&parent->tc_lock); + spin_lock_bh(&parent->tc_lock); + timecounter_init(&parent->tc, &parent->cc, ktime_get_real_ns()); + spin_unlock_bh(&parent->tc_lock); - INIT_DELAYED_WORK(&dev->timestamp, gs_usb_timestamp_work); - schedule_delayed_work(&dev->timestamp, + INIT_DELAYED_WORK(&parent->timestamp, gs_usb_timestamp_work); + schedule_delayed_work(&parent->timestamp, GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ); } -static void gs_usb_timestamp_stop(struct gs_can *dev) +static void gs_usb_timestamp_stop(struct gs_usb *parent) { - cancel_delayed_work_sync(&dev->timestamp); + cancel_delayed_work_sync(&parent->timestamp); } static void gs_update_state(struct gs_can *dev, struct can_frame *cf) @@ -560,6 +562,9 @@ static void gs_usb_receive_bulk_callback(struct urb *urb) if (!netif_device_present(netdev)) return; + if (!netif_running(netdev)) + goto resubmit_urb; + if (hf->echo_id == -1) { /* normal rx */ if (hf->flags & GS_CAN_FLAG_FD) { skb = alloc_canfd_skb(dev->netdev, &cfd); @@ -856,6 +861,9 @@ static int gs_can_open(struct net_device *netdev) } if (!parent->active_channels) { + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_init(parent); + for (i = 0; i < GS_MAX_RX_URBS; i++) { u8 *buf; @@ -926,13 +934,9 @@ static int gs_can_open(struct net_device *netdev) flags |= GS_CAN_MODE_FD; /* if hardware supports timestamps, enable it */ - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) { + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) flags |= GS_CAN_MODE_HW_TIMESTAMP; - /* start polling timestamp */ - gs_usb_timestamp_init(dev); - } - /* finally start device */ dev->can.state = CAN_STATE_ERROR_ACTIVE; dm.flags = cpu_to_le32(flags); @@ -942,8 +946,6 @@ static int gs_can_open(struct net_device *netdev) GFP_KERNEL); if (rc) { netdev_err(netdev, "Couldn't start device (err=%d)\n", rc); - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) - gs_usb_timestamp_stop(dev); dev->can.state = CAN_STATE_STOPPED; goto out_usb_kill_anchored_urbs; @@ -960,9 +962,13 @@ out_usb_unanchor_urb: out_usb_free_urb: usb_free_urb(urb); out_usb_kill_anchored_urbs: - if (!parent->active_channels) + if (!parent->active_channels) { usb_kill_anchored_urbs(&dev->tx_submitted); + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_stop(parent); + } + close_candev(netdev); return rc; @@ -1011,14 +1017,13 @@ static int gs_can_close(struct net_device *netdev) netif_stop_queue(netdev); - /* stop polling timestamp */ - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) - gs_usb_timestamp_stop(dev); - /* Stop polling */ parent->active_channels--; if (!parent->active_channels) { usb_kill_anchored_urbs(&parent->rx_submitted); + + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_stop(parent); } /* Stop sending URBs */ -- cgit From 5f4fa1672d98fe99d2297b03add35346f1685d6b Mon Sep 17 00:00:00 2001 From: Ding Hui Date: Tue, 9 May 2023 19:11:47 +0800 Subject: iavf: Fix use-after-free in free_netdev We do netif_napi_add() for all allocated q_vectors[], but potentially do netif_napi_del() for part of them, then kfree q_vectors and leave invalid pointers at dev->napi_list. Reproducer: [root@host ~]# cat repro.sh #!/bin/bash pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=() function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) } function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) } function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() } trap "on_exit; exit" EXIT while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!) wait Result: [ 4093.900222] ================================================================== [ 4093.900230] BUG: KASAN: use-after-free in free_netdev+0x308/0x390 [ 4093.900232] Read of size 8 at addr ffff88b4dc145640 by task repro.sh/6699 [ 4093.900233] [ 4093.900236] CPU: 10 PID: 6699 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 4093.900238] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 4093.900239] Call Trace: [ 4093.900244] dump_stack+0x71/0xab [ 4093.900249] print_address_description+0x6b/0x290 [ 4093.900251] ? free_netdev+0x308/0x390 [ 4093.900252] kasan_report+0x14a/0x2b0 [ 4093.900254] free_netdev+0x308/0x390 [ 4093.900261] iavf_remove+0x825/0xd20 [iavf] [ 4093.900265] pci_device_remove+0xa8/0x1f0 [ 4093.900268] device_release_driver_internal+0x1c6/0x460 [ 4093.900271] pci_stop_bus_device+0x101/0x150 [ 4093.900273] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900275] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900277] ? pci_iov_add_virtfn+0xe10/0xe10 [ 4093.900278] ? pci_get_subsys+0x90/0x90 [ 4093.900280] sriov_disable+0xed/0x3e0 [ 4093.900282] ? bus_find_device+0x12d/0x1a0 [ 4093.900290] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900298] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 4093.900299] ? pci_get_device+0x7c/0x90 [ 4093.900300] ? pci_get_subsys+0x90/0x90 [ 4093.900306] ? pci_vfs_assigned.part.7+0x144/0x210 [ 4093.900309] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900315] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900318] sriov_numvfs_store+0x214/0x290 [ 4093.900320] ? sriov_totalvfs_show+0x30/0x30 [ 4093.900321] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900323] ? __check_object_size+0x15a/0x350 [ 4093.900326] kernfs_fop_write+0x280/0x3f0 [ 4093.900329] vfs_write+0x145/0x440 [ 4093.900330] ksys_write+0xab/0x160 [ 4093.900332] ? __ia32_sys_read+0xb0/0xb0 [ 4093.900334] ? fput_many+0x1a/0x120 [ 4093.900335] ? filp_close+0xf0/0x130 [ 4093.900338] do_syscall_64+0xa0/0x370 [ 4093.900339] ? page_fault+0x8/0x30 [ 4093.900341] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900357] RIP: 0033:0x7f16ad4d22c0 [ 4093.900359] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 4093.900360] RSP: 002b:00007ffd6491b7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 4093.900362] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f16ad4d22c0 [ 4093.900363] RDX: 0000000000000002 RSI: 0000000001a41408 RDI: 0000000000000001 [ 4093.900364] RBP: 0000000001a41408 R08: 00007f16ad7a1780 R09: 00007f16ae1f2700 [ 4093.900364] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 4093.900365] R13: 0000000000000001 R14: 00007f16ad7a0620 R15: 0000000000000001 [ 4093.900367] [ 4093.900368] Allocated by task 820: [ 4093.900371] kasan_kmalloc+0xa6/0xd0 [ 4093.900373] __kmalloc+0xfb/0x200 [ 4093.900376] iavf_init_interrupt_scheme+0x63b/0x1320 [iavf] [ 4093.900380] iavf_watchdog_task+0x3d51/0x52c0 [iavf] [ 4093.900382] process_one_work+0x56a/0x11f0 [ 4093.900383] worker_thread+0x8f/0xf40 [ 4093.900384] kthread+0x2a0/0x390 [ 4093.900385] ret_from_fork+0x1f/0x40 [ 4093.900387] 0xffffffffffffffff [ 4093.900387] [ 4093.900388] Freed by task 6699: [ 4093.900390] __kasan_slab_free+0x137/0x190 [ 4093.900391] kfree+0x8b/0x1b0 [ 4093.900394] iavf_free_q_vectors+0x11d/0x1a0 [iavf] [ 4093.900397] iavf_remove+0x35a/0xd20 [iavf] [ 4093.900399] pci_device_remove+0xa8/0x1f0 [ 4093.900400] device_release_driver_internal+0x1c6/0x460 [ 4093.900401] pci_stop_bus_device+0x101/0x150 [ 4093.900402] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900403] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900404] sriov_disable+0xed/0x3e0 [ 4093.900409] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900415] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900416] sriov_numvfs_store+0x214/0x290 [ 4093.900417] kernfs_fop_write+0x280/0x3f0 [ 4093.900418] vfs_write+0x145/0x440 [ 4093.900419] ksys_write+0xab/0x160 [ 4093.900420] do_syscall_64+0xa0/0x370 [ 4093.900421] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900422] 0xffffffffffffffff [ 4093.900422] [ 4093.900424] The buggy address belongs to the object at ffff88b4dc144200 which belongs to the cache kmalloc-8k of size 8192 [ 4093.900425] The buggy address is located 5184 bytes inside of 8192-byte region [ffff88b4dc144200, ffff88b4dc146200) [ 4093.900425] The buggy address belongs to the page: [ 4093.900427] page:ffffea00d3705000 refcount:1 mapcount:0 mapping:ffff88bf04415c80 index:0x0 compound_mapcount: 0 [ 4093.900430] flags: 0x10000000008100(slab|head) [ 4093.900433] raw: 0010000000008100 dead000000000100 dead000000000200 ffff88bf04415c80 [ 4093.900434] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000 [ 4093.900434] page dumped because: kasan: bad access detected [ 4093.900435] [ 4093.900435] Memory state around the buggy address: [ 4093.900436] ffff88b4dc145500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900437] ffff88b4dc145580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] >ffff88b4dc145600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] ^ [ 4093.900439] ffff88b4dc145680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ffff88b4dc145700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ================================================================== Although the patch #2 (of 2) can avoid the issue triggered by this repro.sh, there still are other potential risks that if num_active_queues is changed to less than allocated q_vectors[] by unexpected, the mismatched netif_napi_add/del() can also cause UAF. Since we actually call netif_napi_add() for all allocated q_vectors unconditionally in iavf_alloc_q_vectors(), so we should fix it by letting netif_napi_del() match to netif_napi_add(). Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ding Hui Cc: Donglin Peng Cc: Huang Cun Reviewed-by: Simon Horman Reviewed-by: Madhu Chittim Reviewed-by: Leon Romanovsky Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index a483eb185c99..89d88c3a7add 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1828,19 +1828,16 @@ static int iavf_alloc_q_vectors(struct iavf_adapter *adapter) static void iavf_free_q_vectors(struct iavf_adapter *adapter) { int q_idx, num_q_vectors; - int napi_vectors; if (!adapter->q_vectors) return; num_q_vectors = adapter->num_msix_vectors - NONQ_VECS; - napi_vectors = adapter->num_active_queues; for (q_idx = 0; q_idx < num_q_vectors; q_idx++) { struct iavf_q_vector *q_vector = &adapter->q_vectors[q_idx]; - if (q_idx < napi_vectors) - netif_napi_del(&q_vector->napi); + netif_napi_del(&q_vector->napi); } kfree(adapter->q_vectors); adapter->q_vectors = NULL; -- cgit From 7c4bced3caa749ce468b0c5de711c98476b23a52 Mon Sep 17 00:00:00 2001 From: Ding Hui Date: Tue, 9 May 2023 19:11:48 +0800 Subject: iavf: Fix out-of-bounds when setting channels on remove If we set channels greater during iavf_remove(), and waiting reset done would be timeout, then returned with error but changed num_active_queues directly, that will lead to OOB like the following logs. Because the num_active_queues is greater than tx/rx_rings[] allocated actually. Reproducer: [root@host ~]# cat repro.sh #!/bin/bash pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=() function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) } function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) } function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() } trap "on_exit; exit" EXIT while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!) wait Result: [ 3506.152887] iavf 0000:41:02.0: Removing device [ 3510.400799] ================================================================== [ 3510.400820] BUG: KASAN: slab-out-of-bounds in iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400823] Read of size 8 at addr ffff88b6f9311008 by task repro.sh/55536 [ 3510.400823] [ 3510.400830] CPU: 101 PID: 55536 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 3510.400832] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 3510.400835] Call Trace: [ 3510.400851] dump_stack+0x71/0xab [ 3510.400860] print_address_description+0x6b/0x290 [ 3510.400865] ? iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400868] kasan_report+0x14a/0x2b0 [ 3510.400873] iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400880] iavf_remove+0x2b6/0xc70 [iavf] [ 3510.400884] ? iavf_free_all_rx_resources+0x160/0x160 [iavf] [ 3510.400891] ? wait_woken+0x1d0/0x1d0 [ 3510.400895] ? notifier_call_chain+0xc1/0x130 [ 3510.400903] pci_device_remove+0xa8/0x1f0 [ 3510.400910] device_release_driver_internal+0x1c6/0x460 [ 3510.400916] pci_stop_bus_device+0x101/0x150 [ 3510.400919] pci_stop_and_remove_bus_device+0xe/0x20 [ 3510.400924] pci_iov_remove_virtfn+0x187/0x420 [ 3510.400927] ? pci_iov_add_virtfn+0xe10/0xe10 [ 3510.400929] ? pci_get_subsys+0x90/0x90 [ 3510.400932] sriov_disable+0xed/0x3e0 [ 3510.400936] ? bus_find_device+0x12d/0x1a0 [ 3510.400953] i40e_free_vfs+0x754/0x1210 [i40e] [ 3510.400966] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 3510.400968] ? pci_get_device+0x7c/0x90 [ 3510.400970] ? pci_get_subsys+0x90/0x90 [ 3510.400982] ? pci_vfs_assigned.part.7+0x144/0x210 [ 3510.400987] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.400996] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 3510.401001] sriov_numvfs_store+0x214/0x290 [ 3510.401005] ? sriov_totalvfs_show+0x30/0x30 [ 3510.401007] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.401011] ? __check_object_size+0x15a/0x350 [ 3510.401018] kernfs_fop_write+0x280/0x3f0 [ 3510.401022] vfs_write+0x145/0x440 [ 3510.401025] ksys_write+0xab/0x160 [ 3510.401028] ? __ia32_sys_read+0xb0/0xb0 [ 3510.401031] ? fput_many+0x1a/0x120 [ 3510.401032] ? filp_close+0xf0/0x130 [ 3510.401038] do_syscall_64+0xa0/0x370 [ 3510.401041] ? page_fault+0x8/0x30 [ 3510.401043] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 3510.401073] RIP: 0033:0x7f3a9bb842c0 [ 3510.401079] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 3510.401080] RSP: 002b:00007ffc05f1fe18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 3510.401083] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3a9bb842c0 [ 3510.401085] RDX: 0000000000000002 RSI: 0000000002327408 RDI: 0000000000000001 [ 3510.401086] RBP: 0000000002327408 R08: 00007f3a9be53780 R09: 00007f3a9c8a4700 [ 3510.401086] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 3510.401087] R13: 0000000000000001 R14: 00007f3a9be52620 R15: 0000000000000001 [ 3510.401090] [ 3510.401093] Allocated by task 76795: [ 3510.401098] kasan_kmalloc+0xa6/0xd0 [ 3510.401099] __kmalloc+0xfb/0x200 [ 3510.401104] iavf_init_interrupt_scheme+0x26f/0x1310 [iavf] [ 3510.401108] iavf_watchdog_task+0x1d58/0x4050 [iavf] [ 3510.401114] process_one_work+0x56a/0x11f0 [ 3510.401115] worker_thread+0x8f/0xf40 [ 3510.401117] kthread+0x2a0/0x390 [ 3510.401119] ret_from_fork+0x1f/0x40 [ 3510.401122] 0xffffffffffffffff [ 3510.401123] In timeout handling, we should keep the original num_active_queues and reset num_req_queues to 0. Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Ding Hui Cc: Donglin Peng Cc: Huang Cun Reviewed-by: Leon Romanovsky Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index 6f171d1d85b7..92443f8e9fbd 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -1863,7 +1863,7 @@ static int iavf_set_channels(struct net_device *netdev, } if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) { adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED; - adapter->num_active_queues = num_req; + adapter->num_req_queues = 0; return -EOPNOTSUPP; } -- cgit From a77ed5c5b768e9649be240a2d864e5cd9c6a2015 Mon Sep 17 00:00:00 2001 From: Ahmed Zaki Date: Fri, 19 May 2023 15:46:02 -0600 Subject: iavf: use internal state to free traffic IRQs If the system tries to close the netdev while iavf_reset_task() is running, __LINK_STATE_START will be cleared and netif_running() will return false in iavf_reinit_interrupt_scheme(). This will result in iavf_free_traffic_irqs() not being called and a leak as follows: [7632.489326] remove_proc_entry: removing non-empty directory 'irq/999', leaking at least 'iavf-enp24s0f0v0-TxRx-0' [7632.490214] WARNING: CPU: 0 PID: 10 at fs/proc/generic.c:718 remove_proc_entry+0x19b/0x1b0 is shown when pci_disable_msix() is later called. Fix by using the internal adapter state. The traffic IRQs will always exist if state == __IAVF_RUNNING. Fixes: 5b36e8d04b44 ("i40evf: Enable VF to request an alternate queue allocation") Signed-off-by: Ahmed Zaki Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 89d88c3a7add..058e19c6f94a 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1929,15 +1929,16 @@ static void iavf_free_rss(struct iavf_adapter *adapter) /** * iavf_reinit_interrupt_scheme - Reallocate queues and vectors * @adapter: board private structure + * @running: true if adapter->state == __IAVF_RUNNING * * Returns 0 on success, negative on failure **/ -static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter) +static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter, bool running) { struct net_device *netdev = adapter->netdev; int err; - if (netif_running(netdev)) + if (running) iavf_free_traffic_irqs(adapter); iavf_free_misc_irq(adapter); iavf_reset_interrupt_capability(adapter); @@ -3053,7 +3054,7 @@ continue_reset: if ((adapter->flags & IAVF_FLAG_REINIT_MSIX_NEEDED) || (adapter->flags & IAVF_FLAG_REINIT_ITR_NEEDED)) { - err = iavf_reinit_interrupt_scheme(adapter); + err = iavf_reinit_interrupt_scheme(adapter, running); if (err) goto reset_err; } -- cgit From c2ed2403f12c74a74a0091ed5d830e72c58406e8 Mon Sep 17 00:00:00 2001 From: Marcin Szycik Date: Mon, 5 Jun 2023 10:52:22 -0400 Subject: iavf: Wait for reset in callbacks which trigger it There was a fail when trying to add the interface to bonding right after changing the MTU on the interface. It was caused by bonding interface unable to open the interface due to interface being in __RESETTING state because of MTU change. Add new reset_waitqueue to indicate that reset has finished. Add waiting for reset to finish in callbacks which trigger hw reset: iavf_set_priv_flags(), iavf_change_mtu() and iavf_set_ringparam(). We use a 5000ms timeout period because on Hyper-V based systems, this operation takes around 3000-4000ms. In normal circumstances, it doesn't take more than 500ms to complete. Add a function iavf_wait_for_reset() to reuse waiting for reset code and use it also in iavf_set_channels(), which already waits for reset. We don't use error handling in iavf_set_channels() as this could cause the device to be in incorrect state if the reset was scheduled but hit timeout or the waitng function was interrupted by a signal. Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Marcin Szycik Co-developed-by: Dawid Wesierski Signed-off-by: Dawid Wesierski Signed-off-by: Sylwester Dziedziuch Signed-off-by: Kamil Maziarz Signed-off-by: Mateusz Palczewski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf.h | 2 + drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 31 ++++++++------- drivers/net/ethernet/intel/iavf/iavf_main.c | 51 ++++++++++++++++++++++++- drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 1 + 4 files changed, 68 insertions(+), 17 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index f80f2735e688..a5cab19eb6a8 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -257,6 +257,7 @@ struct iavf_adapter { struct work_struct adminq_task; struct delayed_work client_task; wait_queue_head_t down_waitqueue; + wait_queue_head_t reset_waitqueue; wait_queue_head_t vc_waitqueue; struct iavf_q_vector *q_vectors; struct list_head vlan_filter_list; @@ -582,4 +583,5 @@ void iavf_add_adv_rss_cfg(struct iavf_adapter *adapter); void iavf_del_adv_rss_cfg(struct iavf_adapter *adapter); struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter, const u8 *macaddr); +int iavf_wait_for_reset(struct iavf_adapter *adapter); #endif /* _IAVF_H_ */ diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index 92443f8e9fbd..b7141c2a941d 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -484,6 +484,7 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) { struct iavf_adapter *adapter = netdev_priv(netdev); u32 orig_flags, new_flags, changed_flags; + int ret = 0; u32 i; orig_flags = READ_ONCE(adapter->flags); @@ -533,10 +534,13 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing private flags timeout or interrupted waiting for reset"); } } - return 0; + return ret; } /** @@ -627,6 +631,7 @@ static int iavf_set_ringparam(struct net_device *netdev, { struct iavf_adapter *adapter = netdev_priv(netdev); u32 new_rx_count, new_tx_count; + int ret = 0; if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending)) return -EINVAL; @@ -673,9 +678,12 @@ static int iavf_set_ringparam(struct net_device *netdev, if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing ring parameters timeout or interrupted waiting for reset"); } - return 0; + return ret; } /** @@ -1830,7 +1838,7 @@ static int iavf_set_channels(struct net_device *netdev, { struct iavf_adapter *adapter = netdev_priv(netdev); u32 num_req = ch->combined_count; - int i; + int ret = 0; if ((adapter->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_ADQ) && adapter->num_tc) { @@ -1854,20 +1862,11 @@ static int iavf_set_channels(struct net_device *netdev, adapter->flags |= IAVF_FLAG_REINIT_ITR_NEEDED; iavf_schedule_reset(adapter); - /* wait for the reset is done */ - for (i = 0; i < IAVF_RESET_WAIT_COMPLETE_COUNT; i++) { - msleep(IAVF_RESET_WAIT_MS); - if (adapter->flags & IAVF_FLAG_RESET_PENDING) - continue; - break; - } - if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) { - adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED; - adapter->num_req_queues = 0; - return -EOPNOTSUPP; - } + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing channel count timeout or interrupted waiting for reset"); - return 0; + return ret; } /** diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 058e19c6f94a..b89933aa5bfe 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -166,6 +166,45 @@ static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev) return netdev_priv(pci_get_drvdata(pdev)); } +/** + * iavf_is_reset_in_progress - Check if a reset is in progress + * @adapter: board private structure + */ +static bool iavf_is_reset_in_progress(struct iavf_adapter *adapter) +{ + if (adapter->state == __IAVF_RESETTING || + adapter->flags & (IAVF_FLAG_RESET_PENDING | + IAVF_FLAG_RESET_NEEDED)) + return true; + + return false; +} + +/** + * iavf_wait_for_reset - Wait for reset to finish. + * @adapter: board private structure + * + * Returns 0 if reset finished successfully, negative on timeout or interrupt. + */ +int iavf_wait_for_reset(struct iavf_adapter *adapter) +{ + int ret = wait_event_interruptible_timeout(adapter->reset_waitqueue, + !iavf_is_reset_in_progress(adapter), + msecs_to_jiffies(5000)); + + /* If ret < 0 then it means wait was interrupted. + * If ret == 0 then it means we got a timeout while waiting + * for reset to finish. + * If ret > 0 it means reset has finished. + */ + if (ret > 0) + return 0; + else if (ret < 0) + return -EINTR; + else + return -EBUSY; +} + /** * iavf_allocate_dma_mem_d - OS specific memory alloc for shared code * @hw: pointer to the HW structure @@ -3149,6 +3188,7 @@ continue_reset: adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED; + wake_up(&adapter->reset_waitqueue); mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); @@ -4313,6 +4353,7 @@ static int iavf_close(struct net_device *netdev) static int iavf_change_mtu(struct net_device *netdev, int new_mtu) { struct iavf_adapter *adapter = netdev_priv(netdev); + int ret = 0; netdev_dbg(netdev, "changing MTU from %d to %d\n", netdev->mtu, new_mtu); @@ -4325,9 +4366,14 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu) if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret < 0) + netdev_warn(netdev, "MTU change interrupted waiting for reset"); + else if (ret) + netdev_warn(netdev, "MTU change timed out waiting for reset"); } - return 0; + return ret; } #define NETIF_VLAN_OFFLOAD_FEATURES (NETIF_F_HW_VLAN_CTAG_RX | \ @@ -4928,6 +4974,9 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) /* Setup the wait queue for indicating transition to down status */ init_waitqueue_head(&adapter->down_waitqueue); + /* Setup the wait queue for indicating transition to running state */ + init_waitqueue_head(&adapter->reset_waitqueue); + /* Setup the wait queue for indicating virtchannel events */ init_waitqueue_head(&adapter->vc_waitqueue); diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 7c0578b5457b..1bab896aaf40 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -2285,6 +2285,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter, case VIRTCHNL_OP_ENABLE_QUEUES: /* enable transmits */ iavf_irq_enable(adapter, true); + wake_up(&adapter->reset_waitqueue); adapter->flags &= ~IAVF_FLAG_QUEUES_DISABLED; break; case VIRTCHNL_OP_DISABLE_QUEUES: -- cgit From d2806d960e8387945b53edf7ed9d71ab3ab5f073 Mon Sep 17 00:00:00 2001 From: Marcin Szycik Date: Mon, 5 Jun 2023 10:52:23 -0400 Subject: Revert "iavf: Detach device during reset task" This reverts commit aa626da947e9cd30c4cf727493903e1adbb2c0a0. Detaching device during reset was not fully fixing the rtnl locking issue, as there could be a situation where callback was already in progress before detaching netdev. Furthermore, detaching netdevice causes TX timeouts if traffic is running. To reproduce: ip netns exec ns1 iperf3 -c $PEER_IP -t 600 --logfile /dev/null & while :; do for i in 200 7000 400 5000 300 3000 ; do ip netns exec ns1 ip link set $VF1 mtu $i sleep 2 done sleep 10 done Currently, callbacks such as iavf_change_mtu() wait for the reset. If the reset fails to acquire the rtnl_lock, they schedule the netdev update for later while continuing the reset flow. Operations like MTU changes are performed under the rtnl_lock. Therefore, when the operation finishes, another callback that uses rtnl_lock can start. Signed-off-by: Dawid Wesierski Signed-off-by: Marcin Szycik Signed-off-by: Mateusz Palczewski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index b89933aa5bfe..957ad6e63f6f 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -2977,11 +2977,6 @@ static void iavf_reset_task(struct work_struct *work) int i = 0, err; bool running; - /* Detach interface to avoid subsequent NDO callbacks */ - rtnl_lock(); - netif_device_detach(netdev); - rtnl_unlock(); - /* When device is being removed it doesn't make sense to run the reset * task, just return in such a case. */ @@ -2989,7 +2984,7 @@ static void iavf_reset_task(struct work_struct *work) if (adapter->state != __IAVF_REMOVE) queue_work(adapter->wq, &adapter->reset_task); - goto reset_finish; + return; } while (!mutex_trylock(&adapter->client_lock)) @@ -3192,7 +3187,7 @@ continue_reset: mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); - goto reset_finish; + return; reset_err: if (running) { set_bit(__IAVF_VSI_DOWN, adapter->vsi.state); @@ -3213,10 +3208,6 @@ reset_err: } dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n"); -reset_finish: - rtnl_lock(); - netif_device_attach(netdev); - rtnl_unlock(); } /** -- cgit From d916d273041b0b5652434ff27aec716ff90992ac Mon Sep 17 00:00:00 2001 From: Marcin Szycik Date: Mon, 5 Jun 2023 10:52:24 -0400 Subject: Revert "iavf: Do not restart Tx queues after reset task failure" This reverts commit 08f1c147b7265245d67321585c68a27e990e0c4b. Netdev is no longer being detached during reset, so this fix can be reverted. We leave the removal of "hacky" IFF_UP flag update. Signed-off-by: Marcin Szycik Signed-off-by: Mateusz Palczewski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf_main.c | 15 --------------- 1 file changed, 15 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 957ad6e63f6f..0c12a752429c 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -3042,11 +3042,6 @@ static void iavf_reset_task(struct work_struct *work) iavf_disable_vf(adapter); mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); - if (netif_running(netdev)) { - rtnl_lock(); - dev_close(netdev); - rtnl_unlock(); - } return; /* Do not attempt to reinit. It's dead, Jim. */ } @@ -3197,16 +3192,6 @@ reset_err: mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); - - if (netif_running(netdev)) { - /* Close device to ensure that Tx queues will not be started - * during netif_device_attach() at the end of the reset task. - */ - rtnl_lock(); - dev_close(netdev); - rtnl_unlock(); - } - dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n"); } -- cgit From d1639a17319ba78a018280cd2df6577a7e5d9fab Mon Sep 17 00:00:00 2001 From: Ahmed Zaki Date: Mon, 5 Jun 2023 10:52:25 -0400 Subject: iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies A driver's lock (crit_lock) is used to serialize all the driver's tasks. Lockdep, however, shows a circular dependency between rtnl and crit_lock. This happens when an ndo that already holds the rtnl requests the driver to reset, since the reset task (in some paths) tries to grab rtnl to either change real number of queues of update netdev features. [566.241851] ====================================================== [566.241893] WARNING: possible circular locking dependency detected [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G OE [566.241984] ------------------------------------------------------ [566.242025] repro.sh/2604 is trying to acquire lock: [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf] [566.242167] but task is already holding lock: [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf] [566.242300] which lock already depends on the new lock. [566.242353] the existing dependency chain (in reverse order) is: [566.242401] -> #1 (rtnl_mutex){+.+.}-{3:3}: [566.242451] __mutex_lock+0xc1/0xbb0 [566.242489] iavf_init_interrupt_scheme+0x179/0x440 [iavf] [566.242560] iavf_watchdog_task+0x80b/0x1400 [iavf] [566.242627] process_one_work+0x2b3/0x560 [566.242663] worker_thread+0x4f/0x3a0 [566.242696] kthread+0xf2/0x120 [566.242730] ret_from_fork+0x29/0x50 [566.242763] -> #0 (&adapter->crit_lock){+.+.}-{3:3}: [566.242815] __lock_acquire+0x15ff/0x22b0 [566.242869] lock_acquire+0xd2/0x2c0 [566.242901] __mutex_lock+0xc1/0xbb0 [566.242934] iavf_close+0x3c/0x240 [iavf] [566.242997] __dev_close_many+0xac/0x120 [566.243036] dev_close_many+0x8b/0x140 [566.243071] unregister_netdevice_many_notify+0x165/0x7c0 [566.243116] unregister_netdevice_queue+0xd3/0x110 [566.243157] iavf_remove+0x6c1/0x730 [iavf] [566.243217] pci_device_remove+0x33/0xa0 [566.243257] device_release_driver_internal+0x1bc/0x240 [566.243299] pci_stop_bus_device+0x6c/0x90 [566.243338] pci_stop_and_remove_bus_device+0xe/0x20 [566.243380] pci_iov_remove_virtfn+0xd1/0x130 [566.243417] sriov_disable+0x34/0xe0 [566.243448] ice_free_vfs+0x2da/0x330 [ice] [566.244383] ice_sriov_configure+0x88/0xad0 [ice] [566.245353] sriov_numvfs_store+0xde/0x1d0 [566.246156] kernfs_fop_write_iter+0x15e/0x210 [566.246921] vfs_write+0x288/0x530 [566.247671] ksys_write+0x74/0xf0 [566.248408] do_syscall_64+0x58/0x80 [566.249145] entry_SYSCALL_64_after_hwframe+0x72/0xdc [566.249886] other info that might help us debug this: [566.252014] Possible unsafe locking scenario: [566.253432] CPU0 CPU1 [566.254118] ---- ---- [566.254800] lock(rtnl_mutex); [566.255514] lock(&adapter->crit_lock); [566.256233] lock(rtnl_mutex); [566.256897] lock(&adapter->crit_lock); [566.257388] *** DEADLOCK *** The deadlock can be triggered by a script that is continuously resetting the VF adapter while doing other operations requiring RTNL, e.g: while :; do ip link set $VF up ethtool --set-channels $VF combined 2 ip link set $VF down ip link set $VF up ethtool --set-channels $VF combined 4 ip link set $VF down done Any operation that triggers a reset can substitute "ethtool --set-channles" As a fix, add a new task "finish_config" that do all the work which needs rtnl lock. With the exception of iavf_remove(), all work that require rtnl should be called from this task. As for iavf_remove(), at the point where we need to call unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent finish_config work cannot restart after that since the task is guarded by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config(). Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Ahmed Zaki Signed-off-by: Mateusz Palczewski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf.h | 2 + drivers/net/ethernet/intel/iavf/iavf_main.c | 114 +++++++++++++++++------- drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 1 + 3 files changed, 85 insertions(+), 32 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index a5cab19eb6a8..bf5e3c8e97e0 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -255,6 +255,7 @@ struct iavf_adapter { struct workqueue_struct *wq; struct work_struct reset_task; struct work_struct adminq_task; + struct work_struct finish_config; struct delayed_work client_task; wait_queue_head_t down_waitqueue; wait_queue_head_t reset_waitqueue; @@ -521,6 +522,7 @@ int iavf_process_config(struct iavf_adapter *adapter); int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter); void iavf_schedule_reset(struct iavf_adapter *adapter); void iavf_schedule_request_stats(struct iavf_adapter *adapter); +void iavf_schedule_finish_config(struct iavf_adapter *adapter); void iavf_reset(struct iavf_adapter *adapter); void iavf_set_ethtool_ops(struct net_device *netdev); void iavf_update_stats(struct iavf_adapter *adapter); diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 0c12a752429c..2a2ae3c4337b 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1690,10 +1690,10 @@ static int iavf_set_interrupt_capability(struct iavf_adapter *adapter) adapter->msix_entries[vector].entry = vector; err = iavf_acquire_msix_vectors(adapter, v_budget); + if (!err) + iavf_schedule_finish_config(adapter); out: - netif_set_real_num_rx_queues(adapter->netdev, pairs); - netif_set_real_num_tx_queues(adapter->netdev, pairs); return err; } @@ -1913,9 +1913,7 @@ static int iavf_init_interrupt_scheme(struct iavf_adapter *adapter) goto err_alloc_queues; } - rtnl_lock(); err = iavf_set_interrupt_capability(adapter); - rtnl_unlock(); if (err) { dev_err(&adapter->pdev->dev, "Unable to setup interrupt capabilities\n"); @@ -2001,6 +1999,78 @@ err: return err; } +/** + * iavf_finish_config - do all netdev work that needs RTNL + * @work: our work_struct + * + * Do work that needs both RTNL and crit_lock. + **/ +static void iavf_finish_config(struct work_struct *work) +{ + struct iavf_adapter *adapter; + int pairs, err; + + adapter = container_of(work, struct iavf_adapter, finish_config); + + /* Always take RTNL first to prevent circular lock dependency */ + rtnl_lock(); + mutex_lock(&adapter->crit_lock); + + if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) && + adapter->netdev_registered && + !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) { + netdev_update_features(adapter->netdev); + adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES; + } + + switch (adapter->state) { + case __IAVF_DOWN: + if (!adapter->netdev_registered) { + err = register_netdevice(adapter->netdev); + if (err) { + dev_err(&adapter->pdev->dev, "Unable to register netdev (%d)\n", + err); + + /* go back and try again.*/ + iavf_free_rss(adapter); + iavf_free_misc_irq(adapter); + iavf_reset_interrupt_capability(adapter); + iavf_change_state(adapter, + __IAVF_INIT_CONFIG_ADAPTER); + goto out; + } + adapter->netdev_registered = true; + } + + /* Set the real number of queues when reset occurs while + * state == __IAVF_DOWN + */ + fallthrough; + case __IAVF_RUNNING: + pairs = adapter->num_active_queues; + netif_set_real_num_rx_queues(adapter->netdev, pairs); + netif_set_real_num_tx_queues(adapter->netdev, pairs); + break; + + default: + break; + } + +out: + mutex_unlock(&adapter->crit_lock); + rtnl_unlock(); +} + +/** + * iavf_schedule_finish_config - Set the flags and schedule a reset event + * @adapter: board private structure + **/ +void iavf_schedule_finish_config(struct iavf_adapter *adapter) +{ + if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) + queue_work(adapter->wq, &adapter->finish_config); +} + /** * iavf_process_aq_command - process aq_required flags * and sends aq command @@ -2638,22 +2708,8 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter) netif_carrier_off(netdev); adapter->link_up = false; - - /* set the semaphore to prevent any callbacks after device registration - * up to time when state of driver will be set to __IAVF_DOWN - */ - rtnl_lock(); - if (!adapter->netdev_registered) { - err = register_netdevice(netdev); - if (err) { - rtnl_unlock(); - goto err_register; - } - } - - adapter->netdev_registered = true; - netif_tx_stop_all_queues(netdev); + if (CLIENT_ALLOWED(adapter)) { err = iavf_lan_add_device(adapter); if (err) @@ -2666,7 +2722,6 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter) iavf_change_state(adapter, __IAVF_DOWN); set_bit(__IAVF_VSI_DOWN, adapter->vsi.state); - rtnl_unlock(); iavf_misc_irq_enable(adapter); wake_up(&adapter->down_waitqueue); @@ -2686,10 +2741,11 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter) /* request initial VLAN offload settings */ iavf_set_vlan_offload_features(adapter, 0, netdev->features); + iavf_schedule_finish_config(adapter); return; + err_mem: iavf_free_rss(adapter); -err_register: iavf_free_misc_irq(adapter); err_sw_init: iavf_reset_interrupt_capability(adapter); @@ -2716,15 +2772,6 @@ static void iavf_watchdog_task(struct work_struct *work) goto restart_watchdog; } - if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) && - adapter->netdev_registered && - !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) && - rtnl_trylock()) { - netdev_update_features(adapter->netdev); - rtnl_unlock(); - adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES; - } - if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) iavf_change_state(adapter, __IAVF_COMM_FAILED); @@ -4942,6 +4989,7 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) INIT_WORK(&adapter->reset_task, iavf_reset_task); INIT_WORK(&adapter->adminq_task, iavf_adminq_task); + INIT_WORK(&adapter->finish_config, iavf_finish_config); INIT_DELAYED_WORK(&adapter->watchdog_task, iavf_watchdog_task); INIT_DELAYED_WORK(&adapter->client_task, iavf_client_task); queue_delayed_work(adapter->wq, &adapter->watchdog_task, @@ -5084,13 +5132,15 @@ static void iavf_remove(struct pci_dev *pdev) usleep_range(500, 1000); } cancel_delayed_work_sync(&adapter->watchdog_task); + cancel_work_sync(&adapter->finish_config); + rtnl_lock(); if (adapter->netdev_registered) { - rtnl_lock(); unregister_netdevice(netdev); adapter->netdev_registered = false; - rtnl_unlock(); } + rtnl_unlock(); + if (CLIENT_ALLOWED(adapter)) { err = iavf_lan_del_device(adapter); if (err) diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 1bab896aaf40..073ac29ed84c 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -2237,6 +2237,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter, iavf_process_config(adapter); adapter->flags |= IAVF_FLAG_SETUP_NETDEV_FEATURES; + iavf_schedule_finish_config(adapter); iavf_set_queue_vlan_tag_loc(adapter); -- cgit From c34743daca0eb1dc855831a5210f0800a850088e Mon Sep 17 00:00:00 2001 From: Ahmed Zaki Date: Mon, 5 Jun 2023 10:52:26 -0400 Subject: iavf: fix reset task race with iavf_remove() The reset task is currently scheduled from the watchdog or adminq tasks. First, all direct calls to schedule the reset task are replaced with the iavf_schedule_reset(), which is modified to accept the flag showing the type of reset. To prevent the reset task from starting once iavf_remove() starts, we need to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now easily added to iavf_schedule_reset(). Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task. It is redundant since all callers who set the flag immediately schedules the reset task. Fixes: 3ccd54ef44eb ("iavf: Fix init state closure on remove") Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage") Signed-off-by: Ahmed Zaki Signed-off-by: Mateusz Palczewski Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/iavf/iavf.h | 2 +- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 8 +++---- drivers/net/ethernet/intel/iavf/iavf_main.c | 32 +++++++++---------------- drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 3 +-- 4 files changed, 16 insertions(+), 29 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index bf5e3c8e97e0..8cbdebc5b698 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -520,7 +520,7 @@ int iavf_up(struct iavf_adapter *adapter); void iavf_down(struct iavf_adapter *adapter); int iavf_process_config(struct iavf_adapter *adapter); int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter); -void iavf_schedule_reset(struct iavf_adapter *adapter); +void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags); void iavf_schedule_request_stats(struct iavf_adapter *adapter); void iavf_schedule_finish_config(struct iavf_adapter *adapter); void iavf_reset(struct iavf_adapter *adapter); diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index b7141c2a941d..2f47cfa7f06e 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -532,8 +532,7 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) /* issue a reset to force legacy-rx change to take effect */ if (changed_flags & IAVF_FLAG_LEGACY_RX) { if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret) netdev_warn(netdev, "Changing private flags timeout or interrupted waiting for reset"); @@ -676,8 +675,7 @@ static int iavf_set_ringparam(struct net_device *netdev, } if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret) netdev_warn(netdev, "Changing ring parameters timeout or interrupted waiting for reset"); @@ -1860,7 +1858,7 @@ static int iavf_set_channels(struct net_device *netdev, adapter->num_req_queues = num_req; adapter->flags |= IAVF_FLAG_REINIT_ITR_NEEDED; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 2a2ae3c4337b..3a88d413ddee 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -301,12 +301,14 @@ static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs) /** * iavf_schedule_reset - Set the flags and schedule a reset event * @adapter: board private structure + * @flags: IAVF_FLAG_RESET_PENDING or IAVF_FLAG_RESET_NEEDED **/ -void iavf_schedule_reset(struct iavf_adapter *adapter) +void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags) { - if (!(adapter->flags & - (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; + if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) && + !(adapter->flags & + (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) { + adapter->flags |= flags; queue_work(adapter->wq, &adapter->reset_task); } } @@ -334,7 +336,7 @@ static void iavf_tx_timeout(struct net_device *netdev, unsigned int txqueue) struct iavf_adapter *adapter = netdev_priv(netdev); adapter->tx_timeout_count++; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); } /** @@ -2478,7 +2480,7 @@ int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter) adapter->vsi_res->num_queue_pairs); adapter->flags |= IAVF_FLAG_REINIT_MSIX_NEEDED; adapter->num_req_queues = adapter->vsi_res->num_queue_pairs; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); return -EAGAIN; } @@ -2775,14 +2777,6 @@ static void iavf_watchdog_task(struct work_struct *work) if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) iavf_change_state(adapter, __IAVF_COMM_FAILED); - if (adapter->flags & IAVF_FLAG_RESET_NEEDED) { - adapter->aq_required = 0; - adapter->current_op = VIRTCHNL_OP_UNKNOWN; - mutex_unlock(&adapter->crit_lock); - queue_work(adapter->wq, &adapter->reset_task); - return; - } - switch (adapter->state) { case __IAVF_STARTUP: iavf_startup(adapter); @@ -2910,11 +2904,10 @@ static void iavf_watchdog_task(struct work_struct *work) /* check for hw reset */ reg_val = rd32(hw, IAVF_VF_ARQLEN1) & IAVF_VF_ARQLEN1_ARQENABLE_MASK; if (!reg_val) { - adapter->flags |= IAVF_FLAG_RESET_PENDING; adapter->aq_required = 0; adapter->current_op = VIRTCHNL_OP_UNKNOWN; dev_err(&adapter->pdev->dev, "Hardware reset detected\n"); - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING); mutex_unlock(&adapter->crit_lock); queue_delayed_work(adapter->wq, &adapter->watchdog_task, HZ * 2); @@ -3288,9 +3281,7 @@ static void iavf_adminq_task(struct work_struct *work) } while (pending); mutex_unlock(&adapter->crit_lock); - if ((adapter->flags & - (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) || - adapter->state == __IAVF_RESETTING) + if (iavf_is_reset_in_progress(adapter)) goto freedom; /* check for error indications */ @@ -4387,8 +4378,7 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu) } if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret < 0) netdev_warn(netdev, "MTU change interrupted waiting for reset"); diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 073ac29ed84c..be3c007ce90a 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -1961,9 +1961,8 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter, case VIRTCHNL_EVENT_RESET_IMPENDING: dev_info(&adapter->pdev->dev, "Reset indication received from the PF\n"); if (!(adapter->flags & IAVF_FLAG_RESET_PENDING)) { - adapter->flags |= IAVF_FLAG_RESET_PENDING; dev_info(&adapter->pdev->dev, "Scheduling reset task\n"); - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING); } break; default: -- cgit From 9efa1a5407e81265ea502cab83be4de503decc49 Mon Sep 17 00:00:00 2001 From: Fedor Ross Date: Thu, 4 May 2023 21:50:59 +0200 Subject: can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout The mcp251xfd controller needs an idle bus to enter 'Normal CAN 2.0 mode' or . The maximum length of a CAN frame is 736 bits (64 data bytes, CAN-FD, EFF mode, worst case bit stuffing and interframe spacing). For low bit rates like 10 kbit/s the arbitrarily chosen MCP251XFD_POLL_TIMEOUT_US of 1 ms is too small. Otherwise during polling for the CAN controller to enter 'Normal CAN 2.0 mode' the timeout limit is exceeded and the configuration fails with: | $ ip link set dev can1 up type can bitrate 10000 | [ 731.911072] mcp251xfd spi2.1 can1: Controller failed to enter mode CAN 2.0 Mode (6) and stays in Configuration Mode (4) (con=0x068b0760, osc=0x00000468). | [ 731.927192] mcp251xfd spi2.1 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying. | [ 731.938101] A link change request failed with some changes committed already. Interface can1 may have been left with an inconsistent configuration, please check. | RTNETLINK answers: Connection timed out Make MCP251XFD_POLL_TIMEOUT_US timeout calculation dynamic. Use maximum of 1ms and bit time of 1 full 64 data bytes CAN-FD frame in EFF mode, worst case bit stuffing and interframe spacing at the current bit rate. For easier backporting define the macro MCP251XFD_FRAME_LEN_MAX_BITS that holds the max frame length in bits, which is 736. This can be replaced by can_frame_bits(true, true, true, true, CANFD_MAX_DLEN) in a cleanup patch later. Fixes: 55e5b97f003e8 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Signed-off-by: Fedor Ross Signed-off-by: Marek Vasut Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230717-mcp251xfd-fix-increase-poll-timeout-v5-1-06600f34c684@pengutronix.de Signed-off-by: Marc Kleine-Budde --- drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 10 ++++++++-- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 1 + 2 files changed, 9 insertions(+), 2 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c index 68df6d4641b5..eebf967f4711 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c @@ -227,6 +227,8 @@ static int __mcp251xfd_chip_set_mode(const struct mcp251xfd_priv *priv, const u8 mode_req, bool nowait) { + const struct can_bittiming *bt = &priv->can.bittiming; + unsigned long timeout_us = MCP251XFD_POLL_TIMEOUT_US; u32 con = 0, con_reqop, osc = 0; u8 mode; int err; @@ -246,12 +248,16 @@ __mcp251xfd_chip_set_mode(const struct mcp251xfd_priv *priv, if (mode_req == MCP251XFD_REG_CON_MODE_SLEEP || nowait) return 0; + if (bt->bitrate) + timeout_us = max_t(unsigned long, timeout_us, + MCP251XFD_FRAME_LEN_MAX_BITS * USEC_PER_SEC / + bt->bitrate); + err = regmap_read_poll_timeout(priv->map_reg, MCP251XFD_REG_CON, con, !mcp251xfd_reg_invalid(con) && FIELD_GET(MCP251XFD_REG_CON_OPMOD_MASK, con) == mode_req, - MCP251XFD_POLL_SLEEP_US, - MCP251XFD_POLL_TIMEOUT_US); + MCP251XFD_POLL_SLEEP_US, timeout_us); if (err != -ETIMEDOUT && err != -EBADMSG) return err; diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h index 7024ff0cc2c0..24510b3b8020 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h @@ -387,6 +387,7 @@ static_assert(MCP251XFD_TIMESTAMP_WORK_DELAY_SEC < #define MCP251XFD_OSC_STAB_TIMEOUT_US (10 * MCP251XFD_OSC_STAB_SLEEP_US) #define MCP251XFD_POLL_SLEEP_US (10) #define MCP251XFD_POLL_TIMEOUT_US (USEC_PER_MSEC) +#define MCP251XFD_FRAME_LEN_MAX_BITS (736) /* Misc */ #define MCP251XFD_NAPI_WEIGHT 32 -- cgit From 2033ab90380d46e0e9f0520fd6776a73d107fd95 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Sat, 15 Jul 2023 18:36:05 +0300 Subject: vrf: Fix lockdep splat in output path Cited commit converted the neighbour code to use the standard RCU variant instead of the RCU-bh variant, but the VRF code still uses rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup code in its IPv4 and IPv6 output paths, resulting in lockdep splats [1][2]. Can be reproduced using [3]. Fix by switching to rcu_read_lock() / rcu_read_unlock(). [1] ============================= WARNING: suspicious RCU usage 6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted ----------------------------- include/net/neighbour.h:302 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ping/183: #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030 stack backtrace: CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 Call Trace: dump_stack_lvl+0xc1/0xf0 lockdep_rcu_suspicious+0x211/0x3b0 vrf_output+0x1380/0x2030 ip_push_pending_frames+0x125/0x2a0 raw_sendmsg+0x200d/0x33c0 inet_sendmsg+0xa2/0xe0 __sys_sendto+0x2aa/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [2] ============================= WARNING: suspicious RCU usage 6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted ----------------------------- include/net/neighbour.h:302 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ping6/182: #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310 stack backtrace: CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 Call Trace: dump_stack_lvl+0xc1/0xf0 lockdep_rcu_suspicious+0x211/0x3b0 vrf_output6+0xd32/0x1310 ip6_local_out+0xb4/0x1a0 ip6_send_skb+0xbc/0x340 ip6_push_pending_frames+0xe5/0x110 rawv6_sendmsg+0x2e6e/0x3e50 inet_sendmsg+0xa2/0xe0 __sys_sendto+0x2aa/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [3] #!/bin/bash ip link add name vrf-red up numtxqueues 2 type vrf table 10 ip link add name swp1 up master vrf-red type dummy ip address add 192.0.2.1/24 dev swp1 ip address add 2001:db8:1::1/64 dev swp1 ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1 ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1 ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh") Reported-by: Naresh Kamboju Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/ Signed-off-by: Ido Schimmel Reviewed-by: David Ahern Reviewed-by: Eric Dumazet Link: https://lore.kernel.org/r/20230715153605.4068066-1-idosch@nvidia.com Signed-off-by: Paolo Abeni --- drivers/net/vrf.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index bdb3a76a352e..6043e63b42f9 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -664,7 +664,7 @@ static int vrf_finish_output6(struct net *net, struct sock *sk, skb->protocol = htons(ETH_P_IPV6); skb->dev = dev; - rcu_read_lock_bh(); + rcu_read_lock(); nexthop = rt6_nexthop((struct rt6_info *)dst, &ipv6_hdr(skb)->daddr); neigh = __ipv6_neigh_lookup_noref(dst->dev, nexthop); if (unlikely(!neigh)) @@ -672,10 +672,10 @@ static int vrf_finish_output6(struct net *net, struct sock *sk, if (!IS_ERR(neigh)) { sock_confirm_neigh(skb, neigh); ret = neigh_output(neigh, skb, false); - rcu_read_unlock_bh(); + rcu_read_unlock(); return ret; } - rcu_read_unlock_bh(); + rcu_read_unlock(); IP6_INC_STATS(dev_net(dst->dev), ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES); @@ -889,7 +889,7 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s } } - rcu_read_lock_bh(); + rcu_read_lock(); neigh = ip_neigh_for_gw(rt, skb, &is_v6gw); if (!IS_ERR(neigh)) { @@ -898,11 +898,11 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s sock_confirm_neigh(skb, neigh); /* if crossing protocols, can not use the cached header */ ret = neigh_output(neigh, skb, is_v6gw); - rcu_read_unlock_bh(); + rcu_read_unlock(); return ret; } - rcu_read_unlock_bh(); + rcu_read_unlock(); vrf_tx_error(skb->dev, skb); return -EINVAL; } -- cgit From 8fcd7c7b3a38ab5e452f542fda8f7940e77e479a Mon Sep 17 00:00:00 2001 From: Geetha sowjanya Date: Sun, 16 Jul 2023 15:07:41 +0530 Subject: octeontx2-pf: Dont allocate BPIDs for LBK interfaces Current driver enables backpressure for LBK interfaces. But these interfaces do not support this feature. Hence, this patch fixes the issue by skipping the backpressure configuration for these interfaces. Fixes: 75f36270990c ("octeontx2-pf: Support to enable/disable pause frames via ethtool"). Signed-off-by: Geetha sowjanya Signed-off-by: Sunil Goutham Link: https://lore.kernel.org/r/20230716093741.28063-1-gakula@marvell.com Signed-off-by: Paolo Abeni --- drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c index fe8ea4e531b7..9551b422622a 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c @@ -1454,8 +1454,9 @@ static int otx2_init_hw_resources(struct otx2_nic *pf) if (err) goto err_free_npa_lf; - /* Enable backpressure */ - otx2_nix_config_bp(pf, true); + /* Enable backpressure for CGX mapped PF/VFs */ + if (!is_otx2_lbkvf(pf->pdev)) + otx2_nix_config_bp(pf, true); /* Init Auras and pools used by NIX RQ, for free buffer ptrs */ err = otx2_rq_aura_pool_init(pf); -- cgit From 78adb4bcf99effbb960c5f9091e2e062509d1030 Mon Sep 17 00:00:00 2001 From: Florian Kauer Date: Mon, 17 Jul 2023 10:54:44 -0700 Subject: igc: Prevent garbled TX queue with XDP ZEROCOPY In normal operation, each populated queue item has next_to_watch pointing to the last TX desc of the packet, while each cleaned item has it set to 0. In particular, next_to_use that points to the next (necessarily clean) item to use has next_to_watch set to 0. When the TX queue is used both by an application using AF_XDP with ZEROCOPY as well as a second non-XDP application generating high traffic, the queue pointers can get in an invalid state where next_to_use points to an item where next_to_watch is NOT set to 0. However, the implementation assumes at several places that this is never the case, so if it does hold, bad things happen. In particular, within the loop inside of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. Finally, this prevents any further transmission via this queue and it never gets unblocked or signaled. Secondly, if the queue is in this garbled state, the inner loop of igc_clean_tx_ring() will never terminate, completely hogging a CPU core. The reason is that igc_xdp_xmit_zc() reads next_to_use before acquiring the lock, and writing it back (potentially unmodified) later. If it got modified before locking, the outdated next_to_use is written pointing to an item that was already used elsewhere (and thus next_to_watch got written). Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") Signed-off-by: Florian Kauer Reviewed-by: Kurt Kanzenbach Tested-by: Kurt Kanzenbach Acked-by: Vinicius Costa Gomes Reviewed-by: Simon Horman Tested-by: Naama Meir Signed-off-by: Tony Nguyen Link: https://lore.kernel.org/r/20230717175444.3217831-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/intel/igc/igc_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 9f93f0f4f752..f36bc2a1849a 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2828,9 +2828,8 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) struct netdev_queue *nq = txring_txq(ring); union igc_adv_tx_desc *tx_desc = NULL; int cpu = smp_processor_id(); - u16 ntu = ring->next_to_use; struct xdp_desc xdp_desc; - u16 budget; + u16 budget, ntu; if (!netif_carrier_ok(ring->netdev)) return; @@ -2840,6 +2839,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) /* Avoid transmit queue timeout since we share it with the slow path */ txq_trans_cond_update(nq); + ntu = ring->next_to_use; budget = igc_desc_unused(ring); while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { -- cgit From e7002b3b20c58bce4a88c15aca8e6cc894e3a7ed Mon Sep 17 00:00:00 2001 From: Subbaraya Sundeep Date: Mon, 17 Jul 2023 11:46:43 +0530 Subject: octeontx2-pf: mcs: Generate hash key using ecb(aes) Hardware generated encryption and ICV tags are found to be wrong when tested with IEEE MACSEC test vectors. This is because as per the HRM, the hash key (derived by AES-ECB block encryption of an all 0s block with the SAK) has to be programmed by the software in MCSX_RS_MCS_CPM_TX_SLAVE_SA_PLCY_MEM_4X register. Hence fix this by generating hash key in software and configuring in hardware. Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Subbaraya Sundeep Reviewed-by: Kalesh AP Link: https://lore.kernel.org/r/1689574603-28093-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski --- .../ethernet/marvell/octeontx2/nic/cn10k_macsec.c | 137 +++++++++++++++------ 1 file changed, 100 insertions(+), 37 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c index 6e2fb24be8c1..59b138214af2 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c @@ -4,6 +4,7 @@ * Copyright (C) 2022 Marvell. */ +#include #include #include #include "otx2_common.h" @@ -42,6 +43,56 @@ #define MCS_TCI_E 0x08 /* encryption */ #define MCS_TCI_C 0x04 /* changed text */ +#define CN10K_MAX_HASH_LEN 16 +#define CN10K_MAX_SAK_LEN 32 + +static int cn10k_ecb_aes_encrypt(struct otx2_nic *pfvf, u8 *sak, + u16 sak_len, u8 *hash) +{ + u8 data[CN10K_MAX_HASH_LEN] = { 0 }; + struct skcipher_request *req = NULL; + struct scatterlist sg_src, sg_dst; + struct crypto_skcipher *tfm; + DECLARE_CRYPTO_WAIT(wait); + int err; + + tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0); + if (IS_ERR(tfm)) { + dev_err(pfvf->dev, "failed to allocate transform for ecb-aes\n"); + return PTR_ERR(tfm); + } + + req = skcipher_request_alloc(tfm, GFP_KERNEL); + if (!req) { + dev_err(pfvf->dev, "failed to allocate request for skcipher\n"); + err = -ENOMEM; + goto free_tfm; + } + + err = crypto_skcipher_setkey(tfm, sak, sak_len); + if (err) { + dev_err(pfvf->dev, "failed to set key for skcipher\n"); + goto free_req; + } + + /* build sg list */ + sg_init_one(&sg_src, data, CN10K_MAX_HASH_LEN); + sg_init_one(&sg_dst, hash, CN10K_MAX_HASH_LEN); + + skcipher_request_set_callback(req, 0, crypto_req_done, &wait); + skcipher_request_set_crypt(req, &sg_src, &sg_dst, + CN10K_MAX_HASH_LEN, NULL); + + err = crypto_skcipher_encrypt(req); + err = crypto_wait_req(err, &wait); + +free_req: + skcipher_request_free(req); +free_tfm: + crypto_free_skcipher(tfm); + return err; +} + static struct cn10k_mcs_txsc *cn10k_mcs_get_txsc(struct cn10k_mcs_cfg *cfg, struct macsec_secy *secy) { @@ -330,19 +381,53 @@ fail: return ret; } +static int cn10k_mcs_write_keys(struct otx2_nic *pfvf, + struct macsec_secy *secy, + struct mcs_sa_plcy_write_req *req, + u8 *sak, u8 *salt, ssci_t ssci) +{ + u8 hash_rev[CN10K_MAX_HASH_LEN]; + u8 sak_rev[CN10K_MAX_SAK_LEN]; + u8 salt_rev[MACSEC_SALT_LEN]; + u8 hash[CN10K_MAX_HASH_LEN]; + u32 ssci_63_32; + int err, i; + + err = cn10k_ecb_aes_encrypt(pfvf, sak, secy->key_len, hash); + if (err) { + dev_err(pfvf->dev, "Generating hash using ECB(AES) failed\n"); + return err; + } + + for (i = 0; i < secy->key_len; i++) + sak_rev[i] = sak[secy->key_len - 1 - i]; + + for (i = 0; i < CN10K_MAX_HASH_LEN; i++) + hash_rev[i] = hash[CN10K_MAX_HASH_LEN - 1 - i]; + + for (i = 0; i < MACSEC_SALT_LEN; i++) + salt_rev[i] = salt[MACSEC_SALT_LEN - 1 - i]; + + ssci_63_32 = (__force u32)cpu_to_be32((__force u32)ssci); + + memcpy(&req->plcy[0][0], sak_rev, secy->key_len); + memcpy(&req->plcy[0][4], hash_rev, CN10K_MAX_HASH_LEN); + memcpy(&req->plcy[0][6], salt_rev, MACSEC_SALT_LEN); + req->plcy[0][7] |= (u64)ssci_63_32 << 32; + + return 0; +} + static int cn10k_mcs_write_rx_sa_plcy(struct otx2_nic *pfvf, struct macsec_secy *secy, struct cn10k_mcs_rxsc *rxsc, u8 assoc_num, bool sa_in_use) { - unsigned char *src = rxsc->sa_key[assoc_num]; struct mcs_sa_plcy_write_req *plcy_req; - u8 *salt_p = rxsc->salt[assoc_num]; + u8 *sak = rxsc->sa_key[assoc_num]; + u8 *salt = rxsc->salt[assoc_num]; struct mcs_rx_sc_sa_map *map_req; struct mbox *mbox = &pfvf->mbox; - u64 ssci_salt_95_64 = 0; - u8 reg, key_len; - u64 salt_63_0; int ret; mutex_lock(&mbox->lock); @@ -360,20 +445,10 @@ static int cn10k_mcs_write_rx_sa_plcy(struct otx2_nic *pfvf, goto fail; } - for (reg = 0, key_len = 0; key_len < secy->key_len; key_len += 8) { - memcpy((u8 *)&plcy_req->plcy[0][reg], - (src + reg * 8), 8); - reg++; - } - - if (secy->xpn) { - memcpy((u8 *)&salt_63_0, salt_p, 8); - memcpy((u8 *)&ssci_salt_95_64, salt_p + 8, 4); - ssci_salt_95_64 |= (__force u64)rxsc->ssci[assoc_num] << 32; - - plcy_req->plcy[0][6] = salt_63_0; - plcy_req->plcy[0][7] = ssci_salt_95_64; - } + ret = cn10k_mcs_write_keys(pfvf, secy, plcy_req, sak, + salt, rxsc->ssci[assoc_num]); + if (ret) + goto fail; plcy_req->sa_index[0] = rxsc->hw_sa_id[assoc_num]; plcy_req->sa_cnt = 1; @@ -586,13 +661,10 @@ static int cn10k_mcs_write_tx_sa_plcy(struct otx2_nic *pfvf, struct cn10k_mcs_txsc *txsc, u8 assoc_num) { - unsigned char *src = txsc->sa_key[assoc_num]; struct mcs_sa_plcy_write_req *plcy_req; - u8 *salt_p = txsc->salt[assoc_num]; + u8 *sak = txsc->sa_key[assoc_num]; + u8 *salt = txsc->salt[assoc_num]; struct mbox *mbox = &pfvf->mbox; - u64 ssci_salt_95_64 = 0; - u8 reg, key_len; - u64 salt_63_0; int ret; mutex_lock(&mbox->lock); @@ -603,19 +675,10 @@ static int cn10k_mcs_write_tx_sa_plcy(struct otx2_nic *pfvf, goto fail; } - for (reg = 0, key_len = 0; key_len < secy->key_len; key_len += 8) { - memcpy((u8 *)&plcy_req->plcy[0][reg], (src + reg * 8), 8); - reg++; - } - - if (secy->xpn) { - memcpy((u8 *)&salt_63_0, salt_p, 8); - memcpy((u8 *)&ssci_salt_95_64, salt_p + 8, 4); - ssci_salt_95_64 |= (__force u64)txsc->ssci[assoc_num] << 32; - - plcy_req->plcy[0][6] = salt_63_0; - plcy_req->plcy[0][7] = ssci_salt_95_64; - } + ret = cn10k_mcs_write_keys(pfvf, secy, plcy_req, sak, + salt, txsc->ssci[assoc_num]); + if (ret) + goto fail; plcy_req->plcy[0][8] = assoc_num; plcy_req->sa_index[0] = txsc->hw_sa_id[assoc_num]; -- cgit From 78a93c31003cc53aca5d67b1bbe2d5b9fc37cc4d Mon Sep 17 00:00:00 2001 From: Yuanjun Gong Date: Mon, 17 Jul 2023 22:46:21 +0800 Subject: drivers: net: fix return value check in emac_tso_csum() in emac_tso_csum(), return an error code if an unexpected value is returned by pskb_trim(). Signed-off-by: Yuanjun Gong Reviewed-by: Kuniyuki Iwashima Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/emac/emac-mac.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 0d80447d4d3b..d5c688a8d7be 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -1260,8 +1260,11 @@ static int emac_tso_csum(struct emac_adapter *adpt, if (skb->protocol == htons(ETH_P_IP)) { u32 pkt_len = ((unsigned char *)ip_hdr(skb) - skb->data) + ntohs(ip_hdr(skb)->tot_len); - if (skb->len > pkt_len) - pskb_trim(skb, pkt_len); + if (skb->len > pkt_len) { + ret = pskb_trim(skb, pkt_len); + if (unlikely(ret)) + return ret; + } } hdr_len = skb_tcp_all_headers(skb); -- cgit From bce5603365d8184734ba7e6b22e74bd2c90a7167 Mon Sep 17 00:00:00 2001 From: Yuanjun Gong Date: Mon, 17 Jul 2023 22:46:52 +0800 Subject: drivers:net: fix return value check in ocelot_fdma_receive_skb ocelot_fdma_receive_skb should return false if an unexpected value is returned by pskb_trim. Signed-off-by: Yuanjun Gong Reviewed-by: Kuniyuki Iwashima Signed-off-by: David S. Miller --- drivers/net/ethernet/mscc/ocelot_fdma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/mscc/ocelot_fdma.c b/drivers/net/ethernet/mscc/ocelot_fdma.c index 8e3894cf5f7c..83a3ce0c568e 100644 --- a/drivers/net/ethernet/mscc/ocelot_fdma.c +++ b/drivers/net/ethernet/mscc/ocelot_fdma.c @@ -368,7 +368,8 @@ static bool ocelot_fdma_receive_skb(struct ocelot *ocelot, struct sk_buff *skb) if (unlikely(!ndev)) return false; - pskb_trim(skb, skb->len - ETH_FCS_LEN); + if (pskb_trim(skb, skb->len - ETH_FCS_LEN)) + return false; skb->dev = ndev; skb->protocol = eth_type_trans(skb, skb->dev); -- cgit From cf2ffdea0839398cb0551762af7f5efb0a6e0fea Mon Sep 17 00:00:00 2001 From: Heiner Kallweit Date: Tue, 18 Jul 2023 13:11:31 +0200 Subject: r8169: revert 2ab19de62d67 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll") There have been reports that on a number of systems this change breaks network connectivity. Therefore effectively revert it. Mainly affected seem to be systems where BIOS denies ASPM access to OS. Due to later changes we can't do a direct revert. Fixes: 2ab19de62d67 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/netdev/e47bac0d-e802-65e1-b311-6acb26d5cf10@freenet.de/T/ Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217596 Signed-off-by: Heiner Kallweit Link: https://lore.kernel.org/r/57f13ec0-b216-d5d8-363d-5b05528ec5fb@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index fce4a2b908c2..8a8b7d8a5c3f 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -623,6 +623,7 @@ struct rtl8169_private { int cfg9346_usage_count; unsigned supports_gmii:1; + unsigned aspm_manageable:1; dma_addr_t counters_phys_addr; struct rtl8169_counters *counters; struct rtl8169_tc_offsets tc_offset; @@ -2746,7 +2747,8 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable) if (tp->mac_version < RTL_GIGA_MAC_VER_32) return; - if (enable) { + /* Don't enable ASPM in the chip if OS can't control ASPM */ + if (enable && tp->aspm_manageable) { /* On these chip versions ASPM can even harm * bus communication of other PCI devices. */ @@ -5165,6 +5167,16 @@ done: rtl_rar_set(tp, mac_addr); } +/* register is set if system vendor successfully tested ASPM 1.2 */ +static bool rtl_aspm_is_safe(struct rtl8169_private *tp) +{ + if (tp->mac_version >= RTL_GIGA_MAC_VER_61 && + r8168_mac_ocp_read(tp, 0xc0b2) & 0xf) + return true; + + return false; +} + static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) { struct rtl8169_private *tp; @@ -5234,6 +5246,19 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) xid); tp->mac_version = chipset; + /* Disable ASPM L1 as that cause random device stop working + * problems as well as full system hangs for some PCIe devices users. + * Chips from RTL8168h partially have issues with L1.2, but seem + * to work fine with L1 and L1.1. + */ + if (rtl_aspm_is_safe(tp)) + rc = 0; + else if (tp->mac_version >= RTL_GIGA_MAC_VER_46) + rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2); + else + rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1); + tp->aspm_manageable = !rc; + tp->dash_type = rtl_check_dash(tp); tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK; -- cgit From e31a9fedc7d8d80722b19628e66fcb5a36981780 Mon Sep 17 00:00:00 2001 From: Heiner Kallweit Date: Tue, 18 Jul 2023 13:12:32 +0200 Subject: Revert "r8169: disable ASPM during NAPI poll" This reverts commit e1ed3e4d91112027b90c7ee61479141b3f948e6a. Turned out the change causes a performance regression. Link: https://lore.kernel.org/netdev/20230713124914.GA12924@green245/T/ Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit Link: https://lore.kernel.org/r/055c6bc2-74fa-8c67-9897-3f658abb5ae7@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/realtek/r8169_main.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 8a8b7d8a5c3f..5eb50b265c0b 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4523,10 +4523,6 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) } if (napi_schedule_prep(&tp->napi)) { - rtl_unlock_config_regs(tp); - rtl_hw_aspm_clkreq_enable(tp, false); - rtl_lock_config_regs(tp); - rtl_irq_disable(tp); __napi_schedule(&tp->napi); } @@ -4586,14 +4582,9 @@ static int rtl8169_poll(struct napi_struct *napi, int budget) work_done = rtl_rx(dev, tp, budget); - if (work_done < budget && napi_complete_done(napi, work_done)) { + if (work_done < budget && napi_complete_done(napi, work_done)) rtl_irq_enable(tp); - rtl_unlock_config_regs(tp); - rtl_hw_aspm_clkreq_enable(tp, true); - rtl_lock_config_regs(tp); - } - return work_done; } -- cgit From 9f9d4c1a2e82174a4e799ec405284a2b0de32b6a Mon Sep 17 00:00:00 2001 From: Daniel Golle Date: Wed, 19 Jul 2023 01:39:36 +0100 Subject: net: ethernet: mtk_eth_soc: always mtk_get_ib1_pkt_type entries and bind debugfs files would display wrong data on NETSYS_V2 and later because instead of using mtk_get_ib1_pkt_type the driver would use MTK_FOE_IB1_PACKET_TYPE which corresponds to NETSYS_V1(.x) SoCs. Use mtk_get_ib1_pkt_type so entries and bind records display correctly. Fixes: 03a3180e5c09e ("net: ethernet: mtk_eth_soc: introduce flow offloading support for mt7986") Signed-off-by: Daniel Golle Acked-by: Lorenzo Bianconi Link: https://lore.kernel.org/r/c0ae03d0182f4d27b874cbdf0059bc972c317f3c.1689727134.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/net') diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c index 316fe2e70fea..1a97feca77f2 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c @@ -98,7 +98,7 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind) acct = mtk_foe_entry_get_mib(ppe, i, NULL); - type = FIELD_GET(MTK_FOE_IB1_PACKET_TYPE, entry->ib1); + type = mtk_get_ib1_pkt_type(ppe->eth, entry->ib1); seq_printf(m, "%05x %s %7s", i, mtk_foe_entry_state_str(state), mtk_foe_pkt_type_str(type)); -- cgit From 1c613beaf877c0c0d755853dc62687e2013e55c4 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Thu, 20 Jul 2023 03:02:31 +0300 Subject: net: phy: prevent stale pointer dereference in phy_init() mdio_bus_init() and phy_driver_register() both have error paths, and if those are ever hit, ethtool will have a stale pointer to the phy_ethtool_phy_ops stub structure, which references memory from a module that failed to load (phylib). It is probably hard to force an error in this code path even manually, but the error teardown path of phy_init() should be the same as phy_exit(), which is now simply not the case. Fixes: 55d8f053ce1b ("net: phy: Register ethtool PHY operations") Link: https://lore.kernel.org/netdev/ZLaiJ4G6TaJYGJyU@shell.armlinux.org.uk/ Suggested-by: Russell King (Oracle) Signed-off-by: Vladimir Oltean Link: https://lore.kernel.org/r/20230720000231.1939689-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/phy/phy_device.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) (limited to 'drivers/net') diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 0c2014accba7..61921d4dbb13 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3451,23 +3451,30 @@ static int __init phy_init(void) { int rc; + ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops); + rc = mdio_bus_init(); if (rc) - return rc; + goto err_ethtool_phy_ops; - ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops); features_init(); rc = phy_driver_register(&genphy_c45_driver, THIS_MODULE); if (rc) - goto err_c45; + goto err_mdio_bus; rc = phy_driver_register(&genphy_driver, THIS_MODULE); - if (rc) { - phy_driver_unregister(&genphy_c45_driver); + if (rc) + goto err_c45; + + return 0; + err_c45: - mdio_bus_exit(); - } + phy_driver_unregister(&genphy_c45_driver); +err_mdio_bus: + mdio_bus_exit(); +err_ethtool_phy_ops: + ethtool_set_ethtool_phy_ops(NULL); return rc; } -- cgit