linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-10-25	iavf: in iavf_down, disable queues when removing the driver	Michal Schmidt
	In iavf_down, we're skipping the scheduling of certain operations if the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES request must not be skipped in this case, because iavf_close waits for the transition to the __IAVF_DOWN state, which happens in iavf_virtchnl_completion after the queues are released. Without this fix, "rmmod iavf" takes half a second per interface that's up and prints the "Device resources not yet released" warning. Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Tested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-24	i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR	Ivan Vecera
	The I40E_TXR_FLAGS_WB_ON_ITR is i40e_ring flag and not i40e_pf one. Fixes: 8e0764b4d6be42 ("i40e/i40evf: Add support for writeback on ITR feature for X722") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231023212714.178032-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-23	sfc: cleanup and reduce netlink error messages	Pieter Jansen van Vuuren
	Reduce the length of netlink error messages as they are likely to be truncated anyway. Additionally, reword netlink error messages so they are more consistent with previous messages. Fixes: 9dbc8d2b9a02 ("sfc: add decrement ipv6 hop limit by offloading set hop limit actions") Fixes: 3c9561c0a5b9 ("sfc: support TC decap rules matching on enc_ip_tos") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310202136.4u7bv0hp-lkp@intel.com/ Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20231020140149.30490-1-pieter.jansen-van-vuuren@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-22	net: chelsio: cxgb4: add an error code check in t4_load_phy_fw	Su Hui
	t4_set_params_timeout() can return -EINVAL if failed, add check for this. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-21	net: ethernet: adi: adin1110: Fix uninitialized variable	Dell Jin
	The spi_transfer struct has to have all it's fields initialized to 0 in this case, since not all of them are set before starting the transfer. Otherwise, spi_sync_transfer() will sometimes return an error. Fixes: a526a3cc9c8d ("net: ethernet: adi: adin1110: Fix SPI transfers") Signed-off-by: Dell Jin <dell.jin.code@outlook.com> Signed-off-by: Ciprian Regus <ciprian.regus@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-21	net: stmmac: update MAC capabilities when tx queues are updated	Michael Sit Wei Hong
	Upon boot up, the driver will configure the MAC capabilities based on the maximum number of tx and rx queues. When the user changes the tx queues to single queue, the MAC should be capable of supporting Half Duplex, but the driver does not update the MAC capabilities when it is configured so. Using the stmmac_reinit_queues() to check the number of tx queues and set the MAC capabilities accordingly. Fixes: 0366f7e06a6b ("net: stmmac: add ethtool support for get/set channels") Cc: <stable@vger.kernel.org> # 5.17+ Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com> Signed-off-by: Gan, Yi Fang <yi.fang.gan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-21	net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF	Rob Herring
	Commit b0377116decd ("net: ethernet: Use device_get_match_data()") dropped the unconditional use of xgene_enet_of_match resulting in this warning: drivers/net/ethernet/apm/xgene/xgene_enet_main.c:2004:34: warning: unused variable 'xgene_enet_of_match' [-Wunused-const-variable] The fix is to drop of_match_ptr() which is not necessary because DT is always used for this driver (well, it could in theory support ACPI only, but CONFIG_OF is always enabled for arm64). Fixes: b0377116decd ("net: ethernet: Use device_get_match_data()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310170627.2Kvf6ZHY-lkp@intel.com/ Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	i40e: sync next_to_clean and next_to_process for programming status desc	Tirthendu Sarkar
	When a programming status desc is encountered on the rx_ring, next_to_process is bumped along with cleaned_count but next_to_clean is not. This causes I40E_DESC_UNUSED() macro to misbehave resulting in overwriting whole ring with new buffers. Update next_to_clean to point to next_to_process on seeing a programming status desc if not in the middle of handling a multi-frag packet. Also, bump cleaned_count only for such case as otherwise next_to_clean buffer may be returned to hardware on reaching clean_threshold. Fixes: e9031f2da1ae ("i40e: introduce next_to_process to i40e_ring") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reported-by: hq.dev+kernel@msdfc.xyz Reported by: Solomon Peachy <pizza@shaftnet.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217678 Tested-by: hq.dev+kernel@msdfc.xyz Tested by: Indrek Järve <incx@dustbite.net> Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20231019203852.3663665-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-20	igc: Fix ambiguity in the ethtool advertising	Sasha Neftin
	The 'ethtool_convert_link_mode_to_legacy_u32' method does not allow us to advertise 2500M speed support and TP (twisted pair) properly. Convert to 'ethtool_link_ksettings_test_link_mode' to advertise supported speed and eliminate ambiguity. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Suggested-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231019203641.3661960-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-20	igb: Fix potential memory leak in igb_add_ethtool_nfc_entry	Mateusz Palczewski
	Add check for return of igb_update_ethtool_nfc_entry so that in case of any potential errors the memory alocated for input will be freed. Fixes: 0e71def25281 ("igb: add support of RX network flow classification") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	treewide: Spelling fix in comment	Kunwu Chan
	reques -> request Fixes: 09dde54c6a69 ("PS3: gelic: Add wireless support for PS3") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	i40e: Fix I40E_FLAG_VF_VLAN_PRUNING value	Ivan Vecera
	Commit c87c938f62d8f1 ("i40e: Add VF VLAN pruning") added new PF flag I40E_FLAG_VF_VLAN_PRUNING but its value collides with existing I40E_FLAG_TOTAL_PORT_SHUTDOWN_ENABLED flag. Move the affected flag at the end of the flags and fix its value. Reproducer: [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close on [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning on [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off [ 6323.142585] i40e 0000:02:00.0: Setting link-down-on-close not supported on this port (because total-port-shutdown is enabled) netlink error: Operation not supported [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning off [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off The link-down-on-close flag cannot be modified after setting vf-vlan-pruning because vf-vlan-pruning shares the same bit with total-port-shutdown flag that prevents any modification of link-down-on-close flag. Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning") Cc: Mateusz Palczewski <mateusz.palczewski@intel.com> Cc: Simon Horman <horms@kernel.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	iavf: initialize waitqueues before starting watchdog_task	Michal Schmidt
	It is not safe to initialize the waitqueues after queueing the watchdog_task. It will be using them. The chance of this causing a real problem is very small, because there will be some sleeping before any of the waitqueues get used. I got a crash only after inserting an artificial sleep in iavf_probe. Queue the watchdog_task as the last step in iavf_probe. Add a comment to prevent repeating the mistake. Fixes: fe2647ab0c99 ("i40evf: prevent VF close returning before state transitions to DOWN") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1	Mirsad Goran Todorovac
	KCSAN reported the following data-race bug: ================================================================== BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169 race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21: rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169 __napi_poll (net/core/dev.c:6527) net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727) __do_softirq (kernel/softirq.c:553) __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632) irq_exit_rcu (kernel/softirq.c:647) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14)) asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645) cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) cpuidle_enter (drivers/cpuidle/cpuidle.c:390) call_cpuidle (kernel/sched/idle.c:135) do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282) cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) value changed: 0x80003fff -> 0x3402805f Reported by Kernel Concurrency Sanitizer on: CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41 Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 ================================================================== drivers/net/ethernet/realtek/r8169_main.c: ========================================== 4429 → 4430 status = le32_to_cpu(desc->opts1); 4431 if (status & DescOwn) 4432 break; 4433 4434 /* This barrier is needed to keep us from reading 4435 * any other fields out of the Rx descriptor until 4436 * we know the status of DescOwn 4437 */ 4438 dma_rmb(); 4439 4440 if (unlikely(status & RxRES)) { 4441 if (net_ratelimit()) 4442 netdev_warn(dev, "Rx ERROR. status = %08x\n", Marco Elver explained that dma_rmb() doesn't prevent the compiler to tear up the access to desc->opts1 which can be written to concurrently. READ_ONCE() should prevent that from happening: 4429 → 4430 status = le32_to_cpu(READ_ONCE(desc->opts1)); 4431 if (status & DescOwn) 4432 break; 4433 As the consequence of this fix, this KCSAN warning was eliminated. Fixes: 6202806e7c03a ("r8169: drop member opts1_mask from struct rtl8169_private") Suggested-by: Marco Elver <elver@google.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: nic_swsd@realtek.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/ Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Acked-by: Marco Elver <elver@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	r8169: fix the KCSAN reported data-race in rtl_tx while reading ↵	Mirsad Goran Todorovac
	TxDescArray[entry].opts1 KCSAN reported the following data-race: ================================================================== BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169 race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21: rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169 __napi_poll (net/core/dev.c:6527) net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727) __do_softirq (kernel/softirq.c:553) __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632) irq_exit_rcu (kernel/softirq.c:647) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14)) asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645) cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) cpuidle_enter (drivers/cpuidle/cpuidle.c:390) call_cpuidle (kernel/sched/idle.c:135) do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282) cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) value changed: 0xb0000042 -> 0x00000000 Reported by Kernel Concurrency Sanitizer on: CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41 Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 ================================================================== The read side is in drivers/net/ethernet/realtek/r8169_main.c ========================================= 4355 static void rtl_tx(struct net_device dev, struct rtl8169_private tp, 4356 int budget) 4357 { 4358 unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0; 4359 struct sk_buff skb; 4360 4361 dirty_tx = tp->dirty_tx; 4362 4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) { 4364 unsigned int entry = dirty_tx % NUM_TX_DESC; 4365 u32 status; 4366 → 4367 status = le32_to_cpu(tp->TxDescArray[entry].opts1); 4368 if (status & DescOwn) 4369 break; 4370 4371 skb = tp->tx_skb[entry].skb; 4372 rtl8169_unmap_tx_skb(tp, entry); 4373 4374 if (skb) { 4375 pkts_compl++; 4376 bytes_compl += skb->len; 4377 napi_consume_skb(skb, budget); 4378 } 4379 dirty_tx++; 4380 } 4381 4382 if (tp->dirty_tx != dirty_tx) { 4383 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl); 4384 WRITE_ONCE(tp->dirty_tx, dirty_tx); 4385 4386 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl, 4387 rtl_tx_slots_avail(tp), 4388 R8169_TX_START_THRS); 4389 / 4390 * 8168 hack: TxPoll requests are lost when the Tx packets are 4391 * too close. Let's kick an extra TxPoll request when a burst 4392 * of start_xmit activity is detected (if it is not detected, 4393 * it is slow enough). -- FR 4394 * If skb is NULL then we come here again once a tx irq is 4395 * triggered after the last fragment is marked transmitted. 4396 */ 4397 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb) 4398 rtl8169_doorbell(tp); 4399 } 4400 } tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes this KCSAN warning. 4366 → 4367 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1)); 4368 if (status & DescOwn) 4369 break; 4370 Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: nic_swsd@realtek.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Marco Elver <elver@google.com> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/ Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Acked-by: Marco Elver <elver@google.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20	r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx	Mirsad Goran Todorovac
	KCSAN reported the following data-race: ================================================================== BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169] write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29: rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169 dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560) sch_direct_xmit (net/sched/sch_generic.c:342) __dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306) ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233) __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293) ip_finish_output (net/ipv4/ip_output.c:328) ip_output (net/ipv4/ip_output.c:435) ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486) udp_send_skb (net/ipv4/udp.c:963) udp_sendmsg (net/ipv4/udp.c:1246) inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4)) sock_sendmsg (net/socket.c:730 net/socket.c:753) __sys_sendto (net/socket.c:2177) __x64_sys_sendto (net/socket.c:2185) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21: rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169 __napi_poll (net/core/dev.c:6527) net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727) __do_softirq (kernel/softirq.c:553) __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632) irq_exit_rcu (kernel/softirq.c:647) common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14)) asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636) cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) cpuidle_enter (drivers/cpuidle/cpuidle.c:390) call_cpuidle (kernel/sched/idle.c:135) do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282) cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) value changed: 0x002f4815 -> 0x002f4816 Reported by Kernel Concurrency Sanitizer on: CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41 Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 ================================================================== The write side of drivers/net/ethernet/realtek/r8169_main.c is: ================== 4251 /* rtl_tx needs to see descriptor changes before updated tp->cur_tx / 4252 smp_wmb(); 4253 → 4254 WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1); 4255 4256 stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp), 4257 R8169_TX_STOP_THRS, 4258 R8169_TX_START_THRS); The read side is the function rtl_tx(): 4355 static void rtl_tx(struct net_device dev, struct rtl8169_private tp, 4356 int budget) 4357 { 4358 unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0; 4359 struct sk_buff skb; 4360 4361 dirty_tx = tp->dirty_tx; 4362 4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) { 4364 unsigned int entry = dirty_tx % NUM_TX_DESC; 4365 u32 status; 4366 4367 status = le32_to_cpu(tp->TxDescArray[entry].opts1); 4368 if (status & DescOwn) 4369 break; 4370 4371 skb = tp->tx_skb[entry].skb; 4372 rtl8169_unmap_tx_skb(tp, entry); 4373 4374 if (skb) { 4375 pkts_compl++; 4376 bytes_compl += skb->len; 4377 napi_consume_skb(skb, budget); 4378 } 4379 dirty_tx++; 4380 } 4381 4382 if (tp->dirty_tx != dirty_tx) { 4383 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl); 4384 WRITE_ONCE(tp->dirty_tx, dirty_tx); 4385 4386 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl, 4387 rtl_tx_slots_avail(tp), 4388 R8169_TX_START_THRS); 4389 /* 4390 * 8168 hack: TxPoll requests are lost when the Tx packets are 4391 * too close. Let's kick an extra TxPoll request when a burst 4392 * of start_xmit activity is detected (if it is not detected, 4393 * it is slow enough). -- FR 4394 * If skb is NULL then we come here again once a tx irq is 4395 * triggered after the last fragment is marked transmitted. 4396 */ → 4397 if (tp->cur_tx != dirty_tx && skb) 4398 rtl8169_doorbell(tp); 4399 } 4400 } Obviously from the code, an earlier detected data-race for tp->cur_tx was fixed in the line 4363: 4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) { but the same solution is required for protecting the other access to tp->cur_tx: → 4397 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb) 4398 rtl8169_doorbell(tp); The write in the line 4254 is protected with WRITE_ONCE(), but the read in the line 4397 might have suffered read tearing under some compiler optimisations. The fix eliminated the KCSAN data-race report for this bug. It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363 and line 4397. This test should certainly not be cached by the compiler in some register for such a long time, while asynchronous writes to tp->cur_tx might have occurred in line 4254 in the meantime. Fixes: 94d8a98e6235c ("r8169: reduce number of workaround doorbell rings") Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: nic_swsd@realtek.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Marco Elver <elver@google.com> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/ Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Acked-by: Marco Elver <elver@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-19	i40e: xsk: remove count_mask	Maciej Fijalkowski
	Cited commit introduced a neat way of updating next_to_clean that does not require boundary checks on each increment. This was done by masking the new value with (ring length - 1) mask. Problem is that this is applicable only for power of 2 ring sizes, for every other size this assumption can not be made. In turn, it leads to cleaning descriptors out of order as well as splats: [ 1388.411915] Workqueue: events xp_release_deferred [ 1388.411919] RIP: 0010:xp_free+0x1a/0x50 [ 1388.411921] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 8b 57 70 48 8d 47 70 48 89 e5 48 39 d0 74 06 <5d> c3 cc cc cc cc 48 8b 57 60 83 82 b8 00 00 00 01 48 8b 57 60 48 [ 1388.411922] RSP: 0018:ffa0000000a83cb0 EFLAGS: 00000206 [ 1388.411923] RAX: ff11000119aa5030 RBX: 000000000000001d RCX: ff110001129b6e50 [ 1388.411924] RDX: ff11000119aa4fa0 RSI: 0000000055555554 RDI: ff11000119aa4fc0 [ 1388.411925] RBP: ffa0000000a83cb0 R08: 0000000000000000 R09: 0000000000000000 [ 1388.411926] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000115829b80 [ 1388.411927] R13: 000000000000005f R14: 0000000000000000 R15: ff11000119aa4fc0 [ 1388.411928] FS: 0000000000000000(0000) GS:ff11000277e00000(0000) knlGS:0000000000000000 [ 1388.411929] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1388.411930] CR2: 00007f1f564e6c14 CR3: 000000000783c005 CR4: 0000000000771ef0 [ 1388.411931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1388.411931] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1388.411932] PKRU: 55555554 [ 1388.411933] Call Trace: [ 1388.411934] <IRQ> [ 1388.411935] ? show_regs+0x6e/0x80 [ 1388.411937] ? watchdog_timer_fn+0x1d2/0x240 [ 1388.411939] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 1388.411941] ? __hrtimer_run_queues+0x10e/0x290 [ 1388.411945] ? clockevents_program_event+0xae/0x130 [ 1388.411947] ? hrtimer_interrupt+0x105/0x240 [ 1388.411949] ? __sysvec_apic_timer_interrupt+0x54/0x150 [ 1388.411952] ? sysvec_apic_timer_interrupt+0x7f/0x90 [ 1388.411955] </IRQ> [ 1388.411955] <TASK> [ 1388.411956] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30 [ 1388.411958] ? xp_free+0x1a/0x50 [ 1388.411960] i40e_xsk_clean_rx_ring+0x5d/0x100 [i40e] [ 1388.411968] i40e_clean_rx_ring+0x14c/0x170 [i40e] [ 1388.411977] i40e_queue_pair_disable+0xda/0x260 [i40e] [ 1388.411986] i40e_xsk_pool_setup+0x192/0x1d0 [i40e] [ 1388.411993] i40e_reconfig_rss_queues+0x1f0/0x1450 [i40e] [ 1388.412002] xp_disable_drv_zc+0x73/0xf0 [ 1388.412004] ? mutex_lock+0x17/0x50 [ 1388.412007] xp_release_deferred+0x2b/0xc0 [ 1388.412010] process_one_work+0x178/0x350 [ 1388.412011] ? __pfx_worker_thread+0x10/0x10 [ 1388.412012] worker_thread+0x2f7/0x420 [ 1388.412014] ? __pfx_worker_thread+0x10/0x10 [ 1388.412015] kthread+0xf8/0x130 [ 1388.412017] ? __pfx_kthread+0x10/0x10 [ 1388.412019] ret_from_fork+0x3d/0x60 [ 1388.412021] ? __pfx_kthread+0x10/0x10 [ 1388.412023] ret_from_fork_asm+0x1b/0x30 [ 1388.412026] </TASK> It comes from picking wrong ring entries when cleaning xsk buffers during pool detach. Remove the count_mask logic and use they boundary check when updating next_to_process (which used to be a next_to_clean). Fixes: c8a8ca3408dc ("i40e: remove unnecessary memory writes of the next to clean pointer") Reported-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Tested-by: Tushar Vyavahare <tushar.vyavahare@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231018163908.40841-1-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19	net: ti: icssg-prueth: Fix r30 CMDs bitmasks	MD Danish Anwar
	The bitmasks for EMAC_PORT_DISABLE and EMAC_PORT_FORWARD r30 commands are wrong in the driver. Update the bitmasks of these commands to the correct ones as used by the ICSSG firmware. These bitmasks are backwards compatible and work with any ICSSG firmware version. Fixes: e9b4ece7d74b ("net: ti: icssg-prueth: Add Firmware config and classification APIs.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231018150715.3085380-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19	net: ethernet: ti: Fix mixed module-builtin object	MD Danish Anwar
	With CONFIG_TI_K3_AM65_CPSW_NUSS=y and CONFIG_TI_ICSSG_PRUETH=m, k3-cppi-desc-pool.o is linked to a module and also to vmlinux even though the expected CFLAGS are different between builtins and modules. The build system is complaining about the following: k3-cppi-desc-pool.o is added to multiple modules: icssg-prueth ti-am65-cpsw-nuss Introduce the new module, k3-cppi-desc-pool, to provide the common functions to ti-am65-cpsw-nuss and icssg-prueth. Fixes: 128d5874c082 ("net: ti: icssg-prueth: Add ICSSG ethernet driver") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20231018064936.3146846-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-18	octeon_ep: update BQL sent bytes before ringing doorbell	Shinas Rasheed
	Sometimes Tx is completed immediately after doorbell is updated, which causes Tx completion routing to update completion bytes before the same packet bytes are updated in sent bytes in transmit function, hence hitting BUG_ON() in dql_completed(). To avoid this, update BQL sent bytes before ringing doorbell. Fixes: 37d79d059606 ("octeon_ep: add Tx/Rx processing and interrupt support") Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://lore.kernel.org/r/20231017105030.2310966-1-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-17	gve: Do not fully free QPL pages on prefill errors	Shailend Chand
	The prefill function should have only removed the page count bias it added. Fully freeing the page will cause gve_free_queue_page_list to free a page the driver no longer owns. Fixes: 82fd151d38d9 ("gve: Reduce alloc and copy costs in the GQ rx path") Signed-off-by: Shailend Chand <shailend@google.com> Link: https://lore.kernel.org/r/20231014014121.2843922-1-shailend@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-15	qed: fix LL2 RX buffer allocation	Manish Chopra
	Driver allocates the LL2 rx buffers from kmalloc() area to construct the skb using slab_build_skb() The required size allocation seems to have overlooked for accounting both skb_shared_info size and device placement padding bytes which results into the below panic when doing skb_put() for a standard MTU sized frame. skbuff: skb_over_panic: text:ffffffffc0b0225f len:1514 put:1514 head:ff3dabceaf39c000 data:ff3dabceaf39c042 tail:0x62c end:0x566 dev:<NULL> … skb_panic+0x48/0x4a skb_put.cold+0x10/0x10 qed_ll2b_complete_rx_packet+0x14f/0x260 [qed] qed_ll2_rxq_handle_completion.constprop.0+0x169/0x200 [qed] qed_ll2_rxq_completion+0xba/0x320 [qed] qed_int_sp_dpc+0x1a7/0x1e0 [qed] This patch fixes this by accouting skb_shared_info and device placement padding size bytes when allocating the buffers. Cc: David S. Miller <davem@davemloft.net> Fixes: 0a7fb11c23c0 ("qed: Add Light L2 support") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13	Merge tag 'mlx5-fixes-2023-10-12' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-10-12 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command net/mlx5e: Don't offload internal port if filter device is out device net/mlx5e: Take RTNL lock before triggering netdev notifiers net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown net/mlx5e: RX, Fix page_pool allocation failure recovery for legacy rq net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq net/mlx5: Handle fw tracer change ownership event based on MTRC net/mlx5: Bridge, fix peer entry ageing in LAG mode net/mlx5: E-switch, register event handler before arming the event net/mlx5: Perform DMA operations in the right locations ==================== Link: https://lore.kernel.org/r/20231012195127.129585-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	ice: Fix safe mode when DDP is missing	Mateusz Pacuszka
	One thing is broken in the safe mode, that is ice_deinit_features() is being executed even that ice_init_features() was not causing stack trace during pci_unregister_driver(). Add check on the top of the function. Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-4-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	ice: reset first in crash dump kernels	Jesse Brandeburg
	When the system boots into the crash dump kernel after a panic, the ice networking device may still have pending transactions that can cause errors or machine checks when the device is re-enabled. This can prevent the crash dump kernel from loading the driver or collecting the crash data. To avoid this issue, perform a function level reset (FLR) on the ice device via PCIe config space before enabling it on the crash kernel. This will clear any outstanding transactions and stop all queues and interrupts. Restore the config space after the FLR, otherwise it was found in testing that the driver wouldn't load successfully. The following sequence causes the original issue: - Load the ice driver with modprobe ice - Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs - Trigger a crash with echo c > /proc/sysrq-trigger - Load the ice driver again (or let it load automatically) with modprobe ice - The system crashes again during pcim_enable_device() Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series") Reported-by: Vishal Agrawal <vagrawal@redhat.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	i40e: prevent crash on probe if hw registers have invalid values	Michal Schmidt
	The hardware provides the indexes of the first and the last available queue and VF. From the indexes, the driver calculates the numbers of queues and VFs. In theory, a faulty device might say the last index is smaller than the first index. In that case, the driver's calculation would underflow, it would attempt to write to non-existent registers outside of the ioremapped range and crash. I ran into this not by having a faulty device, but by an operator error. I accidentally ran a QE test meant for i40e devices on an ice device. The test used 'echo i40e > /sys/...ice PCI device.../driver_override', bound the driver to the device and crashed in one of the wr32 calls in i40e_clear_hw. Add checks to prevent underflows in the calculations of num_queues and num_vfs. With this fix, the wrong device probing reports errors and returns a failure without crashing. Fixes: 838d41d92a90 ("i40e: clear all queues and interrupts") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20231011233334.336092-2-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	net: ti: icssg-prueth: Fix tx_total_bytes count	MD Danish Anwar
	ICSSG HW stats on TX side considers 8 preamble bytes as data bytes. Due to this the tx_bytes of ICSSG interface doesn't match the rx_bytes of the link partner. There is no public errata available yet. As a workaround to fix this, decrease tx_bytes by 8 bytes for every tx frame. Fixes: c1e10d5dc7a1 ("net: ti: icssg-prueth: Add ICSSG Stats") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20231012064626.977466-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	tcp: allow again tcp_disconnect() when threads are waiting	Paolo Abeni
	As reported by Tom, .NET and applications build on top of it rely on connect(AF_UNSPEC) to async cancel pending I/O operations on TCP socket. The blamed commit below caused a regression, as such cancellation can now fail. As suggested by Eric, this change addresses the problem explicitly causing blocking I/O operation to terminate immediately (with an error) when a concurrent disconnect() is executed. Instead of tracking the number of threads blocked on a given socket, track the number of disconnect() issued on such socket. If such counter changes after a blocking operation releasing and re-acquiring the socket lock, error out the current operation. Fixes: 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting") Reported-by: Tom Deseyn <tdeseyn@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886305 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/f3b95e47e3dbed840960548aebaa8d954372db41.1697008693.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13	ice: fix over-shifted variable	Jesse Brandeburg
	Since the introduction of the ice driver the code has been double-shifting the RSS enabling field, because the define already has shifts in it and can't have the regular pattern of "a << shiftval & mask" applied. Most places in the code got it right, but one line was still wrong. Fix this one location for easy backports to stable. An in-progress patch fixes the defines to "standard" and will be applied as part of the regular -next process sometime after this one. Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> CC: stable@vger.kernel.org Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231010203101.406248-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12	net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command	Amir Tzin
	Although vf_vport entry of struct mlx5e_stats is never updated, its values are mistakenly copied to the caller structure in the VF representor .ndo_get_stat_64 callback mlx5e_rep_get_stats(). Remove redundant entry and use the updated one, rep_stats, instead. Fixes: 64b68e369649 ("net/mlx5: Refactor and expand rep vport stat group") Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5e: Don't offload internal port if filter device is out device	Jianbo Liu
	In the cited commit, if the routing device is ovs internal port, the out device is set to uplink, and packets go out after encapsulation. If filter device is uplink, it can trigger the following syndrome: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 3966): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xcdb051), err(-22) Fix this issue by not offloading internal port if filter device is out device. In this case, packets are not forwarded to the root table to be processed, the termination table is used instead to forward them from uplink to uplink. Fixes: 100ad4e2d758 ("net/mlx5e: Offload internal port as encap route device") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5e: Take RTNL lock before triggering netdev notifiers	Lama Kayal
	Hold RTNL lock when calling xdp_set_features() with a registered netdev, as the call triggers the netdev notifiers. This could happen when switching from nic profile to uplink representor for example. Similar logic which fixed a similar scenario was previously introduced in the following commit: commit 72cc65497065 net/mlx5e: Take RTNL lock when needed before calling xdp_set_features(). This fixes the following assertion and warning call trace: RTNL: assertion failed at net/core/dev.c (1961) WARNING: CPU: 13 PID: 2529 at net/core/dev.c:1961 call_netdevice_notifiers_info+0x7c/0x80 Modules linked in: rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_core zram zsmalloc fuse CPU: 13 PID: 2529 Comm: devlink Not tainted 6.5.0_for_upstream_min_debug_2023_09_07_20_04 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:call_netdevice_notifiers_info+0x7c/0x80 Code: 8f ff 80 3d 77 0d 16 01 00 75 c5 ba a9 07 00 00 48 c7 c6 c4 bb 0d 82 48 c7 c7 18 c8 06 82 c6 05 5b 0d 16 01 01 e8 44 f6 8c ff <0f> 0b eb a2 0f 1f 44 00 00 55 48 89 e5 41 54 48 83 e4 f0 48 83 ec RSP: 0018:ffff88819930f7f0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff8309f740 RCX: 0000000000000027 RDX: ffff88885fb5b5c8 RSI: 0000000000000001 RDI: ffff88885fb5b5c0 RBP: 0000000000000028 R08: ffff88887ffabaa8 R09: 0000000000000003 R10: ffff88887fecbac0 R11: ffff88887ff7bac0 R12: ffff88819930f810 R13: ffff88810b7fea40 R14: ffff8881154e8fd8 R15: ffff888107e881a0 FS: 00007f3ad248f800(0000) GS:ffff88885fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000563b85f164e0 CR3: 0000000113b5c006 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn+0x79/0x120 ? call_netdevice_notifiers_info+0x7c/0x80 ? report_bug+0x17c/0x190 ? handle_bug+0x3c/0x60 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? call_netdevice_notifiers_info+0x7c/0x80 call_netdevice_notifiers+0x2e/0x50 mlx5e_set_xdp_feature+0x21/0x50 [mlx5_core] mlx5e_build_rep_params+0x97/0x130 [mlx5_core] mlx5e_init_ul_rep+0x9f/0x100 [mlx5_core] mlx5e_netdev_init_profile+0x76/0x110 [mlx5_core] mlx5e_netdev_attach_profile+0x1f/0x90 [mlx5_core] mlx5e_netdev_change_profile+0x92/0x160 [mlx5_core] mlx5e_vport_rep_load+0x329/0x4a0 [mlx5_core] mlx5_esw_offloads_rep_load+0x9e/0xf0 [mlx5_core] esw_offloads_enable+0x4bc/0xe90 [mlx5_core] mlx5_eswitch_enable_locked+0x3c8/0x570 [mlx5_core] ? kmalloc_trace+0x25/0x80 mlx5_devlink_eswitch_mode_set+0x224/0x680 [mlx5_core] ? devlink_get_from_attrs_lock+0x9e/0x110 devlink_nl_cmd_eswitch_set_doit+0x60/0xe0 genl_family_rcv_msg_doit+0xd0/0x120 genl_rcv_msg+0x180/0x2b0 ? devlink_get_from_attrs_lock+0x110/0x110 ? devlink_nl_cmd_eswitch_get_doit+0x290/0x290 ? devlink_pernet_pre_exit+0xf0/0xf0 ? genl_family_rcv_msg_dumpit+0xf0/0xf0 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1fc/0x2c0 netlink_sendmsg+0x232/0x4a0 sock_sendmsg+0x38/0x60 ? _copy_from_user+0x2a/0x60 __sys_sendto+0x110/0x160 ? handle_mm_fault+0x161/0x260 ? do_user_addr_fault+0x276/0x620 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f3ad231340a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffd70aad4b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000c36b00 RCX:00007f3ad231340a RDX: 0000000000000038 RSI: 0000000000c36b00 RDI: 0000000000000003 RBP: 0000000000c36910 R08: 00007f3ad2625200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 </TASK> ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ Fixes: 4d5ab0ad964d ("net/mlx5e: take into account device reconfiguration for xdp_features flag") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown	Dragos Tatulea
	When mlx5e_xdp_xmit is called without the XDP_XMIT_FLUSH set it is possible that it leaves a mpwqe session open. That is ok during runtime: the session will be closed on the next call to mlx5e_xdp_xmit. But having a mpwqe session still open at XDP sq close time is problematic: the pc counter is not updated before flushing the contents of the xdpi_fifo. This results in leaking page fragments. The fix is to always close the mpwqe session at the end of mlx5e_xdp_xmit, regardless of the XDP_XMIT_FLUSH flag being set or not. Fixes: 5e0d2eef771e ("net/mlx5e: XDP, Support Enhanced Multi-Packet TX WQE") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5e: RX, Fix page_pool allocation failure recovery for legacy rq	Dragos Tatulea
	When a page allocation fails during refill in mlx5e_refill_rx_wqes, the page will be released again on the next refill call. This triggers the page_pool negative page fragment count warning below: [ 338.326070] WARNING: CPU: 4 PID: 0 at include/net/page_pool/helpers.h:130 mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] ... [ 338.328993] RIP: 0010:mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 338.329094] Call Trace: [ 338.329097] <IRQ> [ 338.329100] ? __warn+0x7d/0x120 [ 338.329105] ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 338.329173] ? report_bug+0x155/0x180 [ 338.329179] ? handle_bug+0x3c/0x60 [ 338.329183] ? exc_invalid_op+0x13/0x60 [ 338.329187] ? asm_exc_invalid_op+0x16/0x20 [ 338.329192] ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 338.329259] mlx5e_post_rx_wqes+0x210/0x5a0 [mlx5_core] [ 338.329327] ? mlx5e_poll_rx_cq+0x88/0x6f0 [mlx5_core] [ 338.329394] mlx5e_napi_poll+0x127/0x6b0 [mlx5_core] [ 338.329461] __napi_poll+0x25/0x1a0 [ 338.329465] net_rx_action+0x28a/0x300 [ 338.329468] __do_softirq+0xcd/0x279 [ 338.329473] irq_exit_rcu+0x6a/0x90 [ 338.329477] common_interrupt+0x82/0xa0 [ 338.329482] </IRQ> This patch fixes the legacy rq case by releasing all allocated fragments and then setting the skip flag on all released fragments. It is important to note that the number of released fragments will be higher than the number of allocated fragments when an allocation error occurs. Fixes: 3f93f82988bc ("net/mlx5e: RX, Defer page release in legacy rq for better recycling") Tested-by: Chris Mason <clm@fb.com> Reported-by: Chris Mason <clm@fb.com> Closes: https://lore.kernel.org/netdev/117FF31A-7BE0-4050-B2BB-E41F224FF72F@meta.com Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq	Dragos Tatulea
	When a page allocation fails during refill in mlx5e_post_rx_mpwqes, the page will be released again on the next refill call. This triggers the page_pool negative page fragment count warning below: [ 2436.447717] WARNING: CPU: 1 PID: 2419 at include/net/page_pool/helpers.h:130 mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] ... [ 2436.447895] RIP: 0010:mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 2436.447991] Call Trace: [ 2436.447975] mlx5e_post_rx_mpwqes+0x1d5/0xcf0 [mlx5_core] [ 2436.447994] <IRQ> [ 2436.447996] ? __warn+0x7d/0x120 [ 2436.448009] ? mlx5e_handle_rx_cqe_mpwrq+0x109/0x1d0 [mlx5_core] [ 2436.448002] ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 2436.448044] ? mlx5e_poll_rx_cq+0x87/0x6e0 [mlx5_core] [ 2436.448061] ? report_bug+0x155/0x180 [ 2436.448065] ? handle_bug+0x36/0x70 [ 2436.448067] ? exc_invalid_op+0x13/0x60 [ 2436.448070] ? asm_exc_invalid_op+0x16/0x20 [ 2436.448079] mlx5e_napi_poll+0x122/0x6b0 [mlx5_core] [ 2436.448077] ? mlx5e_page_release_fragmented.isra.0+0x42/0x50 [mlx5_core] [ 2436.448113] ? generic_exec_single+0x35/0x100 [ 2436.448117] __napi_poll+0x25/0x1a0 [ 2436.448120] net_rx_action+0x28a/0x300 [ 2436.448122] __do_softirq+0xcd/0x279 [ 2436.448126] irq_exit_rcu+0x6a/0x90 [ 2436.448128] sysvec_apic_timer_interrupt+0x6e/0x90 [ 2436.448130] </IRQ> This patch fixes the striding rq case by setting the skip flag on all the wqe pages that were expected to have new pages allocated. Fixes: 4c2a13236807 ("net/mlx5e: RX, Defer page release in striding rq for better recycling") Tested-by: Chris Mason <clm@fb.com> Reported-by: Chris Mason <clm@fb.com> Closes: https://lore.kernel.org/netdev/117FF31A-7BE0-4050-B2BB-E41F224FF72F@meta.com Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5: Handle fw tracer change ownership event based on MTRC	Maher Sanalla
	Currently, whenever fw issues a change ownership event, the PF that owns the fw tracer drops its ownership directly and the other PFs try to pick up the ownership via what MTRC register suggests. In some cases, driver releases the ownership of the tracer and reacquires it later on. Whenever the driver releases ownership of the tracer, fw issues a change ownership event. This event can be delayed and come after driver has reacquired ownership of the tracer. Thus the late event will trigger the tracer owner PF to release the ownership again and lead to a scenario where no PF is owning the tracer. To prevent the scenario described above, when handling a change ownership event, do not drop ownership of the tracer directly, instead read the fw MTRC register to retrieve the up-to-date owner of the tracer and set it accordingly in driver level. Fixes: f53aaa31cce7 ("net/mlx5: FW tracer, implement tracer logic") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5: Bridge, fix peer entry ageing in LAG mode	Vlad Buslov
	With current implementation in single FDB LAG mode all packets are processed by eswitch 0 rules. As such, 'peer' FDB entries receive the packets for rules of other eswitches and are responsible for updating the main entry by sending SWITCHDEV_FDB_ADD_TO_BRIDGE notification from their background update wq task. However, this introduces a race condition when non-zero eswitch instance decides to delete a FDB entry, sends SWITCHDEV_FDB_DEL_TO_BRIDGE notification, but another eswitch's update task refreshes the same entry concurrently while its async delete work is still pending on the workque. In such case another SWITCHDEV_FDB_ADD_TO_BRIDGE event may be generated and entry will remain stuck in FDB marked as 'offloaded' since no more SWITCHDEV_FDB_DEL_TO_BRIDGE notifications are sent for deleting the peer entries. Fix the issue by synchronously marking deleted entries with MLX5_ESW_BRIDGE_FLAG_DELETED flag and skipping them in background update job. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5: E-switch, register event handler before arming the event	Shay Drory
	Currently, mlx5 is registering event handler for vport context change event some time after arming the event. this can lead to missing an event, which will result in wrong rules in the FDB. Hence, register the event handler before arming the event. This solution is valid since FW is sending vport context change event only on vports which SW armed, and SW arming the vport when enabling it, which is done after the FDB has been created. Fixes: 6933a9379559 ("net/mlx5: E-Switch, Use async events chain") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	net/mlx5: Perform DMA operations in the right locations	Shay Drory
	The cited patch change mlx5 driver so that during probe DMA operations were performed before pci_enable_device(), and during teardown DMA operations were performed after pci_disable_device(). DMA operations require PCI to be enabled. Hence, The above leads to the following oops in PPC systems[1]. On s390x systems, as reported by Niklas Schnelle, this is a problem because mlx5_pci_init() is where the DMA and coherent mask is set but mlx5_cmd_init() already does a dma_alloc_coherent(). Thus a DMA allocation is done during probe before the correct mask is set. This causes probe to fail initialization of the cmdif SW structs on s390x after that is converted to the common dma-iommu code. This is because on s390x DMA addresses below 4 GiB are reserved on current machines and unlike the old s390x specific DMA API implementation common code enforces DMA masks. Fix it by performing the DMA operations during probe after pci_enable_device() and after the dma mask is set, and during teardown before pci_disable_device(). [1] Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 netconsole rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser ib_umad rdma_cm ib_ipoib iw_cm libiscsi scsi_transport_iscsi ib_cm ib_uverbs ib_core mlx5_core(-) ptp pps_core fuse vmx_crypto crc32c_vpmsum [last unloaded: mlx5_ib] CPU: 1 PID: 8937 Comm: modprobe Not tainted 6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02 #1 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c000000000423388 LR: c0000000001e733c CTR: c0000000001e4720 REGS: c0000000055636d0 TRAP: 0380 Not tainted (6.5.0-rc3_for_upstream_min_debug_2023_07_31_16_02) MSR: 8000000000009033 CR: 24008884 XER: 20040000 CFAR: c0000000001e7338 IRQMASK: 0 NIP [c000000000423388] __free_pages+0x28/0x160 LR [c0000000001e733c] dma_direct_free+0xac/0x190 Call Trace: [c000000005563970] [5deadbeef0000100] 0x5deadbeef0000100 (unreliable) [c0000000055639b0] [c0000000003d46cc] kfree+0x7c/0x150 [c000000005563a40] [c0000000001e47c8] dma_free_attrs+0xa8/0x1a0 [c000000005563aa0] [c008000000d0064c] mlx5_cmd_cleanup+0xa4/0x100 [mlx5_core] [c000000005563ad0] [c008000000cf629c] mlx5_mdev_uninit+0xf4/0x140 [mlx5_core] [c000000005563b00] [c008000000cf6448] remove_one+0x160/0x1d0 [mlx5_core] [c000000005563b40] [c000000000958540] pci_device_remove+0x60/0x110 [c000000005563b80] [c000000000a35e80] device_remove+0x70/0xd0 [c000000005563bb0] [c000000000a37a38] device_release_driver_internal+0x2a8/0x330 [c000000005563c00] [c000000000a37b8c] driver_detach+0x8c/0x160 [c000000005563c40] [c000000000a35350] bus_remove_driver+0x90/0x110 [c000000005563c80] [c000000000a38948] driver_unregister+0x48/0x90 [c000000005563cf0] [c000000000957e38] pci_unregister_driver+0x38/0x150 [c000000005563d40] [c008000000eb6140] mlx5_cleanup+0x38/0x90 [mlx5_core] Fixes: 06cd555f73ca ("net/mlx5: split mlx5_cmd_init() to probe and reload routines") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-12	rswitch: Fix imbalance phy_power_off() calling	Yoshihiro Shimoda
	The phy_power_off() should not be called if phy_power_on() failed. So, add a condition .power_count before calls phy_power_off(). Fixes: 5cb630925b49 ("net: renesas: rswitch: Add phy_power_{on,off}() calling") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	rswitch: Fix renesas_eth_sw_remove() implementation	Yoshihiro Shimoda
	Fix functions calling order and a condition in renesas_eth_sw_remove(). Otherwise, kernel NULL pointer dereference happens from phy_stop() if a net device opens. Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	octeontx2-pf: Fix page pool frag allocation warning	Ratheesh Kannoth
	Since page pool param's "order" is set to 0, will result in below warn message if interface is configured with higher rx buffer size. Steps to reproduce the issue. 1. devlink dev param set pci/0002:04:00.0 name receive_buffer_size \ value 8196 cmode runtime 2. ifconfig eth0 up [ 19.901356] ------------[ cut here ]------------ [ 19.901361] WARNING: CPU: 11 PID: 12331 at net/core/page_pool.c:567 page_pool_alloc_frag+0x3c/0x230 [ 19.901449] pstate: 82401009 (Nzcv daif +PAN -UAO +TCO -DIT +SSBS BTYPE=--) [ 19.901451] pc : page_pool_alloc_frag+0x3c/0x230 [ 19.901453] lr : __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf] [ 19.901460] sp : ffff80000f66b970 [ 19.901461] x29: ffff80000f66b970 x28: 0000000000000000 x27: 0000000000000000 [ 19.901464] x26: ffff800000d15b68 x25: ffff000195b5c080 x24: ffff0002a5a32dc0 [ 19.901467] x23: ffff0001063c0878 x22: 0000000000000100 x21: 0000000000000000 [ 19.901469] x20: 0000000000000000 x19: ffff00016f781000 x18: 0000000000000000 [ 19.901472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 19.901474] x14: 0000000000000000 x13: ffff0005ffdc9c80 x12: 0000000000000000 [ 19.901477] x11: ffff800009119a38 x10: 4c6ef2e3ba300519 x9 : ffff800000d13844 [ 19.901479] x8 : ffff0002a5a33cc8 x7 : 0000000000000030 x6 : 0000000000000030 [ 19.901482] x5 : 0000000000000005 x4 : 0000000000000000 x3 : 0000000000000a20 [ 19.901484] x2 : 0000000000001080 x1 : ffff80000f66b9d4 x0 : 0000000000001000 [ 19.901487] Call trace: [ 19.901488] page_pool_alloc_frag+0x3c/0x230 [ 19.901490] __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf] [ 19.901494] otx2_rq_aura_pool_init+0x1c4/0x240 [rvu_nicpf] [ 19.901498] otx2_open+0x228/0xa70 [rvu_nicpf] [ 19.901501] otx2vf_open+0x20/0xd0 [rvu_nicvf] [ 19.901504] __dev_open+0x114/0x1d0 [ 19.901507] __dev_change_flags+0x194/0x210 [ 19.901510] dev_change_flags+0x2c/0x70 [ 19.901512] devinet_ioctl+0x3a4/0x6c4 [ 19.901515] inet_ioctl+0x228/0x240 [ 19.901518] sock_ioctl+0x2ac/0x480 [ 19.901522] __arm64_sys_ioctl+0x564/0xe50 [ 19.901525] invoke_syscall.constprop.0+0x58/0xf0 [ 19.901529] do_el0_svc+0x58/0x150 [ 19.901531] el0_svc+0x30/0x140 [ 19.901533] el0t_64_sync_handler+0xe8/0x114 [ 19.901535] el0t_64_sync+0x1a0/0x1a4 [ 19.901537] ---[ end trace 678c0bf660ad8116 ]--- Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Link: https://lore.kernel.org/r/20231010034842.3807816-1-rkannoth@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-11	nfp: flower: avoid rmmod nfp crash issues	Yanguo Li
	When there are CT table entries, and you rmmod nfp, the following events can happen: task1： nfp_net_pci_remove ↓ nfp_flower_stop->(asynchronous)tcf_ct_flow_table_cleanup_work(3) ↓ nfp_zone_table_entry_destroy(1) task2： nfp_fl_ct_handle_nft_flow(2) When the execution order is (1)->(2)->(3), it will crash. Therefore, in the function nfp_fl_ct_del_flow, nf_flow_table_offload_del_cb needs to be executed synchronously. At the same time, in order to solve the deadlock problem and the problem of rtnl_lock sometimes failing, replace rtnl_lock with the private nfp_fl_lock. Fixes: 7cc93d888df7 ("nfp: flower-ct: remove callback delete deadlock") Cc: stable@vger.kernel.org Signed-off-by: Yanguo Li <yanguo.li@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-10	net/mlx5e: Again mutually exclude RX-FCS and RX-port-timestamp	Will Mortensen
	Commit 1e66220948df8 ("net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change") seems to have accidentally inverted the logic added in commit 0bc73ad46a76 ("net/mlx5e: Mutually exclude RX-FCS and RX-port-timestamp"). The impact of this is a little unclear since it seems the FCS scattered with RX-FCS is (usually?) correct regardless. Fixes: 1e66220948df8 ("net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change") Tested-by: Charlotte Tan <charlotte@extrahop.com> Reviewed-by: Charlotte Tan <charlotte@extrahop.com> Cc: Adham Faris <afaris@nvidia.com> Cc: Aya Levin <ayal@nvidia.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Moshe Shemesh <moshe@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Will Mortensen <will@extrahop.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20231006053706.514618-1-will@extrahop.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10	ixgbe: fix crash with empty VF macvlan list	Dan Carpenter
	The adapter->vf_mvs.l list needs to be initialized even if the list is empty. Otherwise it will lead to crashes. Fixes: a1cbb15c1397 ("ixgbe: Add macvlan support for VF") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/ZSADNdIw8zFx1xw2@kadam Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10	net/mlx5e: macsec: use update_pn flag instead of PN comparation	Radu Pirea (NXP OSS)
	When updating the SA, use the new update_pn flags instead of comparing the new PN with the initial one. Comparing the initial PN value with the new value will allow the user to update the SA using the initial PN value as a parameter like this: $ ip macsec add macsec0 tx sa 0 pn 1 on key 00 \ ead3664f508eb06c40ac7104cdae4ce5 $ ip macsec set macsec0 tx sa 0 pn 1 off Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10	octeontx2-pf: mcs: update PN only when update_pn is true	Radu Pirea (NXP OSS)
	When updating SA, update the PN only when the update_pn flag is true. Otherwise, the PN will be reset to its previous value using the following command and this should not happen: $ ip macsec set macsec0 tx sa 0 on Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-08	ice: block default rule setting on LAG interface	Michal Swiatkowski
	When one of the LAG interfaces is in switchdev mode, setting default rule can't be done. The interface on which switchdev is running has ice_set_rx_mode() blocked to avoid default rule adding (and other rules). The other interfaces (without switchdev running but connected via bond with interface that runs switchdev) can't follow the same scheme, because rx filtering needs to be disabled when failover happens. Notification for bridge to set promisc mode seems like good place to do that. Fixes: bb52f42acef6 ("ice: Add driver support for firmware changes for LAG") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-08	mlxsw: fix mlxsw_sp2_nve_vxlan_learning_set() return type	Dan Carpenter
	The mlxsw_sp2_nve_vxlan_learning_set() function is supposed to return zero on success or negative error codes. So it needs to be type int instead of bool. Fixes: 4ee70efab68d ("mlxsw: spectrum_nve: Add support for VXLAN on Spectrum-2") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-06	ravb: Fix use-after-free issue in ravb_tx_timeout_work()	Yoshihiro Shimoda
	The ravb_stop() should call cancel_work_sync(). Otherwise, ravb_tx_timeout_work() is possible to use the freed priv after ravb_remove() was called like below: CPU0 CPU1 ravb_tx_timeout() ravb_remove() unregister_netdev() free_netdev(ndev) // free priv ravb_tx_timeout_work() // use priv unregister_netdev() will call .ndo_stop() so that ravb_stop() is called. And, after phy_stop() is called, netif_carrier_off() is also called. So that .ndo_tx_timeout() will not be called after phy_stop(). Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reported-by: Zheng Wang <zyytlz.wz@163.com> Closes: https://lore.kernel.org/netdev/20230725030026.1664873-1-zyytlz.wz@163.com/ Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20231005011201.14368-3-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>