Age | Commit message | Author
2025-07-11 | Merge branch 'net_sched-act-extend-rcu-use-in-dump-methods' | Jakub Kicinski
Eric Dumazet says: ==================== net_sched: act: extend RCU use in dump() methods We are trying to get away from central RTNL in favor of fine-grained mutexes. While looking at net/sched, I found that act already uses RCU in the fast path in most cases, and that RCU could also be used in dump() methods. This series is not complete and will be followed by a second one. v1: https://lore.kernel.org/20250707130110.619822-1-edumazet@google.com ==================== Link: https://patch.msgid.link/20250709090204.797558-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_skbedit: use RCU in tcf_skbedit_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_skbedit_params makes sure there is no discrepancy in tcf_skbedit_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-12-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_police: use RCU in tcf_police_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_police_params makes sure there is no discrepancy in tcf_police_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-11-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_pedit: use RCU in tcf_pedit_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_pedit_params makes sure there is no discrepancy in tcf_pedit_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-10-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_nat: use RCU in tcf_nat_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_nat_params makes sure there is no discrepancy in tcf_nat_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-9-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_mpls: use RCU in tcf_mpls_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_mpls_params makes sure there is no discrepancy in tcf_mpls_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_ctinfo: use RCU in tcf_ctinfo_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_ctinfo_params makes sure there is no discrepancy in tcf_ctinfo_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_ctinfo: use atomic64_t for three counters | Eric Dumazet
Commit 21c167aa0ba9 ("net/sched: act_ctinfo: use percpu stats") missed that stats_dscp_set, stats_dscp_error and stats_cpmark_set might be written (and read) locklessly. Use atomic64_t for these three fields; I doubt act_ctinfo is used heavily on big SMP hosts anyway. Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pedro Tammela <pctammela@mojatatu.com> Link: https://patch.msgid.link/20250709090204.797558-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_ct: use RCU in tcf_ct_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_ct_params makes sure there is no discrepancy in tcf_ct_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_csum: use RCU in tcf_csum_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_csum_params makes sure there is no discrepancy in tcf_csum_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act_connmark: use RCU in tcf_connmark_dump() | Eric Dumazet
Also storing tcf_action into struct tcf_connmark_parms makes sure there is no discrepancy in tcf_connmark_act(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net_sched: act: annotate data-races in tcf_lastuse_update() and tcf_tm_dump() | Eric Dumazet
tcf_tm_dump() reads fields that can be changed concurrently, and tcf_lastuse_update() might race against itself. Add READ_ONCE() and WRITE_ONCE() annotations. Fetch jiffies once in tcf_tm_dump(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250709090204.797558-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
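The annotation pattern described in this commit can be approximated in userspace. The following is a minimal sketch, not the kernel code: C11 relaxed atomics stand in for READ_ONCE()/WRITE_ONCE(), and `struct tm_info`, `lastuse_update()` and `idle_ticks()` are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a timestamp shared between the packet
 * fast path (writer) and a dump routine (reader). */
struct tm_info {
    _Atomic unsigned long lastuse;  /* tick of the most recent use */
};

/* Writer: may run concurrently on several CPUs, so both the load and
 * the store are marked. The store is skipped when the value would not
 * change. Relaxed atomics approximate READ_ONCE()/WRITE_ONCE() here. */
static void lastuse_update(struct tm_info *tm, unsigned long now)
{
    if (atomic_load_explicit(&tm->lastuse, memory_order_relaxed) != now)
        atomic_store_explicit(&tm->lastuse, now, memory_order_relaxed);
}

/* Reader: the caller samples the clock once and passes it in, so every
 * value derived in one dump is computed against the same "now". */
static unsigned long idle_ticks(struct tm_info *tm, unsigned long now)
{
    unsigned long last =
        atomic_load_explicit(&tm->lastuse, memory_order_relaxed);

    return now - last;
}
```

Passing `now` as a parameter mirrors the "fetch jiffies once" remark: a dump that computes several deltas reuses one clock sample instead of re-reading a value that may advance between reads.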
2025-07-11 | eth: fbnic: fix ubsan complaints about OOB accesses | Jakub Kicinski
UBSAN complains that we reach beyond the end of the log entry: UBSAN: array-index-out-of-bounds in drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c:94:50 index 71 is out of range for type 'char [*]' Call Trace: <TASK> ubsan_epilogue+0x5/0x2b fbnic_fw_log_write+0x120/0x960 fbnic_fw_parse_logs+0x161/0x210 We're just taking the address of the character after the array, so this really seems like something that should be legal. But whatever, easy enough to silence by doing direct pointer math. Fixes: c2b93d6beca8 ("eth: fbnic: Create ring buffer for firmware logs") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250709205910.3107691-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
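The "direct pointer math" mentioned above can be illustrated with a small sketch. This is not the fbnic code: `struct log_entry` and `entry_end()` are hypothetical, and the assumption is that the flagged expression had the indexing form `&e->msg[e->len]`.

```c
#include <stddef.h>

/* Hypothetical log entry with a trailing flexible array member. */
struct log_entry {
    size_t len;   /* number of valid bytes in msg */
    char msg[];   /* message text, len bytes */
};

/* Return the address one past the last message byte.
 *
 * An indexing form such as &e->msg[e->len] names the same address, but
 * a bounds sanitizer can treat the subscript itself as out of range.
 * Adding the length to the array's base pointer produces the
 * one-past-the-end pointer without any subscript to check. */
static const char *entry_end(const struct log_entry *e)
{
    return e->msg + e->len;
}
```

A one-past-the-end pointer like this may be compared and subtracted but must not be dereferenced.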
2025-07-11 | virtio_net: simplify tx queue wake condition check | Liming Wu
Consolidate the two nested if conditions that check whether to wake the tx queue into a single combined condition. This improves code readability without changing functionality. Also move netif_tx_wake_queue() into the if condition to avoid unnecessary checks for stopped queues. Signed-off-by: Liming Wu <liming.wu@jaguarmicro.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250710023208.846-1-liming.wu@jaguarmicro.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | selftests/tc-testing: Add tests for restrictions on netem duplication | William Liu
Ensure that a duplicating netem cannot exist in a tree with other netems in both qdisc addition and change. This is meant to prevent the soft lockup and OOM loop scenario discussed in [1]. Also adjust a HFSC's re-entrancy test case with netem for this new restriction - KASAN still triggers upon its failure. [1] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/ Signed-off-by: William Liu <will@willsroot.io> Reviewed-by: Savino Dicanosa <savy@syst3mfailure.io> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20250708164219.875521-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net/sched: Restrict conditions for adding duplicating netems to qdisc tree | William Liu
netem_enqueue's duplication prevention logic breaks when a netem resides in a qdisc tree with other netems - this can lead to a soft lockup and OOM loop in netem_dequeue, as seen in [1]. Ensure that a duplicating netem cannot exist in a tree with other netems. Previous approaches suggested in discussions in chronological order: 1) Track duplication status or ttl in the sk_buff struct. Considered too specific a use case to extend such a struct, though this would be a resilient fix and address other previous and potential future DOS bugs like the one described in loopy fun [2]. 2) Restrict netem_enqueue recursion depth like in act_mirred with a per cpu variable. However, netem_dequeue can call enqueue on its child, and the depth restriction could be bypassed if the child is a netem. 3) Use the same approach as in 2, but add metadata in netem_skb_cb to handle the netem_dequeue case and track a packet's involvement in duplication. This is an overly complex approach, and Jamal notes that the skb cb can be overwritten to circumvent this safeguard. 4) Prevent the addition of a netem to a qdisc tree if its ancestral path contains a netem. However, filters and actions can cause a packet to change paths when re-enqueued to the root from netem duplication, leading us to the current solution: prevent a duplicating netem from inhabiting the same tree as other netems. 
[1] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/ [2] https://lwn.net/Articles/719297/ Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication") Reported-by: William Liu <will@willsroot.io> Reported-by: Savino Dicanosa <savy@syst3mfailure.io> Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20250708164141.875402-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc6-2). No conflicts. Adjacent changes: drivers/net/wireless/mediatek/mt76/mt7925/mcu.c c701574c5412 ("wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan") b3a431fe2e39 ("wifi: mt76: mt7925: fix off by one in mt7925_mcu_hw_scan()") drivers/net/wireless/mediatek/mt76/mt7996/mac.c 62da647a2b20 ("wifi: mt76: mt7996: Add MLO support to mt7996_tx_check_aggr()") dc66a129adf1 ("wifi: mt76: add a wrapper for wcid access with validation") drivers/net/wireless/mediatek/mt76/mt7996/main.c 3dd6f67c669c ("wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()") 8989d8e90f5f ("wifi: mt76: mt7996: Do not set wcid.sta to 1 in mt7996_mac_sta_event()") net/mac80211/cfg.c 58fcb1b4287c ("wifi: mac80211: reject VHT opmode for unsupported channel widths") 037dc18ac3fb ("wifi: mac80211: add support for storing station S1G capabilities") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge tag 'net-6.16-rc6-2' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull more networking fixes from Jakub Kicinski "Big chunk of fixes for WiFi, Johannes says probably the last for the release. The Netlink fixes (on top of the tree) restore operation of iw (WiFi CLI) which uses sillily small recv buffer, and is the reason for this 'emergency PR'. The GRE multicast fix also stands out among the user-visible regressions. Current release - fix to a fix: - netlink: make sure we always allow at least one skb to be queued, even if the recvbuf is (mis)configured to be tiny Previous releases - regressions: - gre: fix IPv6 multicast route creation Previous releases - always broken: - wifi: prevent A-MSDU attacks in mesh networks - wifi: cfg80211: fix S1G beacon head validation and detection - wifi: mac80211: - always clear frame buffer to prevent stack leak in cases which hit a WARN() - fix monitor interface in device restart - wifi: mwifiex: discard erroneous disassoc frames on STA interface - wifi: mt76: - prevent null-deref in mt7925_sta_set_decap_offload() - add missing RCU annotations, and fix sleep in atomic - fix decapsulation offload - fixes for scanning - phy: microchip: improve link establishment and reset handling - eth: mlx5e: fix race between DIM disable and net_dim() - bnxt_en: correct DMA unmap len for XDP_REDIRECT" * tag 'net-6.16-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) netlink: make sure we allow at least one dump skb netlink: Fix rmem check in netlink_broadcast_deliver(). 
bnxt_en: Set DMA unmap len correctly for XDP_REDIRECT bnxt_en: Flush FW trace before copying to the coredump bnxt_en: Fix DCB ETS validation net: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam() net/mlx5e: Add new prio for promiscuous mode net/mlx5e: Fix race between DIM disable and net_dim() net/mlx5: Reset bw_share field when changing a node's parent can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level selftests: net: lib: fix shift count out of range selftests: Add IPv6 multicast route generation tests for GRE devices. gre: Fix IPv6 multicast route creation. net: phy: microchip: limit 100M workaround to link-down events on LAN88xx net: phy: microchip: Use genphy_soft_reset() to purge stale LPA bits ibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof net: appletalk: Fix device refcount leak in atrtr_create() netfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto() wifi: mac80211: add the virtual monitor after reconfig complete wifi: mac80211: always initialize sdata::key_list ...
2025-07-11 | Merge tag 'gpio-fixes-for-v6.16-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix performance regression when setting values of multiple GPIO lines at once - make sure the GPIO OF xlate code doesn't end up passing an uninitialized local variable to GPIO core - update MAINTAINERS * tag 'gpio-fixes-for-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: MAINTAINERS: remove bouncing address for Nandor Han gpio: of: initialize local variable passed to the .of_xlate() callback gpiolib: fix performance regression when using gpio_chip_get_multiple()
2025-07-11 | selftests: drv-net: Add bpftool util | Mohsin Bashir
Add bpf utility to simplify the use of bpftool for XDP tests included in this series. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250710184351.63797-2-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge tag 'pm-6.16-rc6' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a coding mistake in a previous fix related to system suspend and hibernation merged recently" * tag 'pm-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: Call pm_restore_gfp_mask() after dpm_resume()
2025-07-11 | Merge tag 'dma-mapping-6.16-2025-07-11' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping fix from Marek Szyprowski: - small fix relevant to arm64 server and custom CMA configuration (Feng Tang) * tag 'dma-mapping-6.16-2025-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-contiguous: hornor the cma address limit setup by user
2025-07-11 | netlink: make sure we allow at least one dump skb | Jakub Kicinski
The commit under Fixes tightened up the memory accounting for Netlink sockets. It looks like the accounting is too strict for some existing use cases; Marek reported issues with nl80211 / WiFi iw CLI. To reduce the number of iterations, Netlink dumps try to allocate messages based on the size of the buffer passed to previous recvmsg() calls. If user space uses a larger buffer in recvmsg() than sk_rcvbuf, we will allocate an skb we won't be able to queue. Make sure we always allow at least one skb to be queued. The same workaround is already present in netlink_attachskb(). The alternative would be to cap the allocation size to rcvbuf - rmem_alloc but, as I said, the workaround is already present in other places. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/9794af18-4905-46c6-b12c-365ea2f05858@samsung.com Fixes: ae8f160e7eb2 ("netlink: Fix wraparounds of sk->sk_rmem_alloc.") Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711001121.3649033-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
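The "always allow at least one skb" rule can be sketched as a standalone admission check. This is an illustration under stated assumptions, not the kernel's implementation: `may_queue()` and its parameters are hypothetical, and the real code works on atomic counters and skb truesize.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a buffer of skb_size bytes may be queued on a socket
 * that already holds rmem_alloc bytes and has a limit of rcvbuf bytes.
 * All names here are hypothetical. */
static bool may_queue(size_t rmem_alloc, size_t rcvbuf, size_t skb_size)
{
    /* Empty queue: always admit one buffer, even if it alone exceeds
     * the limit. Otherwise a reader with a tiny receive buffer could
     * never make progress. */
    if (rmem_alloc == 0)
        return true;

    /* Non-empty queue: admit only if the new total stays within the
     * limit. Written with a subtraction so the sum cannot wrap. */
    return skb_size <= rcvbuf && rmem_alloc <= rcvbuf - skb_size;
}
```

With a 100-byte limit, an oversized 1000-byte buffer is accepted on an empty queue but rejected once anything is already queued.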
2025-07-11 | netlink: Fix rmem check in netlink_broadcast_deliver(). | Kuniyuki Iwashima
We need to allow queuing at least one skb even when skb is larger than sk->sk_rcvbuf. The cited commit made a mistake while converting a condition in netlink_broadcast_deliver(). Let's correct the rmem check for the allow-one-skb rule. Fixes: ae8f160e7eb24 ("netlink: Fix wraparounds of sk->sk_rmem_alloc.") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711053208.2965945-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge branch 'bnxt_en-3-bug-fixes' | Jakub Kicinski
Michael Chan says: ==================== bnxt_en: 3 bug fixes The first one fixes a possible failure when setting DCB ETS. The second one fixes the ethtool coredump (-W 2) not containing all the FW traces. The third one fixes the DMA unmap length when transmitting XDP_REDIRECT packets. ==================== Link: https://patch.msgid.link/20250710213938.1959625-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | bnxt_en: Set DMA unmap len correctly for XDP_REDIRECT | Somnath Kotur
When transmitting an XDP_REDIRECT packet, call dma_unmap_len_set() with the proper length instead of 0. This bug triggers this warning on a system with IOMMU enabled: WARNING: CPU: 36 PID: 0 at drivers/iommu/dma-iommu.c:842 __iommu_dma_unmap+0x159/0x170 RIP: 0010:__iommu_dma_unmap+0x159/0x170 Code: a8 00 00 00 00 48 c7 45 b0 00 00 00 00 48 c7 45 c8 00 00 00 00 48 c7 45 a0 ff ff ff ff 4c 89 45 b8 4c 89 45 c0 e9 77 ff ff ff <0f> 0b e9 60 ff ff ff e8 8b bf 6a 00 66 66 2e 0f 1f 84 00 00 00 00 RSP: 0018:ff22d31181150c88 EFLAGS: 00010206 RAX: 0000000000002000 RBX: 00000000e13a0000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ff22d31181150cf0 R08: ff22d31181150ca8 R09: 0000000000000000 R10: 0000000000000000 R11: ff22d311d36c9d80 R12: 0000000000001000 R13: ff13544d10645010 R14: ff22d31181150c90 R15: ff13544d0b2bac00 FS: 0000000000000000(0000) GS:ff13550908a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005be909dacff8 CR3: 0008000173408003 CR4: 0000000000f71ef0 PKRU: 55555554 Call Trace: <IRQ> ? show_regs+0x6d/0x80 ? __warn+0x89/0x160 ? __iommu_dma_unmap+0x159/0x170 ? report_bug+0x17e/0x1b0 ? handle_bug+0x46/0x90 ? exc_invalid_op+0x18/0x80 ? asm_exc_invalid_op+0x1b/0x20 ? __iommu_dma_unmap+0x159/0x170 ? __iommu_dma_unmap+0xb3/0x170 iommu_dma_unmap_page+0x4f/0x100 dma_unmap_page_attrs+0x52/0x220 ? srso_alias_return_thunk+0x5/0xfbef5 ? xdp_return_frame+0x2e/0xd0 bnxt_tx_int_xdp+0xdf/0x440 [bnxt_en] __bnxt_poll_work_done+0x81/0x1e0 [bnxt_en] bnxt_poll+0xd3/0x1e0 [bnxt_en] Fixes: f18c2b77b2e4 ("bnxt_en: optimized XDP_REDIRECT support") Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250710213938.1959625-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | bnxt_en: Flush FW trace before copying to the coredump | Shruti Parab
bnxt_fill_drv_seg_record() calls bnxt_dbg_hwrm_log_buffer_flush() to flush the FW trace buffer. This needs to be done before we call bnxt_copy_ctx_mem() to copy the trace data. Without this fix, the coredump may not contain all the FW traces. Fixes: 3c2179e66355 ("bnxt_en: Add FW trace coredump segments to the coredump") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250710213938.1959625-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | bnxt_en: Fix DCB ETS validation | Shravya KN
In bnxt_ets_validate(), the code incorrectly loops over all possible traffic classes to check and add the ETS settings. Fix it to loop over the configured traffic classes only. The unconfigured traffic classes will default to TSA_ETS with 0 bandwidth. Looping over these unconfigured traffic classes may cause the validation to fail and trigger this error message: "rejecting ETS config starving a TC\n" The .ieee_setets() will then fail. Fixes: 7df4ae9fe855 ("bnxt_en: Implement DCBNL to support host-based DCBX.") Reviewed-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Shravya KN <shravya.k-n@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250710213938.1959625-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
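Why the loop bound matters can be shown with a simplified validator. This is a sketch under assumptions, not the bnxt_en code: `struct ets_cfg`, `ets_validate()`, `MAX_TCS` and the enum values are hypothetical, chosen so that a zero-initialized slot reads as ETS with 0 bandwidth, as the commit describes for unconfigured traffic classes.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_TCS 8  /* hypothetical upper bound on traffic classes */

/* Hypothetical values: 0 means ETS so that untouched slots default to
 * "ETS with 0 bandwidth". */
enum tsa { TSA_ETS = 0, TSA_STRICT = 1 };

struct ets_cfg {
    uint8_t tsa[MAX_TCS];  /* scheduling algorithm per traffic class */
    uint8_t bw[MAX_TCS];   /* bandwidth percentage per traffic class */
};

/* Validate only the first num_tc (configured) traffic classes. */
static int ets_validate(const struct ets_cfg *ets, int num_tc)
{
    int total = 0;

    if (num_tc < 0 || num_tc > MAX_TCS)
        return -EINVAL;

    for (int i = 0; i < num_tc; i++) {
        if (ets->tsa[i] != TSA_ETS)
            continue;
        if (ets->bw[i] == 0)
            return -EINVAL;  /* an ETS class with no bandwidth starves */
        total += ets->bw[i];
    }

    return total > 100 ? -EINVAL : 0;
}
```

With two configured classes at 60% and 40%, validation succeeds for `num_tc = 2`; passing `MAX_TCS` instead walks into the unconfigured slots and fails on the first zero-bandwidth ETS entry, which is the failure mode the fix avoids.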
2025-07-11 | net: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam() | Alok Tiwari
The function ll_temac_ethtools_set_ringparam() incorrectly checked rx_pending twice, once correctly for RX and once mistakenly in place of tx_pending. This caused tx_pending to be left unchecked against TX_BD_NUM_MAX. As a result, invalid TX ring sizes may have been accepted or valid ones wrongly rejected based on the RX limit, leading to potential misconfiguration or unexpected results. This patch corrects the condition to properly validate tx_pending. Fixes: f7b261bfc35e ("net: ll_temac: Make RX/TX ring sizes configurable") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20250710180621.2383000-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
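The corrected validation can be sketched as follows. This is illustrative only: `struct ring_request` and `validate_ring()` are hypothetical, `RX_BD_NUM_MAX` is assumed by analogy with the `TX_BD_NUM_MAX` named in the commit, and the numeric limits are placeholders picked to differ so that the copy-paste mistake would be observable.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder limits, not the driver's actual values. */
#define RX_BD_NUM_MAX 4096
#define TX_BD_NUM_MAX 1024

/* Hypothetical subset of a ring-size request. */
struct ring_request {
    uint32_t rx_pending;
    uint32_t tx_pending;
};

/* Each ring size is checked against its own limit. The bug described
 * above corresponds to testing rx_pending in the second condition too,
 * which leaves tx_pending unchecked against TX_BD_NUM_MAX. */
static int validate_ring(const struct ring_request *r)
{
    if (r->rx_pending > RX_BD_NUM_MAX)
        return -EINVAL;
    if (r->tx_pending > TX_BD_NUM_MAX)
        return -EINVAL;
    return 0;
}
```

With these placeholder limits, a request of `rx_pending = 100, tx_pending = 2000` is rejected, whereas a check that tested `rx_pending` twice would have let it through.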
2025-07-11 | Merge branch 'mlx5-misc-fixes-2025-07-10' | Jakub Kicinski
Tariq Toukan says: ==================== mlx5 misc fixes 2025-07-10 This small patchset provides misc bug fixes from the team to the mlx5 core and EN drivers. ==================== Link: https://patch.msgid.link/1752155624-24095-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net/mlx5e: Add new prio for promiscuous mode | Jianbo Liu
An optimization for promiscuous mode adds a high-priority steering table with a single catch-all rule to steer all traffic directly to the TTC table. However, a gap exists between the creation of this table and the insertion of the catch-all rule. Packets arriving in this brief window would miss because no rule was inserted yet, unnecessarily incrementing the 'rx_steer_missed_packets' counter and being dropped. This patch resolves the issue by introducing a new prio for this table, placing it between MLX5E_TC_PRIO and MLX5E_NIC_PRIO. By doing so, packets arriving during the window now fall through to the next prio (at MLX5E_NIC_PRIO) instead of being dropped. Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1752155624-24095-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net/mlx5e: Fix race between DIM disable and net_dim() | Carolina Jubran
There's a race between disabling DIM and NAPI callbacks using the dim pointer on the RQ or SQ. If NAPI checks the DIM state bit and sees it still set, it assumes `rq->dim` or `sq->dim` is valid. But if DIM gets disabled right after that check, the pointer might already be set to NULL, leading to a NULL pointer dereference in net_dim(). Fix this by calling `synchronize_net()` before freeing the DIM context. This ensures all in-progress NAPI callbacks are finished before the pointer is cleared. Kernel log: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:net_dim+0x23/0x190 ... Call Trace: <TASK> ? __die+0x20/0x60 ? page_fault_oops+0x150/0x3e0 ? common_interrupt+0xf/0xa0 ? sysvec_call_function_single+0xb/0x90 ? exc_page_fault+0x74/0x130 ? asm_exc_page_fault+0x22/0x30 ? net_dim+0x23/0x190 ? mlx5e_poll_ico_cq+0x41/0x6f0 [mlx5_core] ? sysvec_apic_timer_interrupt+0xb/0x90 mlx5e_handle_rx_dim+0x92/0xd0 [mlx5_core] mlx5e_napi_poll+0x2cd/0xac0 [mlx5_core] ? mlx5e_poll_ico_cq+0xe5/0x6f0 [mlx5_core] busy_poll_stop+0xa2/0x200 ? mlx5e_napi_poll+0x1d9/0xac0 [mlx5_core] ? mlx5e_trigger_irq+0x130/0x130 [mlx5_core] __napi_busy_loop+0x345/0x3b0 ? sysvec_call_function_single+0xb/0x90 ? asm_sysvec_call_function_single+0x16/0x20 ? sysvec_apic_timer_interrupt+0xb/0x90 ? pcpu_free_area+0x1e4/0x2e0 napi_busy_loop+0x11/0x20 xsk_recvmsg+0x10c/0x130 sock_recvmsg+0x44/0x70 __sys_recvfrom+0xbc/0x130 ? __schedule+0x398/0x890 __x64_sys_recvfrom+0x20/0x30 do_syscall_64+0x4c/0x100 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ... ---[ end trace 0000000000000000 ]--- ... 
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fixes: 445a25f6e1a2 ("net/mlx5e: Support updating coalescing configuration without resetting channels") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1752155624-24095-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | net/mlx5: Reset bw_share field when changing a node's parent | Carolina Jubran
When changing a node's parent, its scheduling element is destroyed and re-created with bw_share 0. However, the node's bw_share field was not updated accordingly. Set the node's bw_share to 0 after re-creation to keep the software state in sync with the firmware configuration. Fixes: 9c7bbf4c3304 ("net/mlx5: Add support for setting parent of nodes") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1752155624-24095-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge tag 'linux-can-fixes-for-6.16-20250711' of ↵ | Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2025-07-11 Sean Nyekjaer's patch targets the m_can driver and demotes the "msg lost in rx" message to debug level to prevent flooding the kernel log with error messages. * tag 'linux-can-fixes-for-6.16-20250711' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level ==================== Link: https://patch.msgid.link/20250711102451.2828802-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | Merge branch 'hv-msi-parent-domain' into main | David S. Miller
Nam Cao says: ==================== Subject: [PATCH for-netdev v2 0/2] PCI: hv: MSI parent domain conversion This series originally belongs to a bigger series sent to PCI tree: https://lore.kernel.org/linux-pci/024f0122314198fe0a42fef01af53e8953a687ec.1750858083.git.namcao@linutronix.de/ However, during review, we noticed that the patch conflicts with another patch in netdev tree: https://lore.kernel.org/netdev/1749651015-9668-1-git-send-email-shradhagupta@linux.microsoft.com/ As this series has no dependency with the rest of the series, we think it is best to split out this one and send it to netdev, to avoid conflict resolution headache later on. Can netdev maintainers please pick it up? ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-07-11 | PCI: hv: Switch to msi_create_parent_irq_domain() | Nam Cao
Move away from the legacy MSI domain setup, switch to use msi_create_parent_irq_domain(). While doing the conversion, I noticed that hv_compose_msi_msg() is doing more than it is supposed to (composing message). This function also allocates and populates struct tran_int_desc, which should be done in hv_pcie_domain_alloc() instead. It works, but it is not the correct design. However, I have no hardware to test such change, therefore I leave a TODO note. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-07-11 | irqdomain: Export irq_domain_free_irqs_top() | Nam Cao
Export irq_domain_free_irqs_top(), making it usable for drivers compiled as modules. Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-07-11 | can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to ↵ | Sean Nyekjaer
debug level Downgrade the "msg lost in rx" message to debug level, to prevent flooding the kernel log with error messages. Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Sean Nyekjaer <sean@geanix.com> Link: https://patch.msgid.link/20250711-mcan_ratelimit-v3-1-7413e8e21b84@geanix.com [mkl: enhance commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-07-11 | MAINTAINERS: remove bouncing address for Nandor Han | Bartosz Golaszewski
Nandor's address has been bouncing for some time now. Remove it from MAINTAINERS. The affected driver falls under the wider umbrella of GPIO modules. Link: https://lore.kernel.org/r/20250709071825.16212-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-07-10 | Merge tag 'nf-next-25-07-10' of ↵ | Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next (v2) The following series contains an initial small batch of Netfilter updates for net-next: 1) Remove DCCP conntrack support, keep DCCP matches around in order to avoid breakage when loading ruleset, add Kconfig to wrap the code so it can be disabled by distributors. 2) Remove buggy code aiming at shrinking netlink deletion event, then re-add it correctly in another patch. This is to prevent -stable to pick up on a fix that breaks old userspace. From Phil Sutter. 3) Missing WARN_ON_ONCE() to check for lockdep_commit_lock_is_held() to uncover bugs. From Fedor Pchelkin. * tag 'nf-next-25-07-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: adjust lockdep assertions handling netfilter: nf_tables: Reintroduce shortened deletion notifications netfilter: nf_tables: Drop dead code from fill_*_info routines netfilter: conntrack: remove DCCP protocol support ==================== Link: https://patch.msgid.link/20250710010706.2861281-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 | Merge branch 'net-ftgmac100-add-soc-reset-support-for-rmii-mode' | Jakub Kicinski
Jacky Chou says: ==================== net: ftgmac100: Add SoC reset support for RMII mode This patch series adds support for an optional reset line to the ftgmac100 ethernet controller, as used on Aspeed SoCs. On these SoCs, the internal MAC reset is not sufficient to reset the RMII interface. By providing a SoC-level reset via the device tree "resets" property, the driver can properly reset both the MAC and RMII logic, ensuring correct operation in RMII mode. The series includes: - Device tree binding update to document the new "resets" property. - Addition of MAC1/2/3/4 reset definitions for AST2600. - Driver changes to assert/deassert the reset line as needed. This improves reliability and initialization of the MAC in RMII mode on Aspeed platforms. ==================== Link: https://patch.msgid.link/20250709070809.2560688-1-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 | net: ftgmac100: Add optional reset control for RMII mode on Aspeed SoCs | Jacky Chou
On Aspeed SoCs, the internal MAC reset is insufficient to fully reset the RMII interface; only the SoC-level reset line can properly reset the RMII logic. This patch adds support for an optional "resets" property in the device tree, allowing the driver to assert and deassert the SoC reset line when operating in RMII mode. This ensures the MAC and RMII interface are correctly reset and initialized. Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250709070809.2560688-5-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 dt-bindings: clock: ast2600: Add reset definitions for MAC1 and MAC2 Jacky Chou
Add ASPEED_RESET_MAC1 and ASPEED_RESET_MAC2 reset definitions to the ast2600-clock binding header. These are required for proper reset control of the MAC1 and MAC2 ethernet controllers on the AST2600 SoC. Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Link: https://patch.msgid.link/20250709070809.2560688-3-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 dt-bindings: net: ftgmac100: Add resets property Jacky Chou
In the Aspeed AST2600 design, the MAC internal reset in the MAC register cannot fully reset the RMII interface, which may leave the RMII logic incompletely reset. Therefore, add a resets property so that the SoC-level reset line resets the whole MAC function, including ftgmac, RGMII and RMII. Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250709070809.2560688-2-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 selftests: net: lib: fix shift count out of range Hangbin Liu
I got the following warning when writing other tests:

+ handle_test_result_pass 'bond 802.3ad' '(lacp_active off)'
+ local 'test_name=bond 802.3ad'
+ shift
+ local 'opt_str=(lacp_active off)'
+ shift
+ log_test_result 'bond 802.3ad' '(lacp_active off)' ' OK '
+ local 'test_name=bond 802.3ad'
+ shift
+ local 'opt_str=(lacp_active off)'
+ shift
+ local 'result= OK '
+ shift
+ local retmsg=
+ shift
/net/tools/testing/selftests/net/forwarding/../lib.sh: line 315: shift: shift count out of range

This happens because an extra shift is executed even after all arguments have been consumed. Remove the last shift in log_test_result() to avoid this warning.

Fixes: a923af1ceee7 ("selftests: forwarding: Convert log_test() to recognize RET values")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250709091244.88395-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
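The failure mode in this commit can be reproduced in isolation. The following is a minimal sketch, not the actual lib.sh code: both function names are hypothetical, and the argument layout is only inferred from the trace. The shell reports a failed shift through its exit status; whether it also prints a diagnostic depends on the shell and its options (bash does so when the shift_verbose option is set).

```shell
# Hypothetical helper mirroring the trace: one shift after every argument,
# including the last one.
consume_with_trailing_shift()
{
	local test_name="$1"; shift
	local opt_str="$1"; shift
	local result="$1"; shift
	local retmsg="${1:-}"
	shift	# fails when the caller passed only three arguments
}

# Hypothetical fixed variant: the last argument is read without shifting
# past it, so an omitted retmsg is harmless.
consume_without_trailing_shift()
{
	local test_name="$1"; shift
	local opt_str="$1"; shift
	local result="$1"; shift
	local retmsg="${1:-}"
}
```

Called with three arguments, as in the trace, the first helper returns a non-zero status from its final shift, while the second returns success. With four arguments both succeed, which is why the extra shift can go unnoticed until a caller omits the optional message.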
2025-07-10 Merge branch 'gre-fix-default-ipv6-multicast-route-creation' Jakub Kicinski
Guillaume Nault says: ==================== gre: Fix default IPv6 multicast route creation. When fixing IPv6 link-local address generation on GRE devices with commit 3e6a0243ff00 ("gre: Fix again IPv6 link-local address generation."), I accidentally broke the default IPv6 multicast route creation on these GRE devices. Fix that in patch 1, bringing the GRE-specific code a bit closer to the generic code used by most other network interface types. Then extend the selftest in patch 2 to cover this case. ==================== Link: https://patch.msgid.link/cover.1752070620.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 selftests: Add IPv6 multicast route generation tests for GRE devices. Guillaume Nault
The previous patch fixes a bug that prevented the creation of the default IPv6 multicast route (ff00::/8) for some GRE devices. Now let's extend the GRE IPv6 selftests to cover this case. Also, rename check_ipv6_ll_addr() to check_ipv6_device_config() and adapt comments and script output to take into account the fact that we're not limited to link-local address generation. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/65a89583bde3bf866a1922c2e5158e4d72c520e2.1752070620.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 gre: Fix IPv6 multicast route creation. Guillaume Nault
Use addrconf_add_dev() instead of ipv6_find_idev() in addrconf_gre_config() so that we don't just get the inet6_dev, but also install the default ff00::/8 multicast route. Before commit 3e6a0243ff00 ("gre: Fix again IPv6 link-local address generation."), the multicast route was created at the end of the function by addrconf_add_mroute(). But this code path is now only taken in one particular case (gre devices not bound to a local IP address and in EUI64 mode). For all other cases, the function exits early and addrconf_add_mroute() is no longer called. Using addrconf_add_dev() instead of ipv6_find_idev() in addrconf_gre_config() fixes the problem, as it creates the default multicast route for all gre devices. This also brings addrconf_gre_config() a bit closer to the normal netdevice IPv6 configuration code (addrconf_dev_config()). Cc: stable@vger.kernel.org Fixes: 3e6a0243ff00 ("gre: Fix again IPv6 link-local address generation.") Reported-by: Aiden Yang <ling@moedove.com> Closes: https://lore.kernel.org/netdev/CANR=AhRM7YHHXVxJ4DmrTNMeuEOY87K2mLmo9KMed1JMr20p6g@mail.gmail.com/ Reviewed-by: Gary Guo <gary@garyguo.net> Tested-by: Gary Guo <gary@garyguo.net> Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/027a923dcb550ad115e6d93ee8bb7d310378bd01.1752070620.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 Merge branch 'net-phy-microchip-lan88xx-reliability-fixes' Jakub Kicinski
Oleksij Rempel says: ==================== net: phy: microchip: LAN88xx reliability fixes This patch series improves the reliability of the Microchip LAN88xx PHYs, particularly in edge cases involving fixed link configurations or forced speed modes. Patch 1 assigns genphy_soft_reset() to the .soft_reset hook to ensure that stale link partner advertisement (LPA) bits are properly cleared during reconfiguration. Without this, outdated autonegotiation bits may remain visible in some parallel detection cases. Patch 2 restricts the 100 Mbps workaround (originally intended to handle cable length switching) to only run when the link transitions to the PHY_NOLINK state. This prevents repeated toggling that can confuse autonegotiating link partners such as the Intel i350, leading to unstable link cycles. Both patches were tested on a LAN7850 (with integrated LAN88xx PHY) against an Intel I350 NIC. The full test suite - autonegotiation, fixed link, and parallel detection - passed successfully. ==================== Link: https://patch.msgid.link/20250709130753.3994461-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 net: phy: microchip: limit 100M workaround to link-down events on LAN88xx Oleksij Rempel
Restrict the 100Mbit forced-mode workaround to link-down transitions only, to prevent repeated link reset cycles in certain configurations. The workaround was originally introduced to improve signal reliability when switching cables between long and short distances. It temporarily forces the PHY into 10 Mbps before returning to 100 Mbps. However, when used with autonegotiating link partners (e.g., Intel i350), executing this workaround on every link change can confuse the partner and cause constant renegotiation loops. This results in repeated link down/up transitions and the PHY never reaching a stable state. Limit the workaround to only run during the PHY_NOLINK state. This ensures it is triggered only once per link drop, avoiding disruptive toggling while still preserving its intended effect. Note: I am not able to reproduce the original issue that this workaround addresses. I can only confirm that 100 Mbit mode works correctly in my test setup. Based on code inspection, I assume the workaround aims to reset some internal state machine or signal block by toggling speeds. However, a PHY reset is already performed earlier in the function via phy_init_hw(), which may achieve a similar effect. Without a reproducer, I conservatively keep the workaround but restrict its conditions. Fixes: e57cf3639c32 ("net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250709130753.3994461-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>