summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-13net: cn23xx: fix typosJanik Haag
This patch fixes a few typos, spelling mistakes, and a bit of grammar, increasing the comments readability. Signed-off-by: Janik Haag <janik@aq0.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307145648.1679912-2-janik@aq0.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13net: hns3: use string choices helperJian Shen
Use string choices helper for better readability. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307113733.819448-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-12netdevsim: 'support' multi-buf XDPJakub Kicinski
Don't error out on large MTU if XDP is multi-buf. The ping test now tests ping with XDP and high MTU. netdevsim doesn't actually run the prog (yet?) so it doesn't matter if the prog was multi-buf.. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/20250311092820.542148-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12Merge branch 'net-remove-rtnl_lock-from-the-callers-of-queue-apis'Jakub Kicinski
Stanislav Fomichev says: ==================== net: remove rtnl_lock from the callers of queue APIs All drivers that use queue management APIs already depend on the netdev lock. Ultimately, we want to have most of the paths that work with specific netdev to be rtnl_lock-free (ethtool mostly in particular). Queue API currently has a much smaller API surface, so start with rtnl_lock from it: - add mutex to each dmabuf binding (to replace rtnl_lock) - move netdev lock management to the callers of netdev_rx_queue_restart and drop rtnl_lock ==================== Link: https://patch.msgid.link/20250311144026.4154277-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12net: drop rtnl_lock for queue_mgmt operationsStanislav Fomichev
All drivers that use queue API are already converted to use netdev instance lock. Move netdev instance lock management to the netlink layer and drop rtnl_lock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry. <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-4-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12net: add granular lock for the netdev netlink socketStanislav Fomichev
As we move away from rtnl_lock for queue ops, introduce per-netdev_nl_sock lock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12net: create netdev_nl_sock to wrap bindings listStanislav Fomichev
No functional changes. Next patches will add more granular locking to netdev_nl_sock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12net/mlx5: Avoid unnecessary use of comma operatorSimon Horman
Although it does not seem to have any untoward side-effects, the use of ';' to separate to assignments seems more appropriate than ','. Flagged by clang-19 -Wcomma No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250307-mlx5-comma-v1-1-934deb6927bb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12selftests: net: bump GRO timeout for gro/setup_vethJakub Kicinski
Commit 51bef03e1a71 ("selftests/net: deflake GRO tests") recently switched to NAPI suspension, and lowered the timeout from 1ms to 100us. This started causing flakes in netdev-run CI. Let's bump it to 200us. In a quick test of a debug kernel I see failures with 100us, with 200us in 5 runs I see 2 completely clean runs and 3 with a single retry (GRO test will retry up to 5 times). Reviewed-by: Kevin Krakauer <krakauer@google.com> Link: https://patch.msgid.link/20250310110821.385621-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12eth: bnxt: add missing netdev lock management to bnxt_dl_reload_upStanislav Fomichev
bnxt_dl_reload_up is completely missing instance lock management which can result in `devlink dev reload` leaving with instance lock held. Add the missing calls. Also add netdev_assert_locked to make it clear that the up() method is running with the instance lock grabbed. v2: - add net/netdev_lock.h include to bnxt_devlink.c for netdev_assert_locked Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12eth: bnxt: request unconditional ops lockStanislav Fomichev
netdev_lock_ops conditionally grabs instance lock when queue_mgmt_ops is defined. However queue_mgmt_ops support is signaled via FW so we can sometimes boot without queue_mgmt_ops being set. This will result in bnxt running without instance lock which the driver now heavily depends on. Set request_ops_lock to true unconditionally to always request netdev instance lock. Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12eth: bnxt: switch to netif_closeStanislav Fomichev
All (error) paths that call dev_close are already holding instance lock, so switch to netif_close to avoid the deadlock. v2: - add missing EXPORT_MODULE for netif_close Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FTStanislav Fomichev
There is a couple of places from which we can arrive to ndo_setup_tc with TC_SETUP_BLOCK/TC_SETUP_FT: - netlink - netlink notifier - netdev notifier Locking netdev too deep in this call chain seems to be problematic (especially assuming some/all of the call_netdevice_notifiers NETDEV_UNREGISTER) might soon be running with the instance lock). Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT framework already takes care of most of the locking. Document the assumptions. ndo_setup_tc TC_SETUP_BLOCK nft_block_offload_cmd nft_chain_offload_cmd nft_flow_block_chain nft_flow_offload_chain nft_flow_rule_offload_abort nft_flow_rule_offload_commit nft_flow_rule_offload_commit nf_tables_commit nfnetlink_rcv_batch nfnetlink_rcv_skb_batch nfnetlink_rcv nft_offload_netdev_event NETDEV_UNREGISTER notifier ndo_setup_tc TC_SETUP_FT nf_flow_table_offload_cmd nf_flow_table_offload_setup nft_unregister_flowtable_hook nft_register_flowtable_net_hooks nft_flowtable_update nf_tables_newflowtable nfnetlink_rcv_batch (.call NFNL_CB_BATCH) nft_flowtable_update nf_tables_newflowtable nft_flowtable_event nf_tables_flowtable_event NETDEV_UNREGISTER notifier __nft_unregister_flowtable_net_hooks nft_unregister_flowtable_net_hooks nf_tables_commit nfnetlink_rcv_batch (.call NFNL_CB_BATCH) __nf_tables_abort nf_tables_abort nfnetlink_rcv_batch __nft_release_hook __nft_release_hooks nf_tables_pre_exit_net -> module unload nft_rcv_nl_event netlink_register_notifier (oh boy) nft_register_flowtable_net_hooks nft_flowtable_update nf_tables_newflowtable nf_tables_newflowtable Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250308044726.1193222-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-11docs: netdev: add a note on selftest postingJakub Kicinski
We haven't had much discussion on the list about this, but a handful of people have been confused about rules on posting selftests for fixes, lately. I tend to post fixes with their respective selftests in the same series. There are tradeoffs around size of the net tree and conflicts but so far it hasn't been a major issue. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250306180533.1864075-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11Merge branch 'net-ti-icssg-prueth-add-native-mode-xdp-support'Paolo Abeni
Meghana Malladi says: ==================== net: ti: icssg-prueth: Add native mode XDP support This series adds native XDP support using page_pool. XDP zero copy support is not included in this patch series. Patch 1/3: Replaces skb with page pool for Rx buffer allocation Patch 2/3: Adds prueth_swdata struct for SWDATA for all swdata cases Patch 3/3: Introduces native mode XDP support v3: https://lore.kernel.org/all/20250224110102.1528552-1-m-malladi@ti.com/ ==================== Link: https://patch.msgid.link/20250305101422.1908370-1-m-malladi@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11net: ti: icssg-prueth: Add XDP supportRoger Quadros
Add native XDP support. We do not support zero copy yet. Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20250305101422.1908370-4-m-malladi@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11net: ti: icssg-prueth: introduce and use prueth_swdata struct for SWDATARoger Quadros
We have different cases for SWDATA (skb, page, cmd, etc) so it is better to have a dedicated data structure for that. We can embed the type field inside the struct and use it to interpret the data in completion handlers. Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20250305101422.1908370-3-m-malladi@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11net: ti: icssg-prueth: Use page_pool API for RX buffer allocationRoger Quadros
This is to prepare for native XDP support. The page pool API is more faster in allocating pages than __alloc_skb(). Drawback is that it works at PAGE_SIZE granularity so we are not efficient in memory usage. i.e. we are using PAGE_SIZE (4KB) memory for 1.5KB max packet size. Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Meghana Malladi <m-malladi@ti.com> Link: https://patch.msgid.link/20250305101422.1908370-2-m-malladi@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11Merge branch 'enic-enable-32-64-byte-cqes-and-get-max-rx-tx-ring-size-from-hw'Paolo Abeni
Satish Kharat via says: ==================== enic: enable 32, 64 byte cqes and get max rx/tx ring size from hw This series enables using the max rx and tx ring sizes read from hw. For newer hw that can be up to 16k entries. This requires bigger completion entries for rx queues. This series enables the use of the 32 and 64 byte completion queues entries for enic rx queues on supported hw versions. This is in addition to the exiting (default) 16 byte rx cqes. Signed-off-by: Satish Kharat <satishkh@cisco.com> ==================== Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-0-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: get max rq & wq entries supported by hw, 16K queuesSatish Kharat
Enables reading the max rq and wq entries supported from the hw. Enables 16k rq and wq entries on hw that supports. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-8-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: cleanup of enic wq request completion pathSatish Kharat
Cleans up the enic wq request completion path needed for 16k wq size support. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-7-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: added enic_wq.c and enic_wq.hSatish Kharat
Moves wq related function to enic_wq.c. Prepares for a cleaup of enic wq code path. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-6-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: remove unused function cq_enet_wq_desc_decSatish Kharat
Removes cq_enet_wq_desc_dec, not needed anymore. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-5-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: enable rq extended cq supportSatish Kharat
Enables getting from hw all the supported rq cq sizes and uses the highest supported cq size. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-4-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: enic rq extended cq definesSatish Kharat
Adds the defines for 32 and 64 byte receive queue completion queue descriptors. Adds devcmd define to get rq cq descriptor size/s supported by hw. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-3-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: enic rq code reorgSatish Kharat
Separates enic rx path from generic vnic api. Removes some complexity of doign enic callbacks through vnic api in rx. This is in preparation for enabling enic extended cq which applies only to enic rx path. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-2-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-11enic: Move function from header file to c fileSatish Kharat
Moves cq_enet_rq_desc_dec from cq_enet_desc.h to enic_rq.c. This is in preparation for enic extended completion queue enabling. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Link: https://patch.msgid.link/20250304-enic_cleanup_and_ext_cq-v2-1-85804263dad8@cisco.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-10Merge branch 'mptcp-pm-code-reorganisation'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: pm: code reorganisation Before this series, the PM code was dispersed in different places: - pm.c had common code for all PMs. - pm_netlink.c was initially only about the in-kernel PM, but ended up also getting exported common helpers, callbacks used by the different PMs, NL events for PM userspace daemon, etc. quite confusing. - pm_userspace.c had userspace PM only code, but it was using "specific" in-kernel PM helpers according to their names. To clarify the code, a reorganisation is suggested here, only by moving code around, and small helper renaming to avoid confusions: - pm_netlink.c now only contains common PM generic Netlink code: - PM events: this code was already there - shared helpers around Netlink code that were already there as well - shared Netlink commands code from pm.c - pm_kernel.c now contains only code that is specific to the in-kernel PM. Now all functions are either called from: - pm.c: events coming from the core, when this PM is being used - pm_netlink.c: for shared Netlink commands - mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM - sockopt.c: for the exported counters per netns - pm.c got many code from pm_netlink.c: - helpers used from both PMs and not linked to Netlink - callbacks used by different PMs, e.g. ADD_ADDR management - some helpers have been renamed to remove the '_nl' prefix, and some have been marked as 'static'. - protocol.h has been updated accordingly: - some helpers no longer need to be exported - new ones needed to be exported: they have been prefixed if needed. The code around the PM is now less confusing, which should help for the maintenance in the long term, and the introduction of a PM Ops. This will certainly impact future backports, but because other cleanups have already done recently, and more are coming to ease the addition of a new path-manager controlled with BPF (struct_ops), doing that now seems to be a good time. Also, many issues around the PM have been fixed a few months ago while increasing the code coverage in the selftests, so such big reorganisation can be done with more confidence now. Note that checkpatch, when used with --max-line-length=80, will complain about lines being over the 80 limits, but these warnings were already there before moving the code around. Also, patch 1 is not directly related to the code reorganisation, but it was a remaining cleanup that we didn't upstream before, because it was conflicting with another patch that has been sent for inclusion to the net tree. ==================== Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-0-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: move Netlink PM helpers to pm_netlink.cMatthieu Baerts (NGI0)
Before this patch, the PM code was dispersed in different places: - pm.c had common code for all PMs, but also Netlink specific code that will not be needed with the future BPF path-managers. - pm_netlink.c had common Netlink code. To clarify the code, a reorganisation is suggested here, only by moving code around, and small helper renaming to avoid confusions: - pm_netlink.c now only contains common PM Netlink code: - PM events: this code was already there - shared helpers around Netlink code that were already there as well - shared Netlink commands code from pm.c - pm.c now no longer contain Netlink specific code. - protocol.h has been updated accordingly: - mptcp_nl_fill_addr() no longer need to be exported. The code around the PM is now less confusing, which should help for the maintenance in the long term. This will certainly impact future backports, but because other cleanups have already done recently, and more are coming to ease the addition of a new path-manager controlled with BPF (struct_ops), doing that now seems to be a good time. Also, many issues around the PM have been fixed a few months ago while increasing the code coverage in the selftests, so such big reorganisation can be done with more confidence now. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-15-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: split in-kernel PM specific codeMatthieu Baerts (NGI0)
Before this patch, the PM code was dispersed in different places: - pm.c had common code for all PMs - pm_netlink.c was supposed to be about the in-kernel PM, but also had exported common Netlink helpers, NL events for PM userspace daemons, etc. quite confusing. To clarify the code, a reorganisation is suggested here, only by moving code around to avoid confusions: - pm_netlink.c now only contains common PM Netlink code: - PM events: this code was already there - shared helpers around Netlink code that were already there as well - more shared Netlink commands code from pm.c will come after - pm_kernel.c now contains only code that is specific to the in-kernel PM. Now all functions are either called from: - pm.c: events coming from the core, when this PM is being used - pm_netlink.c: for shared Netlink commands - mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM - sockopt.c: for the exported counters per netns - (while at it, a useless 'return;' spot by checkpatch at the end of mptcp_pm_nl_set_flags_all, has been removed) The code around the PM is now less confusing, which should help for the maintenance in the long term. This will certainly impact future backports, but because other cleanups have already done recently, and more are coming to ease the addition of a new path-manager controlled with BPF (struct_ops), doing that now seems to be a good time. Also, many issues around the PM have been fixed a few months ago while increasing the code coverage in the selftests, so such big reorganisation can be done with more confidence now. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-14-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: move generic PM helpers to pm.cMatthieu Baerts (NGI0)
Before this patch, the PM code was dispersed in different places: - pm.c had common code for all PMs - pm_netlink.c was supposed to be about the in-kernel PM, but also had exported common helpers, callbacks used by the different PMs, NL events for PM userspace daemon, etc. quite confusing. - pm_userspace.c had userspace PM only code, but using specific in-kernel PM helpers To clarify the code, a reorganisation is suggested here, only by moving code around, and (un)exporting functions: - helpers used from both PMs and not linked to Netlink - callbacks used by different PMs, e.g. ADD_ADDR management - some helpers have been marked as 'static' - protocol.h has been updated accordingly - (while at it, a needless if before a kfree(), spot by checkpatch in mptcp_remove_anno_list_by_saddr(), has been removed) The code around the PM is now less confusing, which should help for the maintenance in the long term. This will certainly impact future backports, but because other cleanups have already done recently, and more are coming to ease the addition of a new path-manager controlled with BPF (struct_ops), doing that now seems to be a good time. Also, many issues around the PM have been fixed a few months ago while increasing the code coverage in the selftests, so such big reorganisation can be done with more confidence now. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-13-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: move generic helper at the topMatthieu Baerts (NGI0)
In prevision to another change importing all generic PM helpers from pm_netlink.c to there. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-12-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: export mptcp_remote_addressMatthieu Baerts (NGI0)
In a following commit, the 'remote_address' helper will need to be used from different files. It is then exported, and prefixed with 'mptcp_', similar to 'mptcp_local_address'. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-11-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: worker: split in-kernel and common tasksMatthieu Baerts (NGI0)
To make it clear what actions are in-kernel PM specific and which ones are not and done for all PMs, e.g. sending ADD_ADDR and close associated subflows when a RM_ADDR is received. The behavioural is changed a bit: MPTCP_PM_ADD_ADDR_RECEIVED is now treated after MPTCP_PM_ADD_ADDR_SEND_ACK and MPTCP_PM_RM_ADDR_RECEIVED, but that should not change anything in practice. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-10-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: avoid calling PM specific code from coreMatthieu Baerts (NGI0)
When destroying an MPTCP socket, some userspace PM specific code was called from mptcp_destroy_common() in protocol.c. That feels wrong, and it is the only case. Instead, the core now calls mptcp_pm_destroy() from pm.c which is now in charge of cleaning the announced addresses list, and ask the different PMs to do extra cleaning if needed, e.g. the userspace PM, if used, will clean the local addresses list. While at it, the userspace PM specific helper has been prefixed with 'mptcp_userspace_pm_' like the other ones. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-9-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: kernel: add '_pm' to mptcp_nl_set_flagsMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. Here, '_pm' was missing from 'mptcp_nl_set_flags'. Add '_pm' to be similar to others, and add '_all' to avoid confusions witih the global 'mptcp_pm_nl_set_flags'. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-8-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_is_init_remote_addrMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_is_init_remote_addr' is not specific to this PM: it is called from pm.c for both the in-kernel and userspace PMs. To avoid confusions, the '_nl' bit has been removed from the name. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-7-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_subflow_chk_stale()Matthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_subflow_chk_stale' is not specific to this PM: it is called from pm.c for both the in-kernel and userspace PMs. To avoid confusions, the '_nl' bit has been removed from the name. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-6-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_receivedMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_rm_addr_received' is not specific to this PM: it is called from the PM worker, and used by both the in-kernel and userspace PMs. The helper has been renamed to 'mptcp_pm_rm_addr_recv' instead of '_received' to avoid confusions with the one from pm.c. mptcp_pm_nl_rm_addr_or_subflow', and 'mptcp_pm_nl_rm_subflow_received' have been updated too for the same reason. To avoid confusions, the '_nl' bit has been removed from the name. While at it, the in-kernel PM specific code has been move from mptcp_pm_rm_addr_or_subflow to a new dedicated helper, clearer. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-5-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_workMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_work' is not specific to this PM: it is called from the core to call helpers, some of them needed by both the in-kernel and userspace PMs. To avoid confusions, the '_nl' bit has been removed from the name. Also used 'worker' instead of 'work', similar to protocol.c's worker. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-4-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_mp_prio_send_ackMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_mp_prio_send_ack()' is not specific to this PM: it is used by both the in-kernel and userspace PMs. To avoid confusions, the '_nl' bit has been removed from the name. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-3-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: remove '_nl' from mptcp_pm_nl_addr_send_ackMatthieu Baerts (NGI0)
Currently, in-kernel PM specific helpers are prefixed with 'mptcp_pm_nl_'. But here 'mptcp_pm_nl_addr_send_ack()' is not specific to this PM: it is used by both the in-kernel and userspace PMs. To avoid confusions, the '_nl' bit has been removed from the name. No behavioural changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-2-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10mptcp: pm: use addr entry for get_local_idGeliang Tang
The following code in mptcp_userspace_pm_get_local_id() that assigns "skc" to "new_entry" is not allowed in BPF if we use the same code to implement the get_local_id() interface of a BFP path manager: memset(&new_entry, 0, sizeof(struct mptcp_pm_addr_entry)); new_entry.addr = *skc; new_entry.addr.id = 0; new_entry.flags = MPTCP_PM_ADDR_FLAG_IMPLICIT; To solve the issue, this patch moves this assignment to "new_entry" forward to mptcp_pm_get_local_id(), and then passing "new_entry" as a parameter to both mptcp_pm_nl_get_local_id() and mptcp_userspace_pm_get_local_id(). No behavioural changes intended. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-1-abef20ada03b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10eth: fbnic: fix memory corruption in fbnic_tlv_attr_get_string()Dan Carpenter
This code is trying to ensure that the last byte of the buffer is a NUL terminator. However, the problem is that attr->value[] is an array of __le32, not char, so it zeroes out 4 bytes way beyond the end of the buffer. Cast the buffer to char to address this. Fixes: e5cf5107c9e4 ("eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Lee Trager <lee@trager.us> Link: https://patch.msgid.link/2791d4be-ade4-4e50-9b12-33307d8410f6@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10r8169: increase max jumbo packet size on RTL8125/RTL8126Heiner Kallweit
Realtek confirmed that all RTL8125/RTL8126 chip versions support up to 16K jumbo packets. Reflect this in the driver. Tested by Rui on RTL8125B with 12K jumbo packets. Suggested-by: Rui Salvaterra <rsalvaterra@gmail.com> Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/396762ad-cc65-4e60-b01e-8847db89e98b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10Merge branch 'follow-up-on-deduplicate-cookie-logic'Jakub Kicinski
Willem de Bruijn says: ==================== follow-up on deduplicate cookie logic 1/3: I came across a leftover from cookie deduplication, due to UDP having two code paths: lockless fast path and locked cork path. 3/3: Even though the leftover was in the fast path, this prompted me to complete coverage to the cork path. 2/3: That uncovered a subtle API inconsistency in how dontfrag is configured. It should not be possible to switch DF mid datagram. ==================== Link: https://patch.msgid.link/20250307033620.411611-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10selftests/net: expand cmsg_ip with MSG_MOREWillem de Bruijn
UDP send with MSG_MORE takes a slightly different path than the lockless fast path. For completeness, add coverage to this case too. Pass MSG_MORE on the initial sendmsg, then follow up with a zero byte write to unplug the cork. Unrelated: also add two missing endlines in usage(). Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250307033620.411611-4-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10ipv6: save dontfrag in corkWillem de Bruijn
When spanning datagram construction over multiple send calls using MSG_MORE, per datagram settings are configured on the first send. That is when ip(6)_setup_cork stores these settings for subsequent use in __ip(6)_append_data and others. The only flag that escaped this was dontfrag. As a result, a datagram could be constructed with df=0 on the first sendmsg, but df=1 on a next. Which is what cmsg_ip.sh does in an upcoming MSG_MORE test in the "diff" scenario. Changing datagram conditions in the middle of constructing an skb makes this already complex code path even more convoluted. It is here unintentional. Bring this flag in line with expected sockopt/cmsg behavior. And stop passing ipc6 to __ip6_append_data, to avoid such issues in the future. This is already the case for __ip_append_data. inet6_cork had a 6 byte hole, so the 1B flag has no impact. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250307033620.411611-3-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10ipv6: remove leftover ip6 cookie initializerWillem de Bruijn
As of the blamed commit ipc6.dontfrag is always initialized at the start of udpv6_sendmsg, by ipcm6_init_sk, to either 0 or 1. Later checks against -1 are no longer needed and the branches are now dead code. The blamed commit had removed those branches. But I had overlooked this one case. UDP has both a lockless fast path and a slower path for corked requests. This branch remained in the fast path. Fixes: 096208592b09 ("ipv6: replace ipcm6_init calls with ipcm6_init_sk") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250307033620.411611-2-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10Merge branch 'virtio-net-link-queues-to-napis'Jakub Kicinski
Joe Damato says: ==================== virtio-net: Link queues to NAPIs Jakub recently commented [1] that I should not hold this series on virtio-net linking queues to NAPIs behind other important work that is on-going and suggested I re-spin, so here we are :) As per the discussion on the v3 [2], now both RX and TX NAPIs use the API to link queues to NAPIs. Since TX-only NAPIs don't have a NAPI ID, commit 6597e8d35851 ("netdev-genl: Elide napi_id when not present") now correctly elides the TX-only NAPIs (instead of printing zero) when the queues and NAPIs are linked. As per the discussion on the v4 [3], patch 3 has been refactored to hold RTNL only in the specific locations which need it as Jason requested. As per the discussion on the v5 [4], patch 3 now leaves refill_work as-is and does not use the API to unlink and relink queues and NAPIs. A comment has been left as suggested by Jakub [5] for future work. See the commit message of patch 3 for an example of how to get the NAPI to queue mapping information. See the commit message of patch 4 for an example of how NAPI IDs are persistent despite queue count changes. [1]: https://lore.kernel.org/20250221142650.3c74dcac@kernel.org [2]: https://lore.kernel.org/20250127142400.24eca319@kernel.org [3]: https://lore.kernel.org/CACGkMEv=ejJnOWDnAu7eULLvrqXjkMkTL4cbi-uCTUhCpKN_GA@mail.gmail.com [4]: https://lore.kernel.org/Z8X15hxz8t-vXpPU@LQ3V64L9R2 [5]: https://lore.kernel.org/20250303160355.5f8d82d8@kernel.org v5: https://lore.kernel.org/20250227185017.206785-1-jdamato@fastly.com v4: https://lore.kernel.org/20250225020455.212895-1-jdamato@fastly.com rfcv3: https://lore.kernel.org/20250121191047.269844-1-jdamato@fastly.com v2: https://lore.kernel.org/20250116055302.14308-1-jdamato@fastly.com v1: https://lore.kernel.org/20250110202605.429475-1-jdamato@fastly.com ==================== Link: https://patch.msgid.link/20250307011215.266806-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>