summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-12xfrm: state: make xfrm_state_lookup_byaddr locklessFlorian Westphal
This appears to be an oversight back when the state lookup was converted to RCU, I see no reason why we need to hold the state lock here. __xfrm_state_lookup_byaddr already uses xfrm_state_hold_rcu helper to obtain a reference, so just replace the state lock with rcu. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-25Merge branch 'Support-PMTU-in-tunnel-mode-for-packet-offload'Steffen Klassert
Leon Romanovsky says: ==================== This series refactors the xdo_dev_offload_ok() to be global place for drivers to check if their offload can perform encryption for xmit packets. Such common place gives us an option to check MTU and PMTU at one place. ==================== Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21xfrm: check for PMTU in tunnel mode for packet offloadLeon Romanovsky
In tunnel mode, for the packet offload, there were no PMTU signaling to the upper level about need to fragment the packet. As a solution, call to already existing xfrm[4|6]_tunnel_check_size() to perform that. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21xfrm: provide common xdo_dev_offload_ok callback implementationLeon Romanovsky
Almost all drivers except bond and nsim had same check if device can perform XFRM offload on that specific packet. The check was that packet doesn't have IPv4 options and IPv6 extensions. In NIC drivers, the IPv4 HELEN comparison was slightly different, but the intent was to check for the same conditions. So let's chose more strict variant as a common base. Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21xfrm: rely on XFRM offloadLeon Romanovsky
After change of initialization of x->type_offload pointer to be valid only for offloaded SAs. There is no need to rely on both x->type_offload and x->xso.type to determine if SA is offloaded or not. Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21xfrm: simplify SA initialization routineLeon Romanovsky
SA replay mode is initialized differently for user-space and kernel-space users, but the call to xfrm_init_replay() existed in common path with boolean protection. That caused to situation where we have two different function orders. So let's rewrite the SA initialization flow to have same order for both in-kernel and user-space callers. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21xfrm: delay initialization of offload path till its actually requestedLeon Romanovsky
XFRM offload path is probed even if offload isn't needed at all. Let's make sure that x->type_offload pointer stays NULL for such path to reduce ambiguity. Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-12xfrm: prevent high SEQ input in non-ESN modeLeon Romanovsky
In non-ESN mode, the SEQ numbers are limited to 32 bits and seq_hi/oseq_hi are not used. So make sure that user gets proper error message, in case such assignment occurred. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-11Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-02-10 (ice, igc, e1000e) For ice: Karol, Jake, and Michal add PTP support for E830 devices. Karol refactors and cleans up PTP code. Jake allows for a common cross-timestamp implementation to be shared for all devices and Michal adds E830 support. Mateusz cleans up initial Flow Director rule creation to loop rather than duplicate repeated similar calls. For igc: Siang adjust calls to remove need for close and open calls on loading XDP program. For e1000e: Gerhard Engleder batches register writes for writing multicast table on real-time kernels. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000e: Fix real-time violations on link up igc: Avoid unnecessary link down event in XDP_SETUP_PROG process ice: refactor ice_fdir_create_dflt_rules() function ice: Implement PTP support for E830 devices ice: Refactor ice_ptp_init_tx_* ice: Add unified ice_capture_crosststamp ice: Process TSYN IRQ in a separate function ice: Use FIELD_PREP for timestamp values ice: Remove unnecessary ice_is_e8xx() functions ice: Don't check device type when checking GNSS presence ==================== Link: https://patch.msgid.link/20250210192352.3799673-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11Merge branch 'sfc-support-devlink-flash'Jakub Kicinski
Edward Cree says: ==================== sfc: support devlink flash Allow upgrading device firmware on Solarflare NICs through standard tools. ==================== Link: https://patch.msgid.link/cover.1739186252.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11sfc: document devlink flash supportEdward Cree
Update the information in sfc's devlink documentation including support for firmware update with devlink flash. Also update the help text for CONFIG_SFC_MTD, as it is no longer strictly required for firmware updates. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/3476b0ef04a0944f03e0b771ec8ed1a9c70db4dc.1739186253.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11sfc: deploy devlink flash images to NIC over MCDIEdward Cree
Use MC_CMD_NVRAM_* wrappers to write the firmware to the partition identified from the image header. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/a746335876b621b3e54cf4e49948148e349a1745.1739186253.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11sfc: extend NVRAM MCDI handlersEdward Cree
Support variable write-alignment, and background updates. The latter allows other MCDI to continue while the device is processing an MC_CMD_NVRAM_UPDATE_FINISH, since this can take a long time owing to e.g. cryptographic signature verification. Expose these handlers in mcdi.h, and build them even when CONFIG_SFC_MTD=n, so they can be used for devlink flash in a subsequent patch. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/de3d9e14fee69e15d95b46258401a93b75659f78.1739186253.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11sfc: parse headers of devlink flash imagesEdward Cree
This parsing is necessary to obtain the metadata which will be used in a subsequent patch to write the image to the device. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/65318300f3f1b1462925f917f7c0d0ac833955ae.1739186253.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11Merge branch 'use-hwmon_channel_info-macro-to-simplify-code'Jakub Kicinski
Huisong Li says: ==================== Use HWMON_CHANNEL_INFO macro to simplify code The HWMON_CHANNEL_INFO macro is provided by hwmon.h and used widely by many other drivers. This series use HWMON_CHANNEL_INFO macro to simplify code in net subsystem. Note: These patches do not depend on each other. Put them togeter just for belonging to the same subsystem. ==================== Link: https://patch.msgid.link/20250210054710.12855-1-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: aquantia: Use HWMON_CHANNEL_INFO macro to simplify codeHuisong Li
Use HWMON_CHANNEL_INFO macro to simplify code. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250210054710.12855-6-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: marvell10g: Use HWMON_CHANNEL_INFO macro to simplify codeHuisong Li
Use HWMON_CHANNEL_INFO macro to simplify code. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250210054710.12855-5-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: marvell: Use HWMON_CHANNEL_INFO macro to simplify codeHuisong Li
Use HWMON_CHANNEL_INFO macro to simplify code. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250210054710.12855-4-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: nfp: Use HWMON_CHANNEL_INFO macro to simplify codeHuisong Li
Use HWMON_CHANNEL_INFO macro to simplify code. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250210054710.12855-3-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: aquantia: Use HWMON_CHANNEL_INFO macro to simplify codeHuisong Li
Use HWMON_CHANNEL_INFO macro to simplify code. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250210054710.12855-2-lihuisong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: wwan: t7xx: don't include '<linux/pm_wakeup.h>' directlyWolfram Sang
The header clearly states that it does not want to be included directly, only via '<linux/(platform_)?device.h>'. Which is already present, so delete the superfluous include. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Link: https://patch.msgid.link/20250210113710.52071-2-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: broadcom: don't include '<linux/pm_wakeup.h>' directlyWolfram Sang
The header clearly states that it does not want to be included directly, only via '<linux/(platform_)?device.h>'. Replace the include accordingly. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250210113658.52019-2-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: freescale: ucc_geth: remove unused PHY_INIT_TIMEOUT and PHY_CHANGE_TIMEHeiner Kallweit
Both definitions are unused. Last users have been removed with: 1577ecef7666 ("netdev: Merge UCC and gianfar MDIO bus drivers") 728de4c927a3 ("ucc_geth: migrate ucc_geth to phylib") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Link: https://patch.msgid.link/62e9429b-57e0-42ec-96a5-6a89553f441d@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: remove unused PHY_INIT_TIMEOUT and PHY_FORCE_TIMEOUTHeiner Kallweit
Both definitions are unused. Last users have been removed with: f3ba9d490d6e ("net: s6gmac: remove driver") 2bd229df5e2e ("net: phy: remove state PHY_FORCING") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/f8e7b8ed-a665-41ad-b0ce-cbfdb65262ef@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11hamradio: baycom: replace strcpy() with strscpy()Ethan Carter Edwards
The strcpy() function has been deprecated and replaced with strscpy(). There is an effort to make this change treewide: https://github.com/KSPP/linux/issues/88. Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/3qo3fbrak7undfgocsi2s74v4uyjbylpdqhie4dohfoh4welfn@joq7up65ug6v Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11blackhole_dev: convert self-test to KUnitTamir Duberstein
Convert this very simple smoke test to a KUnit test. Add a missing `htons` call that was spotted[0] by kernel test robot <lkp@intel.com> after initial conversion to KUnit. Link: https://lore.kernel.org/oe-kbuild-all/202502090223.qCYMBjWT-lkp@intel.com/ [0] Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://patch.msgid.link/20250208-blackholedev-kunit-convert-v2-1-182db9bd56ec@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11Merge branch 'net-phy-rename-eee_broken_mode'Jakub Kicinski
Heiner Kallweit says: ==================== net: phy: rename eee_broken_mode eee_broken_mode is used also if an EEE mode isn't actually broken but e.g. not supported by MAC. So rename it. This is split out from a bigger series that needs more rework. ==================== Link: https://patch.msgid.link/d7924d4e-49b0-4182-831f-73c558d4425e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: rename phy_set_eee_broken to phy_disable_eee_modeHeiner Kallweit
Consider that an EEE mode may not be broken but simply not supported by the MAC, and rename function phy_set_eee_broken(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/30deb630-3f6b-4ffb-a1e6-a9736021f43a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: rename eee_broken_modes to eee_disabled_modesHeiner Kallweit
This bitmap is used also if the MAC doesn't support an EEE mode. So the mode isn't necessarily broken in the PHY. Therefore rename the bitmap. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/6cd11422-dd67-4c87-a642-308de694af92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11Merge branch 'tcp-allow-to-reduce-max-rto'Paolo Abeni
Eric Dumazet says: ==================== tcp: allow to reduce max RTO This is a followup of a discussion started 6 months ago by Jason Xing. Some applications want to lower the time between each retransmit attempts. TCP_KEEPINTVL and TCP_KEEPCNT socket options don't work around the issue. This series adds: - a new TCP level socket option (TCP_RTO_MAX_MS) - a new sysctl (/proc/sys/net/ipv4/tcp_rto_max_ms) Admins and/or applications can now change the max rto value at their own risk. Link: https://lore.kernel.org/netdev/20240715033118.32322-1-kerneljasonxing@gmail.com/T/ ==================== Link: https://patch.msgid.link/20250207152830.2527578-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: add tcp_rto_max_ms sysctlEric Dumazet
Previous patch added a TCP_RTO_MAX_MS socket option to tune a TCP socket max RTO value. Many setups prefer to change a per netns sysctl. This patch adds /proc/sys/net/ipv4/tcp_rto_max_ms Its initial value is 120000 (120 seconds). Keep in mind that a decrease of tcp_rto_max_ms means shorter overall timeouts, unless tcp_retries2 sysctl is increased. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: add the ability to control max RTOEric Dumazet
Currently, TCP stack uses a constant (120 seconds) to limit the RTO value exponential growth. Some applications want to set a lower value. Add TCP_RTO_MAX_MS socket option to set a value (in ms) between 1 and 120 seconds. It is discouraged to change the socket rto max on a live socket, as it might lead to unexpected disconnects. Following patch is adding a netns sysctl to control the default value at socket creation time. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: use tcp_reset_xmit_timer()Eric Dumazet
In order to reduce TCP_RTO_MAX occurrences, replace: inet_csk_reset_xmit_timer(sk, what, when, TCP_RTO_MAX) With: tcp_reset_xmit_timer(sk, what, when, false); Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: add a @pace_delay parameter to tcp_reset_xmit_timer()Eric Dumazet
We want to factorize calls to inet_csk_reset_xmit_timer(), to ease TCP_RTO_MAX change. Current users want to add tcp_pacing_delay(sk) to the timeout. Remaining calls to inet_csk_reset_xmit_timer() do not add the pacing delay. Following patch will convert them, passing false for @pace_delay. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: remove tcp_reset_xmit_timer() @max_when argumentEric Dumazet
All callers use TCP_RTO_MAX, we can factorize this constant, becoming a variable soon. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11Merge branch 'mptcp-pm-misc-cleanups-part-2'Paolo Abeni
Matthieu Baerts says: ==================== mptcp: pm: misc cleanups, part 2 These cleanups lead the way to the unification of the path-manager interfaces, and allow future extensions. The following patches are not all linked to each others, but are all related to the path-managers. - Patch 1: drop unneeded parameter in a function helper. - Patch 2: clearer NL error message when an NL attribute is missing. - Patch 3: more precise NL messages by avoiding 'this or that is NOK'. - Patch 4: improve too vague or missing NL err messages. - Patch 5: use GENL_REQ_ATTR_CHECK to look for mandatory NL attributes. - Patch 6: avoid overriding the error message. - Patch 7: check all mandatory NL attributes with GENL_REQ_ATTR_CHECK. - Patch 8: use NL_SET_ERR_MSG_ATTR instead of GENL_SET_ERR_MSG - Patch 9: move doit callbacks used for both PM to pm.c. - Patch 10: drop another unneeded parameter in a function helper. - Patch 11: share the ID parsing code for the 'get_addr' callback. - Patch 12: share sending NL code for the 'get_addr' callback. - Patch 13: drop yet another unneeded parameter in a function helper. - Patch 14: pick the usual structure type for the remote address. - Patch 15: share the local addr parsing code for the 'set_flags' cb. The behaviour when there are no errors should then not be modified. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> ==================== v2: https://lore.kernel.org/r/20250117-net-next-mptcp-pm-misc-cleanup-2-v2-0-61d4fe0586e8@kernel.org v1: https://lore.kernel.org/r/20250116-net-next-mptcp-pm-misc-cleanup-2-v1-0-c0b43f18fe06@kernel.org Link: https://patch.msgid.link/20250207-net-next-mptcp-pm-misc-cleanup-2-v3-0-71753ed957de@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: add local parameter for set_flagsGeliang Tang
This patch updates the interfaces set_flags to reduce repetitive code, adds a new parameter 'local' for them. The local address is parsed in public helper mptcp_pm_nl_set_flags_doit(), then pass it to mptcp_pm_nl_set_flags() and mptcp_userspace_pm_set_flags(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: change rem type of set_flagsGeliang Tang
Generally, in the path manager interfaces, the local address is defined as an mptcp_pm_addr_entry type address, while the remote address is defined as an mptcp_addr_info type one: (struct mptcp_pm_addr_entry *local, struct mptcp_addr_info *remote) But the set_flags() interface uses two mptcp_pm_addr_entry type parameters. This patch changes the second one to mptcp_addr_info type and use helper mptcp_pm_parse_addr() to parse it instead of using mptcp_pm_parse_entry(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: drop skb parameter of set_flagsGeliang Tang
The first parameter 'skb' in mptcp_pm_nl_set_flags() is only used to obtained the network namespace, which can also be obtained through the second parameters 'info' by using genl_info_net() helper. This patch drops these useless parameters 'skb' in all three set_flags() interfaces. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: reuse sending nlmsg code in get_addrGeliang Tang
The netlink messages are sent both in mptcp_pm_nl_get_addr() and mptcp_userspace_pm_get_addr(), this makes the code somewhat repetitive. This is because the netlink PM and userspace PM use different locks to protect the address entry that needs to be sent via the netlink message. The former uses rcu read lock, and the latter uses msk->pm.lock. The current get_addr() flow looks like this: lock(); entry = get_entry(); send_nlmsg(entry); unlock(); After holding the lock, get the entry from the list, send the entry, and finally release the lock. This patch changes the process by getting the entry while holding the lock, then making a copy of the entry so that the lock can be released. Finally, the copy of the entry is sent without locking: lock(); entry = get_entry(); *copy = *entry; unlock(); send_nlmsg(copy); This way we can reuse the send_nlmsg() code in get_addr() interfaces between the netlink PM and userspace PM. They only need to implement their own get_addr() interfaces to hold the different locks, get the entry from the different lists, then release the locks. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: add id parameter for get_addrGeliang Tang
The address id is parsed both in mptcp_pm_nl_get_addr() and mptcp_userspace_pm_get_addr(), this makes the code somewhat repetitive. So this patch adds a new parameter 'id' for all get_addr() interfaces. The address id is only parsed in mptcp_pm_nl_get_addr_doit(), then pass it to both mptcp_pm_nl_get_addr() and mptcp_userspace_pm_get_addr(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: drop skb parameter of get_addrGeliang Tang
The first parameters 'skb' of get_addr() interfaces are now useless since mptcp_userspace_pm_get_sock() helper is used. This patch drops these useless parameters of them. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: make three pm wrappers staticGeliang Tang
Three netlink functions: mptcp_pm_nl_get_addr_doit() mptcp_pm_nl_get_addr_dumpit() mptcp_pm_nl_set_flags_doit() are generic, implemented for each PM, in-kernel PM and userspace PM. It's clearer to move them from pm_netlink.c to pm.c. And the linked three path manager wrappers mptcp_pm_get_addr() mptcp_pm_dump_addr() mptcp_pm_set_flags() can be changed as static functions, no need to export them in protocol.h. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: use NL_SET_ERR_MSG_ATTR when possibleMatthieu Baerts (NGI0)
Instead of only returning a text message with GENL_SET_ERR_MSG(), NL_SET_ERR_MSG_ATTR() can help the userspace developers by also reporting which attribute is faulty. When the error is specific to an attribute, NL_SET_ERR_MSG_ATTR() is now used. The error messages have not been modified in this commit. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: mark missing address attributesMatthieu Baerts (NGI0)
mptcp_pm_parse_entry() will check if the given attribute is defined. If not, it will return a generic error: "missing address info". It might then not be clear for the userspace developer which attribute is missing, especially when the command takes multiple addresses. By using GENL_REQ_ATTR_CHECK(), the userspace will get a hint about which attribute is missing, making thing clearer. Note that this is what was already done for most of the other MPTCP NL commands, this patch simply adds the missing ones. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: remove duplicated error messagesMatthieu Baerts (NGI0)
mptcp_pm_parse_entry() and mptcp_pm_parse_addr() will already set a error message in case of parsing issue. Then, no need to override this error message with another less precise one: "error parsing address". Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: userspace: use GENL_REQ_ATTR_CHECKGeliang Tang
A more general way to check if MPTCP_PM_ATTR_* exists in 'info' is to use GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ATTR_*) instead of directly reading info->attrs[MPTCP_PM_ATTR_*] and then checking if it's NULL. So this patch uses GENL_REQ_ATTR_CHECK() for userspace PM in mptcp_pm_nl_announce_doit(), mptcp_pm_nl_remove_doit(), mptcp_pm_nl_subflow_create_doit(), mptcp_pm_nl_subflow_destroy_doit() and mptcp_userspace_pm_get_sock(). Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: improve error messagesMatthieu Baerts (NGI0)
Some error messages were: - too generic: "missing input", "invalid request" - not precise enough: "limit greater than maximum" but what's the max? - missing: subflow not found, or connect error. This can be easily improved by being more precise, or adding new error messages. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: more precise error messagesMatthieu Baerts (NGI0)
Some errors reported by the userspace PM were vague: "this or that is invalid". It is easier for the userspace to know which part is wrong, instead of having to guess that. While at it, in mptcp_userspace_pm_set_flags() move the parsing after the check linked to the local attribute. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11mptcp: pm: userspace: flags: clearer msg if no remote addrMatthieu Baerts (NGI0)
Since its introduction in commit 892f396c8e68 ("mptcp: netlink: issue MP_PRIO signals from userspace PMs"), it was mandatory to specify the remote address, because of the 'if (rem->addr.family == AF_UNSPEC)' check done later one. In theory, this attribute can be optional, but it sounds better to be precise to avoid sending the MP_PRIO on the wrong subflow, e.g. if there are multiple subflows attached to the same local ID. This can be relaxed later on if there is a need to act on multiple subflows with one command. For the moment, the check to see if attr_rem is NULL can be removed, because mptcp_pm_parse_entry() will do this check as well, no need to do that differently here. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>