summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-05ublk: set_params: properly check if parameters can be appliedUday Shankar
The parameters set by the set_params call are only applied to the block device in the start_dev call. So if a device has already been started, a subsequently issued set_params on that device will not have the desired effect, and should return an error. There is an existing check for this - set_params fails on devices in the LIVE state. But this check is not sufficient to cover the recovery case. In this case, the device will be in the QUIESCED or FAIL_IO states, so set_params will succeed. But this success is misleading, because the parameters will not be applied, since the device has already been started (by a previous ublk server). The bit UB_STATE_USED is set on completion of the start_dev; use it to detect and fail set_params commands which arrive too late to be applied (after start_dev). Signed-off-by: Uday Shankar <ushankar@purestorage.com> Fixes: 0aa73170eba5 ("ublk_drv: add SET_PARAMS/GET_PARAMS control command") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250304-set_params-v1-1-17b5e0887606@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-05net-timestamp: support TCP GSO case for a few missing flagsJason Xing
When I read through the TSO codes, I found out that we probably miss initializing the tx_flags of last seg when TSO is turned off, which means at the following points no more timestamp (for this last one) will be generated. There are three flags to be handled in this patch: 1. SKBTX_HW_TSTAMP 2. SKBTX_BPF 3. SKBTX_SCHED_TSTAMP Note that SKBTX_BPF[1] was added in 6.14.0-rc2 by commit 6b98ec7e882af ("bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback") and only belongs to net-next branch material for now. The common issue of the above three flags can be fixed by this single patch. This patch initializes the tx_flags to SKBTX_ANY_TSTAMP like what the UDP GSO does to make the newly segmented last skb inherit the tx_flags so that requested timestamp will be generated in each certain layer, or else that last one has zero value of tx_flags which leads to no timestamp at all. Fixes: 4ed2d765dfacc ("net-timestamp: TCP timestamping") Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05exfat: add a check for invalid data sizeYuezhang Mo
Add a check for invalid data size to avoid corrupted filesystem from being further corrupted. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-05exfat: short-circuit zero-byte writes in exfat_file_write_iterEric Sandeen
When generic_write_checks() returns zero, it means that iov_iter_count() is zero, and there is no work to do. Simply return success like all other filesystems do, rather than proceeding down the write path, which today yields an -EFAULT in generic_perform_write() via the (fault_in_iov_iter_readable(i, bytes) == bytes) check when bytes == 0. Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Reported-by: Noah <kernel-org-10@maxgrass.eu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-05exfat: fix soft lockup in exfat_clear_bitmapNamjae Jeon
bitmap clear loop will take long time in __exfat_free_cluster() if data size of file/dir enty is invalid. If cluster bit in bitmap is already clear, stop clearing bitmap go to out of loop. Fixes: 31023864e67a ("exfat: add fat entry operations") Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-05exfat: fix just enough dentries but allocate a new cluster to dirYuezhang Mo
This commit fixes the condition for allocating cluster to parent directory to avoid allocating new cluster to parent directory when there are just enough empty directory entries at the end of the parent directory. Fixes: af02c72d0b62 ("exfat: convert exfat_find_empty_entry() to use dentry cache") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-05Merge branch 'dynamic-possix-clocks-permission-checks'David S. Miller
Wojtek Wasko says: ==================== Permission checks for dynamic POSIX clocks Dynamic clocks - such as PTP clocks - extend beyond the standard POSIX clock API by using ioctl calls. While file permissions are enforced for standard POSIX operations, they are not implemented for ioctl calls, since the POSIX layer cannot differentiate between calls which modify the clock's state (like enabling PPS output generation) and those that don't (such as retrieving the clock's PPS capabilities). On the other hand, drivers implementing the dynamic clocks lack the necessary information context to enforce permission checks themselves. Additionally, POSIX clock layer requires the WRITE permission even for readonly adjtime() operations before invoking the callback. Add a struct file pointer to the POSIX clock context and use it to implement the appropriate permission checks on PTP chardevs. Permit readonly adjtime() for dynamic clocks. Add a readonly option to testptp. Changes in v4: - Allow readonly adjtime() for dynamic clocks, as suggested by Thomas Changes in v3: - Reword the log message for commit against posix-clock and fix documentation of struct posix_clock_context, as suggested by Thomas Changes in v2: - Store file pointer in POSIX clock context rather than fmode in the PTP clock's private data, as suggested by Richard. - Move testptp.c changes into separate patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05testptp: Add option to open PHC in readonly modeWojtek Wasko
PTP Hardware Clocks no longer require WRITE permission to perform readonly operations, such as listing device capabilities or listening to EXTTS events once they have been enabled by a process with WRITE permissions. Add '-r' option to testptp to open the PHC in readonly mode instead of the default read-write mode. Skip enabling EXTTS if readonly mode is requested. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05ptp: Add PHC file mode checks. Allow RO adjtime() without FMODE_WRITE.Wojtek Wasko
Many devices implement highly accurate clocks, which the kernel manages as PTP Hardware Clocks (PHCs). Userspace applications rely on these clocks to timestamp events, trace workload execution, correlate timescales across devices, and keep various clocks in sync. The kernel’s current implementation of PTP clocks does not enforce file permissions checks for most device operations except for POSIX clock operations, where file mode is verified in the POSIX layer before forwarding the call to the PTP subsystem. Consequently, it is common practice to not give unprivileged userspace applications any access to PTP clocks whatsoever by giving the PTP chardevs 600 permissions. An example of users running into this limitation is documented in [1]. Additionally, POSIX layer requires WRITE permission even for readonly adjtime() calls which are used in PTP layer to return current frequency offset applied to the PHC. Add permission checks for functions that modify the state of a PTP device. Continue enforcing permission checks for POSIX clock operations (settime, adjtime) in the POSIX layer. Only require WRITE access for dynamic clocks adjtime() if any flags are set in the modes field. [1] https://lists.nwtime.org/sympa/arc/linuxptp-users/2024-01/msg00036.html Changes in v4: - Require FMODE_WRITE in ajtime() only for calls modifying the clock in any way. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05posix-clock: Store file pointer in struct posix_clock_contextWojtek Wasko
File descriptor based pc_clock_*() operations of dynamic posix clocks have access to the file pointer and implement permission checks in the generic code before invoking the relevant dynamic clock callback. Character device operations (open, read, poll, ioctl) do not implement a generic permission control and the dynamic clock callbacks have no access to the file pointer to implement them. Extend struct posix_clock_context with a struct file pointer and initialize it in posix_clock_open(), so that all dynamic clock callbacks can access it. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05gpio: rcar: Use raw_spinlock to protect register accessNiklas Söderlund
Use raw_spinlock in order to fix spurious messages about invalid context when spinlock debugging is enabled. The lock is only used to serialize register access. [ 4.239592] ============================= [ 4.239595] [ BUG: Invalid wait context ] [ 4.239599] 6.13.0-rc7-arm64-renesas-05496-gd088502a519f #35 Not tainted [ 4.239603] ----------------------------- [ 4.239606] kworker/u8:5/76 is trying to lock: [ 4.239609] ffff0000091898a0 (&p->lock){....}-{3:3}, at: gpio_rcar_config_interrupt_input_mode+0x34/0x164 [ 4.239641] other info that might help us debug this: [ 4.239643] context-{5:5} [ 4.239646] 5 locks held by kworker/u8:5/76: [ 4.239651] #0: ffff0000080fb148 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x190/0x62c [ 4.250180] OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'frame-master' with a value. [ 4.254094] #1: ffff80008299bd80 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1b8/0x62c [ 4.254109] #2: ffff00000920c8f8 [ 4.258345] OF: /soc/sound@ec500000/ports/port@1/endpoint: Read of boolean property 'bitclock-master' with a value. [ 4.264803] (&dev->mutex){....}-{4:4}, at: __device_attach_async_helper+0x3c/0xdc [ 4.264820] #3: ffff00000a50ca40 (request_class#2){+.+.}-{4:4}, at: __setup_irq+0xa0/0x690 [ 4.264840] #4: [ 4.268872] OF: /soc/sound@ec500000/ports/port@1/endpoint: Read of boolean property 'frame-master' with a value. [ 4.273275] ffff00000a50c8c8 (lock_class){....}-{2:2}, at: __setup_irq+0xc4/0x690 [ 4.296130] renesas_sdhi_internal_dmac ee100000.mmc: mmc1 base at 0x00000000ee100000, max clock rate 200 MHz [ 4.304082] stack backtrace: [ 4.304086] CPU: 1 UID: 0 PID: 76 Comm: kworker/u8:5 Not tainted 6.13.0-rc7-arm64-renesas-05496-gd088502a519f #35 [ 4.304092] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT) [ 4.304097] Workqueue: async async_run_entry_fn [ 4.304106] Call trace: [ 4.304110] show_stack+0x14/0x20 (C) [ 4.304122] dump_stack_lvl+0x6c/0x90 [ 4.304131] dump_stack+0x14/0x1c [ 4.304138] __lock_acquire+0xdfc/0x1584 [ 4.426274] lock_acquire+0x1c4/0x33c [ 4.429942] _raw_spin_lock_irqsave+0x5c/0x80 [ 4.434307] gpio_rcar_config_interrupt_input_mode+0x34/0x164 [ 4.440061] gpio_rcar_irq_set_type+0xd4/0xd8 [ 4.444422] __irq_set_trigger+0x5c/0x178 [ 4.448435] __setup_irq+0x2e4/0x690 [ 4.452012] request_threaded_irq+0xc4/0x190 [ 4.456285] devm_request_threaded_irq+0x7c/0xf4 [ 4.459398] ata1: link resume succeeded after 1 retries [ 4.460902] mmc_gpiod_request_cd_irq+0x68/0xe0 [ 4.470660] mmc_start_host+0x50/0xac [ 4.474327] mmc_add_host+0x80/0xe4 [ 4.477817] tmio_mmc_host_probe+0x2b0/0x440 [ 4.482094] renesas_sdhi_probe+0x488/0x6f4 [ 4.486281] renesas_sdhi_internal_dmac_probe+0x60/0x78 [ 4.491509] platform_probe+0x64/0xd8 [ 4.495178] really_probe+0xb8/0x2a8 [ 4.498756] __driver_probe_device+0x74/0x118 [ 4.503116] driver_probe_device+0x3c/0x154 [ 4.507303] __device_attach_driver+0xd4/0x160 [ 4.511750] bus_for_each_drv+0x84/0xe0 [ 4.515588] __device_attach_async_helper+0xb0/0xdc [ 4.520470] async_run_entry_fn+0x30/0xd8 [ 4.524481] process_one_work+0x210/0x62c [ 4.528494] worker_thread+0x1ac/0x340 [ 4.532245] kthread+0x10c/0x110 [ 4.535476] ret_from_fork+0x10/0x20 Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250121135833.3769310-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-05gpio: aggregator: protect driver attr handlers against module unloadKoichiro Den
Both new_device_store and delete_device_store touch module global resources (e.g. gpio_aggregator_lock). To prevent race conditions with module unload, a reference needs to be held. Add try_module_get() in these handlers. For new_device_store, this eliminates what appears to be the most dangerous scenario: if an id is allocated from gpio_aggregator_idr but platform_device_register has not yet been called or completed, a concurrent module unload could fail to unregister/delete the device, leaving behind a dangling platform device/GPIO forwarder. This can result in various issues. The following simple reproducer demonstrates these problems: #!/bin/bash while :; do # note: whether 'gpiochip0 0' exists or not does not matter. echo 'gpiochip0 0' > /sys/bus/platform/drivers/gpio-aggregator/new_device done & while :; do modprobe gpio-aggregator modprobe -r gpio-aggregator done & wait Starting with the following warning, several kinds of warnings will appear and the system may become unstable: ------------[ cut here ]------------ list_del corruption, ffff888103e2e980->next is LIST_POISON1 (dead000000000100) WARNING: CPU: 1 PID: 1327 at lib/list_debug.c:56 __list_del_entry_valid_or_report+0xa3/0x120 [...] RIP: 0010:__list_del_entry_valid_or_report+0xa3/0x120 [...] Call Trace: <TASK> ? __list_del_entry_valid_or_report+0xa3/0x120 ? __warn.cold+0x93/0xf2 ? __list_del_entry_valid_or_report+0xa3/0x120 ? report_bug+0xe6/0x170 ? __irq_work_queue_local+0x39/0xe0 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? __list_del_entry_valid_or_report+0xa3/0x120 gpiod_remove_lookup_table+0x22/0x60 new_device_store+0x315/0x350 [gpio_aggregator] kernfs_fop_write_iter+0x137/0x1f0 vfs_write+0x262/0x430 ksys_write+0x60/0xd0 do_syscall_64+0x6c/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] </TASK> ---[ end trace 0000000000000000 ]--- Fixes: 828546e24280 ("gpio: Add GPIO Aggregator") Cc: stable@vger.kernel.org Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250224143134.3024598-2-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-05platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TAShyam Sundar S K
The PMF driver allocates a shared memory buffer using tee_shm_alloc_kernel_buf() for communication with the PMF-TA. The latest PMF-TA version introduces new structures with OEM debug information and additional policy input conditions for evaluating the policy binary. Consequently, the shared memory size must be increased to ensure compatibility between the PMF driver and the updated PMF-TA. To do so, introduce the new PMF-TA UUID and update the PMF shared memory configuration to ensure compatibility with the latest PMF-TA version. Additionally, export the TA UUID. These updates will result in modifications to the prototypes of amd_pmf_tee_init() and amd_pmf_ta_open_session(). Link: https://lore.kernel.org/all/55ac865f-b1c7-fa81-51c4-d211c7963e7e@linux.intel.com/ Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20250305045842.4117767-2-Shyam-sundar.S-k@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-05platform/x86/amd/pmf: Propagate PMF-TA return codesShyam Sundar S K
In the amd_pmf_invoke_cmd_init() function within the PMF driver ensure that the actual result from the PMF-TA is returned rather than a generic EIO. This change allows for proper handling of errors originating from the PMF-TA. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20250305045842.4117767-1-Shyam-sundar.S-k@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-05doc: correcting two prefix errors in idmappings.rstAiden Ma
Add the 'k' prefix to id 21000. And id `u1000` in the third idmapping should be mapped to `k31000`, not `u31000`. Signed-off-by: Aiden Ma <jiaheng.ma@foxmail.com> Link: https://lore.kernel.org/r/tencent_4E7B1F143E8051530C21FCADF4E014DCBB06@qq.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-05ALSA: hda: intel: Add Dell ALC3271 to power_save denylistHoku Ishibe
Dell XPS 13 7390 with the Realtek ALC3271 codec experiences persistent humming noise when the power_save mode is enabled. This issue occurs when the codec enters power saving mode, leading to unwanted noise from the speakers. This patch adds the affected model (PCI ID 0x1028:0x0962) to the power_save denylist to ensure power_save is disabled by default, preventing power-off related noise issues. Steps to Reproduce 1. Boot the system with `snd_hda_intel` loaded. 2. Verify that `power_save` mode is enabled: ```sh cat /sys/module/snd_hda_intel/parameters/power_save ```` output: 10 (default power save timeout) 3. Wait for the power save timeout 4. Observe a persistent humming noise from the speakers 5. Disable `power_save` manually: ```sh echo 0 | sudo tee /sys/module/snd_hda_intel/parameters/power_save ```` 6. Confirm that the noise disappears immediately. This issue has been observed on my system, and this patch successfully eliminates the unwanted noise. If other users experience similar issues, additional reports would be helpful. Signed-off-by: Hoku Ishibe <me@hokuishi.be> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250224020517.51035-1-me@hokuishi.be Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-05ALSA: hda/realtek: update ALC222 depop optimizeKailang Yang
Add ALC222 its own depop functions for alc_init and alc_shutup. [note: this fixes pop noise issues on the models with two headphone jacks -- tiwai ] Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-05x86/sgx: Fix size overflows in sgx_encl_create()Jarkko Sakkinen
The total size calculated for EPC can overflow u64 given the added up page for SECS. Further, the total size calculated for shmem can overflow even when the EPC size stays within limits of u64, given that it adds the extra space for 128 byte PCMD structures (one for each page). Address this by pre-evaluating the micro-architectural requirement of SGX: the address space size must be power of two. This is eventually checked up by ECREATE but the pre-check has the additional benefit of making sure that there is some space for additional data. Fixes: 888d24911787 ("x86/sgx: Add SGX_IOC_ENCLAVE_CREATE") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250305050006.43896-1-jarkko@kernel.org Closes: https://lore.kernel.org/linux-sgx/c87e01a0-e7dd-4749-a348-0980d3444f04@stanley.mountain/
2025-03-04Merge tag 'x86_microcode_for_v6.14_rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD microcode loading fixes from Borislav Petkov: - Load only sha256-signed microcode patch blobs - Other good cleanups * tag 'x86_microcode_for_v6.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Load only SHA256-checksummed patches x86/microcode/AMD: Add get_patch_level() x86/microcode/AMD: Get rid of the _load_microcode_amd() forward declaration x86/microcode/AMD: Merge early_apply_microcode() into its single callsite x86/microcode/AMD: Remove unused save_microcode_in_initrd_amd() declarations x86/microcode/AMD: Remove ugly linebreak in __verify_patch_section() signature
2025-03-04vlan: enforce underlying device typeOscar Maes
Currently, VLAN devices can be created on top of non-ethernet devices. Besides the fact that it doesn't make much sense, this also causes a bug which leaks the address of a kernel function to usermode. When creating a VLAN device, we initialize GARP (garp_init_applicant) and MRP (mrp_init_applicant) for the underlying device. As part of the initialization process, we add the multicast address of each applicant to the underlying device, by calling dev_mc_add. __dev_mc_add uses dev->addr_len to determine the length of the new multicast address. This causes an out-of-bounds read if dev->addr_len is greater than 6, since the multicast addresses provided by GARP and MRP are only 6 bytes long. This behaviour can be reproduced using the following commands: ip tunnel add gretest mode ip6gre local ::1 remote ::2 dev lo ip l set up dev gretest ip link add link gretest name vlantest type vlan id 100 Then, the following command will display the address of garp_pdu_rcv: ip maddr show | grep 01:80:c2:00:00:21 Fix the bug by enforcing the type of the underlying device during VLAN device initialization. Fixes: 22bedad3ce11 ("net: convert multicast list to list_head") Reported-by: syzbot+91161fe81857b396c8a0@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/000000000000ca9a81061a01ec20@google.com/ Signed-off-by: Oscar Maes <oscmaes92@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250303155619.8918-1-oscmaes92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: Prevent use after free in netif_napi_set_irq_locked()Dan Carpenter
The cpu_rmap_put() will call kfree() when the last reference is dropped so it could result in a use after free when we dereference the same pointer the next line. Move the cpu_rmap_put() after the dereference. Fixes: bd7c00605ee0 ("net: move aRFS rmap management and CPU affinity to core") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/5a9c53a4-5487-4b8c-9ffa-d8e5343aaaaf@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: cadence: macb: Synchronize standard statsSean Anderson
The new stats calculations add several additional calls to macb/gem_update_stats() and accesses to bp->hw_stats. These are protected by a spinlock since commit fa52f15c745c ("net: cadence: macb: Synchronize stats calculations"), which was applied in parallel. Add some locking now that the net has been merged into net-next. Fixes: f6af690a295a ("net: cadence: macb: Report standard stats") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250303231832.1648274-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04Merge branch 'tcp-scale-connect-under-pressure'Jakub Kicinski
Eric Dumazet says: ==================== tcp: scale connect() under pressure Adoption of bhash2 in linux-6.1 made some operations almost twice more expensive, because of additional locks. This series adds RCU in __inet_hash_connect() to help the case where many attempts need to be made before finding an available 4-tuple. This brings a ~200 % improvement in this experiment: Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server Before series: utime_start=0.288582 utime_end=1.548707 stime_start=20.637138 stime_end=2002.489845 num_transactions=484453 latency_min=0.156279245 latency_max=20.922042756 latency_mean=1.546521274 latency_stddev=3.936005194 num_samples=312537 throughput=47426.00 perf top on the client: 49.54% [kernel] [k] _raw_spin_lock 25.87% [kernel] [k] _raw_spin_lock_bh 5.97% [kernel] [k] queued_spin_lock_slowpath 5.67% [kernel] [k] __inet_hash_connect 3.53% [kernel] [k] __inet6_check_established 3.48% [kernel] [k] inet6_ehashfn 0.64% [kernel] [k] rcu_all_qs After this series: utime_start=0.271607 utime_end=3.847111 stime_start=18.407684 stime_end=1997.485557 num_transactions=1350742 latency_min=0.014131929 latency_max=17.895073144 latency_mean=0.505675853 # Nice reduction of latency metrics latency_stddev=2.125164772 num_samples=307884 throughput=139866.80 # 194 % increase perf top on client: 56.86% [kernel] [k] __inet6_check_established 17.96% [kernel] [k] __inet_hash_connect 13.88% [kernel] [k] inet6_ehashfn 2.52% [kernel] [k] rcu_all_qs 2.01% [kernel] [k] __cond_resched 0.41% [kernel] [k] _raw_spin_lock ==================== Link: https://patch.msgid.link/20250302124237.3913746-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04tcp: use RCU lookup in __inet_hash_connect()Eric Dumazet
When __inet_hash_connect() has to try many 4-tuples before finding an available one, we see a high spinlock cost from the many spin_lock_bh(&head->lock) performed in its loop. This patch adds an RCU lookup to avoid the spinlock cost. check_established() gets a new @rcu_lookup argument. First reason is to not make any changes while head->lock is not held. Second reason is to not make this RCU lookup a second time after the spinlock has been acquired. Tested: Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server Before series: utime_start=0.288582 utime_end=1.548707 stime_start=20.637138 stime_end=2002.489845 num_transactions=484453 latency_min=0.156279245 latency_max=20.922042756 latency_mean=1.546521274 latency_stddev=3.936005194 num_samples=312537 throughput=47426.00 perf top on the client: 49.54% [kernel] [k] _raw_spin_lock 25.87% [kernel] [k] _raw_spin_lock_bh 5.97% [kernel] [k] queued_spin_lock_slowpath 5.67% [kernel] [k] __inet_hash_connect 3.53% [kernel] [k] __inet6_check_established 3.48% [kernel] [k] inet6_ehashfn 0.64% [kernel] [k] rcu_all_qs After this series: utime_start=0.271607 utime_end=3.847111 stime_start=18.407684 stime_end=1997.485557 num_transactions=1350742 latency_min=0.014131929 latency_max=17.895073144 latency_mean=0.505675853 # Nice reduction of latency metrics latency_stddev=2.125164772 num_samples=307884 throughput=139866.80 # 190 % increase perf top on client: 56.86% [kernel] [k] __inet6_check_established 17.96% [kernel] [k] __inet_hash_connect 13.88% [kernel] [k] inet6_ehashfn 2.52% [kernel] [k] rcu_all_qs 2.01% [kernel] [k] __cond_resched 0.41% [kernel] [k] _raw_spin_lock Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Tested-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250302124237.3913746-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04tcp: add RCU management to inet_bind_bucketEric Dumazet
Add RCU protection to inet_bind_bucket structure. - Add rcu_head field to the structure definition. - Use kfree_rcu() at destroy time, and remove inet_bind_bucket_destroy() first argument. - Use hlist_del_rcu() and hlist_add_head_rcu() methods. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250302124237.3913746-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04tcp: optimize inet_use_bhash2_on_bind()Eric Dumazet
There is no reason to call ipv6_addr_type(). Instead, use highly optimized ipv6_addr_any() and ipv6_addr_v4mapped(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250302124237.3913746-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04tcp: use RCU in __inet{6}_check_established()Eric Dumazet
When __inet_hash_connect() has to try many 4-tuples before finding an available one, we see a high spinlock cost from __inet_check_established() and/or __inet6_check_established(). This patch adds an RCU lookup to avoid the spinlock acquisition when the 4-tuple is found in the hash table. Note that there are still spin_lock_bh() calls in __inet_hash_connect() to protect inet_bind_hashbucket, this will be fixed later in this series. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Tested-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250302124237.3913746-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04mptcp: fix 'scheduling while atomic' in mptcp_pm_nl_append_new_local_addrKrister Johansen
If multiple connection requests attempt to create an implicit mptcp endpoint in parallel, more than one caller may end up in mptcp_pm_nl_append_new_local_addr because none found the address in local_addr_list during their call to mptcp_pm_nl_get_local_id. In this case, the concurrent new_local_addr calls may delete the address entry created by the previous caller. These deletes use synchronize_rcu, but this is not permitted in some of the contexts where this function may be called. During packet recv, the caller may be in a rcu read critical section and have preemption disabled. An example stack: BUG: scheduling while atomic: swapper/2/0/0x00000302 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1)) dump_stack (lib/dump_stack.c:124) __schedule_bug (kernel/sched/core.c:5943) schedule_debug.constprop.0 (arch/x86/include/asm/preempt.h:33 kernel/sched/core.c:5970) __schedule (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 kernel/sched/features.h:29 kernel/sched/core.c:6621) schedule (arch/x86/include/asm/preempt.h:84 kernel/sched/core.c:6804 kernel/sched/core.c:6818) schedule_timeout (kernel/time/timer.c:2160) wait_for_completion (kernel/sched/completion.c:96 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:148) __wait_rcu_gp (include/linux/rcupdate.h:311 kernel/rcu/update.c:444) synchronize_rcu (kernel/rcu/tree.c:3609) mptcp_pm_nl_append_new_local_addr (net/mptcp/pm_netlink.c:966 net/mptcp/pm_netlink.c:1061) mptcp_pm_nl_get_local_id (net/mptcp/pm_netlink.c:1164) mptcp_pm_get_local_id (net/mptcp/pm.c:420) subflow_check_req (net/mptcp/subflow.c:98 net/mptcp/subflow.c:213) subflow_v4_route_req (net/mptcp/subflow.c:305) tcp_conn_request (net/ipv4/tcp_input.c:7216) subflow_v4_conn_request (net/mptcp/subflow.c:651) tcp_rcv_state_process (net/ipv4/tcp_input.c:6709) tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1934) tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2334) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) ip_local_deliver_finish (include/linux/rcupdate.h:813 net/ipv4/ip_input.c:234) ip_local_deliver (include/linux/netfilter.h:314 include/linux/netfilter.h:308 net/ipv4/ip_input.c:254) ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:580) ip_sublist_rcv (net/ipv4/ip_input.c:640) ip_list_rcv (net/ipv4/ip_input.c:675) __netif_receive_skb_list_core (net/core/dev.c:5583 net/core/dev.c:5631) netif_receive_skb_list_internal (net/core/dev.c:5685 net/core/dev.c:5774) napi_complete_done (include/linux/list.h:37 include/net/gro.h:449 include/net/gro.h:444 net/core/dev.c:6114) igb_poll (drivers/net/ethernet/intel/igb/igb_main.c:8244) igb __napi_poll (net/core/dev.c:6582) net_rx_action (net/core/dev.c:6653 net/core/dev.c:6787) handle_softirqs (kernel/softirq.c:553) __irq_exit_rcu (kernel/softirq.c:588 kernel/softirq.c:427 kernel/softirq.c:636) irq_exit_rcu (kernel/softirq.c:651) common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14)) </IRQ> This problem seems particularly prevalent if the user advertises an endpoint that has a different external vs internal address. In the case where the external address is advertised and multiple connections already exist, multiple subflow SYNs arrive in parallel which tends to trigger the race during creation of the first local_addr_list entries which have the internal address instead. Fix by skipping the replacement of an existing implicit local address if called via mptcp_pm_nl_get_local_id. Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250303-net-mptcp-fix-sched-while-atomic-v1-1-f6a216c5a74c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: ethernet: ti: cpsw_new: populate netdev of_nodeAlexander Sverdlin
So that of_find_net_device_by_node() can find CPSW ports and other DSA switches can be stacked downstream. Tested in conjunction with KSZ8873. Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://patch.msgid.link/20250303074703.1758297-1-alexander.sverdlin@siemens.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04tipc: Reduce scope for the variable “fdefq” in tipc_link_tnl_prepare()Markus Elfring
The address of a data structure member was determined before a corresponding null pointer check in the implementation of the function “tipc_link_tnl_prepare”. Thus avoid the risk for undefined behaviour by moving the definition for the local variable “fdefq” into an if branch at the end. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://patch.msgid.link/08fe8fc3-19c3-4324-8719-0ee74b0f32c9@web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04selftests: drv-net: use env.rpath in the HDS testJakub Kicinski
Commit 29b036be1b0b ("selftests: drv-net: test XDP, HDS auto and the ioctl path") added a new test case in the net tree, now that this code has made its way to net-next convert it to use the env.rpath() helper instead of manually computing the relative path. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250228212956.25399-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04dsa: mt7530: Utilize REGMAP_IRQ for interrupt handlingDaniel Golle
Replace the custom IRQ chip handler and mask/unmask functions with REGMAP_IRQ. This significantly simplifies the code and allows for the removal of almost all interrupt-related functions from mt7530.c. Tested on MT7988A built-in switch (MMIO) as well as MT7531AE IC (MDIO). Signed-off-by: Daniel Golle <daniel@makrotopia.org> Acked-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/221013c3530b61504599e285c341a993f6188f00.1740792674.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: ethtool: netlink: Allow NULL nlattrs when getting a phy_deviceMaxime Chevallier
ethnl_req_get_phydev() is used to lookup a phy_device, in the case an ethtool netlink command targets a specific phydev within a netdev's topology. It takes as a parameter a const struct nlattr *header that's used for error handling : if (!phydev) { NL_SET_ERR_MSG_ATTR(extack, header, "no phy matching phyindex"); return ERR_PTR(-ENODEV); } In the notify path after a ->set operation however, there's no request attributes available. The typical callsite for the above function looks like: phydev = ethnl_req_get_phydev(req_base, tb[ETHTOOL_A_XXX_HEADER], info->extack); So, when tb is NULL (such as in the ethnl notify path), we have a nice crash. It turns out that there's only the PLCA command that is in that case, as the other phydev-specific commands don't have a notification. This commit fixes the crash by passing the cmd index and the nlattr array separately, allowing NULL-checking it directly inside the helper. Fixes: c15e065b46dc ("net: ethtool: Allow passing a phy index for some commands") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reported-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20250301141114.97204-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04ppp: use IFF_NO_QUEUE in virtual interfacesQingfang Deng
For PPPoE, PPTP, and PPPoL2TP, the start_xmit() function directly forwards packets to the underlying network stack and never returns anything other than 1. So these interfaces do not require a qdisc, and the IFF_NO_QUEUE flag should be set. Introduces a direct_xmit flag in struct ppp_channel to indicate when IFF_NO_QUEUE should be applied. The flag is set in ppp_connect_channel() for relevant protocols. While at it, remove the usused latency member from struct ppp_channel. Signed-off-by: Qingfang Deng <dqfext@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250301135517.695809-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04Merge branch 'eth-fbnic-cleanup-macros-and-string-function'Jakub Kicinski
Lee Trager says: ==================== eth: fbnic: Cleanup macros and string function We have received some feedback that the macros we use for reading FW mailbox attributes are too large in scope and confusing to understanding. Additionally the string function did not provide errors allowing it to silently succeed. This patch set fixes theses issues. ==================== Link: https://patch.msgid.link/20250228191935.3953712-1-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04eth: fbnic: Replace firmware field macrosLee Trager
Replace the firmware field macros with new macros which follow typical kernel standards. No variables are required to be predefined for use and results are now returned. These macros are prefixed with fta or fbnic TLV attribute. Signed-off-by: Lee Trager <lee@trager.us> Link: https://patch.msgid.link/20250228191935.3953712-4-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()Lee Trager
Allow fbnic_tlv_attr_get_string() to return an error code. In the event the source mailbox attribute is missing return -EINVAL. Like nla_strscpy() return -E2BIG when the source string is larger than the destination string. In this case the amount of data copied is equal to dstsize. Signed-off-by: Lee Trager <lee@trager.us> Link: https://patch.msgid.link/20250228191935.3953712-3-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04eth: fbnic: Prepend TSENE FW fields with FBNIC_FWLee Trager
All other firmware fields are prepended with FBNIC_FW. Update TSENE fields to follow the same format. Signed-off-by: Lee Trager <lee@trager.us> Link: https://patch.msgid.link/20250228191935.3953712-2-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04Merge branch ↵Jakub Kicinski
'net-convert-gianfar-triple-speed-ethernet-controller-bindings-to-yaml' J. Neuschäfer says: ==================== net: Convert Gianfar (Triple Speed Ethernet Controller) bindings to YAML The aim of this series is to modernize the device tree bindings for the Freescale "Gianfar" ethernet controller (a.k.a. TSEC, Triple Speed Ethernet Controller) by converting them to YAML. v1: https://lore.kernel.org/20250220-gianfar-yaml-v1-0-0ba97fd1ef92@posteo.net ==================== Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-0-6beeefbd4818@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04dt-bindings: net: Convert fsl,gianfar to YAMLJ. Neuschäfer
Add a binding for the "Gianfar" ethernet controller, also known as TSEC/eTSEC. Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-3-6beeefbd4818@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04dt-bindings: net: fsl,gianfar-mdio: Update information about TBIJ. Neuschäfer
When this binding was originally written, all known TSEC Ethernet controllers had a Ten-Bit Interface (TBI). However, some datasheets such as for the MPC8315E suggest that this is not universally true: The eTSECs do not support TBI, GMII, and FIFO operating modes, so all references to these interfaces and features should be ignored for this device. Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-2-6beeefbd4818@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04dt-bindings: net: Convert fsl,gianfar-{mdio,tbi} to YAMLJ. Neuschäfer
Move the information related to the Freescale Gianfar (TSEC) MDIO bus and the Ten-Bit Interface (TBI) from fsl-tsec-phy.txt to a new binding file in YAML format, fsl,gianfar-mdio.yaml. Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250228-gianfar-yaml-v2-1-6beeefbd4818@posteo.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04Merge branch 'net-phy-nxp-c45-tja11xx-add-support-for-tja1121'Jakub Kicinski
Andrei Botila says: ==================== net: phy: nxp-c45-tja11xx: add support for TJA1121 This patch series adds .match_phy_device for the existing TJAs to differentiate between TJA1103/TJA1104 and TJA1120/TJA1121. TJA1103 and TJA1104 share the same PHY_ID but TJA1104 has MACsec capabilities while TJA1103 doesn't. Also add support for TJA1121 which is based on TJA1120 hardware with additional MACsec IP. ==================== Link: https://patch.msgid.link/20250228154320.2979000-1-andrei.botila@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: phy: nxp-c45-tja11xx: add support for TJA1121Andrei Botila
Add support for TJA1121 which is based on TJA1120 but with additional MACsec IP. Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250228154320.2979000-3-andrei.botila@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04net: phy: nxp-c45-tja11xx: add match_phy_device to TJA1103/TJA1104Andrei Botila
Add .match_phy_device for the existing TJAs to differentiate between TJA1103 and TJA1104. TJA1103 and TJA1104 share the same PHY_ID but TJA1104 has MACsec capabilities while TJA1103 doesn't. Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250228154320.2979000-2-andrei.botila@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04dpll: Add an assertion to check freq_supported_numJiasheng Jiang
Since the driver is broken in the case that src->freq_supported is not NULL but src->freq_supported_num is 0, add an assertion for it. Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20250228150210.34404-1-jiashengjiangcool@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04Merge branch 'mptcp-improve-code-coverage-and-small-optimisations'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: improve code coverage and small optimisations This small series have various unrelated patches: - Patch 1 and 2: improve code coverage by validating mptcp_diag_dump_one thanks to a new tool displaying MPTCP info for a specific token. - Patch 3: a fix for a commit which is only in net-next. - Patch 4: reduce parameters for one in-kernel PM helper. - Patch 5: exit early when processing an ADD_ADDR echo to avoid unneeded operations. ==================== Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-0-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04mptcp: pm: exit early with ADD_ADDR echo if possibleMatthieu Baerts (NGI0)
When the userspace PM is used, or when the in-kernel limits are reached, there will be no need to schedule the PM worker to signal new addresses. That corresponds to pm->work_pending set to 0. In this case, an early exit can be done in mptcp_pm_add_addr_echoed() not to hold the PM lock, and iterate over the announced addresses list, not to schedule the worker anyway in this case. This is similar to what is done when a connection or a subflow has been established. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-5-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04mptcp: pm: in-kernel: reduce parameters of set_flagsGeliang Tang
The number of parameters in mptcp_nl_set_flags() can be reduced. Only need to pass a "local" parameter to it instead of "local->addr" and "local->flags". Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-4-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04mptcp: pm: in-kernel: avoid access entry without lockGeliang Tang
In mptcp_pm_nl_set_flags(), "entry" is copied to "local" when pernet->lock is held to avoid direct access to entry without pernet->lock. Therefore, "local->flags" should be passed to mptcp_nl_set_flags instead of "entry->flags" when pernet->lock is not held, so as to avoid access to entry. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Fixes: 145dc6cc4abd ("mptcp: pm: change to fullmesh only for 'subflow'") Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-3-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>