linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-07-08	ipv6: mcast: Replace locking comments with lockdep annotations.	Kuniyuki Iwashima
	Commit 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data") added the same comments regarding locking to many functions. Let's replace the comments with lockdep annotation, which is more helpful. Note that we just remove the comment for mld_clear_zeros() and mld_send_cr(), where mc_dereference() is used in the entry of the function. While at it, a comment for __ipv6_sock_mc_join() is moved back to the correct place. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250702230210.3115355-3-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	ipv6: ndisc: Remove __in6_dev_get() in pndisc_{constructor,destructor}().	Kuniyuki Iwashima
	ipv6_dev_mc_{inc,dec}() has the same check. Let's remove __in6_dev_get() from pndisc_constructor() and pndisc_destructor(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250702230210.3115355-2-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge branch 'net-xsk-update-tx-queue-consumer'	Jakub Kicinski
	Jason Xing says: ==================== net: xsk: update tx queue consumer Patch 1 makes sure the consumer is updated at the end of generic xmit. Patch 2 adds corresponding test. ==================== Link: https://patch.msgid.link/20250703141712.33190-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	selftests/bpf: add a new test to check the consumer update case	Jason Xing
	The subtest sends 33 packets at one time on purpose to see if xsk exitting __xsk_generic_xmit() updates the global consumer of tx queue when reaching the max loop (max_tx_budget, 32 by default). The number 33 can avoid xskq_cons_peek_desc() updates the consumer when it's about to quit sending, to accurately check if the issue that the first patch resolves remains. The new case will not check this issue in zero copy mode. Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250703141712.33190-3-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: xsk: update tx queue consumer immediately after transmission	Jason Xing
	For afxdp, the return value of sendto() syscall doesn't reflect how many descs handled in the kernel. One of use cases is that when user-space application tries to know the number of transmitted skbs and then decides if it continues to send, say, is it stopped due to max tx budget? The following formular can be used after sending to learn how many skbs/descs the kernel takes care of: tx_queue.consumers_before - tx_queue.consumers_after Prior to the current patch, in non-zc mode, the consumer of tx queue is not immediately updated at the end of each sendto syscall when error occurs, which leads to the consumer value out-of-dated from the perspective of user space. So this patch requires store operation to pass the cached value to the shared value to handle the problem. More than those explicit errors appearing in the while() loop in __xsk_generic_xmit(), there are a few possible error cases that might be neglected in the following call trace: __xsk_generic_xmit() xskq_cons_peek_desc() xskq_cons_read_desc() xskq_cons_is_valid_desc() It will also cause the premature exit in the while() loop even if not all the descs are consumed. Based on the above analysis, using @sent_frame could cover all the possible cases where it might lead to out-of-dated global state of consumer after finishing __xsk_generic_xmit(). The patch also adds a common helper __xsk_tx_release() to keep align with the zc mode usage in xsk_tx_release(). Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250703141712.33190-2-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge branch 'net-phy-smsc-robustness-fixes-for-lan87xx-lan9500'	Jakub Kicinski
	Oleksij Rempel says: ==================== net: phy: smsc: robustness fixes for LAN87xx/LAN9500 The SMSC 10/100 PHYs (LAN87xx family) found in smsc95xx (lan95xx) USB-Ethernet adapters show several quirks around the Auto-MDIX feature: - A hardware strap (AUTOMDIX_EN) may boot the PHY in fixed-MDI mode, and the current driver cannot always override it. - When Auto-MDIX is left enabled while autonegotiation is forced off, the PHY endlessly swaps the TX/RX pairs and never links up. - The driver sets the enable bit for Auto-MDIX but forgets the override bit, so userspace requests are silently ignored. - Rapid configuration changes can wedge the link if PHY IRQs are enabled. The four patches below make the MDIX state fully predictable and prevent link failures in every tested strap / autoneg / MDI-X permutation. Tested on LAN9512 Eval board. ==================== Link: https://patch.msgid.link/20250703114941.3243890-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: phy: smsc: Fix link failure in forced mode with Auto-MDIX	Oleksij Rempel
	Force a fixed MDI-X mode when auto-negotiation is disabled to prevent link instability. When forcing the link speed and duplex on a LAN9500 PHY (e.g., with `ethtool -s eth0 autoneg off ...`) while leaving MDI-X control in auto mode, the PHY fails to establish a stable link. This occurs because the PHY's Auto-MDIX algorithm is not designed to operate when auto-negotiation is disabled. In this state, the PHY continuously toggles the TX/RX signal pairs, which prevents the link partner from synchronizing. This patch resolves the issue by detecting when auto-negotiation is disabled. If the MDI-X control mode is set to 'auto', the driver now forces a specific, stable mode (ETH_TP_MDI) to prevent the pair toggling. This choice of a fixed MDI mode mirrors the behavior the hardware would exhibit if the AUTOMDIX_EN strap were configured for a fixed MDI connection. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250703114941.3243890-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: phy: smsc: Force predictable MDI-X state on LAN87xx	Oleksij Rempel
	Override the hardware strap configuration for MDI-X mode to ensure a predictable initial state for the driver. The initial mode of the LAN87xx PHY is determined by the AUTOMDIX_EN strap pin, but the driver has no documented way to read its latched status. This unpredictability means the driver cannot know if the PHY has initialized with Auto-MDIX enabled or disabled, preventing it from providing a reliable interface to the user. This patch introduces a `config_init` hook that forces the PHY into a known state by explicitly enabling Auto-MDIX. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250703114941.3243890-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: phy: smsc: Fix Auto-MDIX configuration when disabled by strap	Oleksij Rempel
	Correct the Auto-MDIX configuration to ensure userspace settings are respected when the feature is disabled by the AUTOMDIX_EN hardware strap. The LAN9500 PHY allows its default MDI-X mode to be configured via a hardware strap. If this strap sets the default to "MDI-X off", the driver was previously unable to enable Auto-MDIX from userspace. When handling the ETH_TP_MDI_AUTO case, the driver would set the SPECIAL_CTRL_STS_AMDIX_ENABLE_ bit but neglected to set the required SPECIAL_CTRL_STS_OVRRD_AMDIX_ bit. Without the override flag, the PHY falls back to its hardware strap default, ignoring the software request. This patch corrects the behavior by also setting the override bit when enabling Auto-MDIX. This ensures that the userspace configuration takes precedence over the hardware strap, allowing Auto-MDIX to be enabled correctly in all scenarios. Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Andre Edich <andre.edich@microchip.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250703114941.3243890-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2	EricChan
	According to the Synopsys Controller IP XGMAC-10G Ethernet MAC Databook v3.30a (section 2.7.2), when the INTM bit in the DMA_Mode register is set to 2, the sbd_perch_tx_intr_o[] and sbd_perch_rx_intr_o[] signals operate in level-triggered mode. However, in this configuration, the DMA does not assert the XGMAC_NIS status bit for Rx or Tx interrupt events. This creates a functional regression where the condition if (likely(intr_status & XGMAC_NIS)) in dwxgmac2_dma_interrupt() will never evaluate to true, preventing proper interrupt handling for level-triggered mode. The hardware specification explicitly states that "The DMA does not assert the NIS status bit for the Rx or Tx interrupt events" (Synopsys DWC_XGMAC2 Databook v3.30a, sec. 2.7.2). The fix ensures correct handling of both edge and level-triggered interrupts while maintaining backward compatibility with existing configurations. It has been tested on the hardware device (not publicly available), and it can properly trigger the RX and TX interrupt handling in both the INTM=0 and INTM=2 configurations. Fixes: d6ddfacd95c7 ("net: stmmac: Add DMA related callbacks for XGMAC2") Tested-by: EricChan <chenchuangyu@xiaomi.com> Signed-off-by: EricChan <chenchuangyu@xiaomi.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703020449.105730-1-chenchuangyu@xiaomi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge branch 'af_unix-introduce-so_inq-scm_inq'	Jakub Kicinski
	Kuniyuki Iwashima says: ==================== af_unix: Introduce SO_INQ & SCM_INQ. We have an application that uses almost the same code for TCP and AF_UNIX (SOCK_STREAM). The application uses TCP_INQ for TCP, but AF_UNIX doesn't have it and requires an extra syscall, ioctl(SIOCINQ) or getsockopt(SO_MEMINFO) as an alternative. Also, ioctl(SIOCINQ) for AF_UNIX SOCK_STREAM is more expensive because it needs to iterate all skb in the receive queue. This series adds a cached field for SIOCINQ to speed it up and introduce SO_INQ, the generic version of TCP_INQ to get the queue length as cmsg in each recvmsg(). ==================== Link: https://patch.msgid.link/20250702223606.1054680-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	selftest: af_unix: Add test for SO_INQ.	Kuniyuki Iwashima
	Let's add a simple test to check the basic functionality of SO_INQ. The test does the following: 1. Create socketpair in self->fd[] 2. Enable SO_INQ 3. Send data via self->fd[0] 4. Receive data from self->fd[1] 5. Compare the SCM_INQ cmsg with ioctl(SIOCINQ) Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-8-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Introduce SO_INQ.	Kuniyuki Iwashima
	We have an application that uses almost the same code for TCP and AF_UNIX (SOCK_STREAM). TCP can use TCP_INQ, but AF_UNIX doesn't have it and requires an extra syscall, ioctl(SIOCINQ) or getsockopt(SO_MEMINFO) as an alternative. Let's introduce the generic version of TCP_INQ. If SO_INQ is enabled, recvmsg() will put a cmsg of SCM_INQ that contains the exact value of ioctl(SIOCINQ). The cmsg is also included when msg->msg_get_inq is non-zero to make sockets io_uring-friendly. Note that SOCK_CUSTOM_SOCKOPT is flagged only for SOCK_STREAM to override setsockopt() for SOL_SOCKET. By having the flag in struct unix_sock, instead of struct sock, we can later add SO_INQ support for TCP and reuse tcp_sk(sk)->recvmsg_inq. Note also that supporting custom getsockopt() for SOL_SOCKET will need preparation for other SOCK_CUSTOM_SOCKOPT users (UDP, vsock, MPTCP). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-7-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Cache state->msg in unix_stream_read_generic().	Kuniyuki Iwashima
	In unix_stream_read_generic(), state->msg is fetched multiple times. Let's cache it in a local variable. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702223606.1054680-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Use cached value for SOCK_STREAM in unix_inq_len().	Kuniyuki Iwashima
	Compared to TCP, ioctl(SIOCINQ) for AF_UNIX SOCK_STREAM socket is more expensive, as unix_inq_len() requires iterating through the receive queue and accumulating skb->len. Let's cache the value for SOCK_STREAM to a new field during sendmsg() and recvmsg(). The field is protected by the receive queue lock. Note that ioctl(SIOCINQ) for SOCK_DGRAM returns the length of the first skb in the queue. SOCK_SEQPACKET still requires iterating through the queue because we do not touch functions shared with unix_dgram_ops. But, if really needed, we can support it by switching __skb_try_recv_datagram() to a custom version. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-5-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Don't use skb_recv_datagram() in unix_stream_read_skb().	Kuniyuki Iwashima
	unix_stream_read_skb() calls skb_recv_datagram() with MSG_DONTWAIT, which is mostly equivalent to sock_error(sk) + skb_dequeue(). In the following patch, we will add a new field to cache the number of bytes in the receive queue. Then, we want to avoid introducing atomic ops in the fast path, so we will reuse the receive queue lock. As a preparation for the change, let's not use skb_recv_datagram() in unix_stream_read_skb(). Note that sock_error() is now moved out of the u->iolock mutex as the mutex does not synchronise the peer's close() at all. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702223606.1054680-4-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Don't check SOCK_DEAD in unix_stream_read_skb().	Kuniyuki Iwashima
	unix_stream_read_skb() checks SOCK_DEAD only when the dequeued skb is OOB skb. unix_stream_read_skb() is called for a SOCK_STREAM socket in SOCKMAP when data is sent to it. The function is invoked via sk_psock_verdict_data_ready(), which is set to sk->sk_data_ready(). During sendmsg(), we check if the receiver has SOCK_DEAD, so there is no point in checking it again later in ->read_skb(). Also, unix_read_skb() for SOCK_DGRAM does not have the test either. Let's remove the SOCK_DEAD test in unix_stream_read_skb(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702223606.1054680-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	af_unix: Don't hold unix_state_lock() in __unix_dgram_recvmsg().	Kuniyuki Iwashima
	When __skb_try_recv_datagram() returns NULL in __unix_dgram_recvmsg(), we hold unix_state_lock() unconditionally. This is because SOCK_SEQPACKET sk needs to return EOF in case its peer has been close()d concurrently. This behaviour totally depends on the timing of the peer's close() and reading sk->sk_shutdown, and taking the lock does not play a role. Let's drop the lock from __unix_dgram_recvmsg() and use READ_ONCE(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250702223606.1054680-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge branch 'eth-fbnic-add-firmware-logging-support'	Jakub Kicinski
	Lee Trager says: ==================== eth: fbnic: Add firmware logging support Firmware running on fbnic generates device logs. These logs contain useful information about the device which may or may not be related to the host. Logs are stored in a ring buffer and accessible through DebugFS. ==================== Link: https://patch.msgid.link/20250702192207.697368-1-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Create fw_log file in DebugFS	Lee Trager
	Allow reading the firmware log in DebugFS by accessing the fw_log file. Buffer is read while a spinlock is acquired. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-7-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Enable firmware logging	Lee Trager
	The firmware log buffer is enabled during probe and freed during remove. Early versions of firmware do not support sending logs. Once the mailbox is up driver will enable logging when supported firmware versions are detected. Logging is disabled before the mailbox is freed. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-6-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Add mailbox support for firmware logs	Lee Trager
	By default firmware will not send logs to the host. This must be explicitly enabled by the driver. The mailbox has the concept of a flag which is a u32 used as a boolean. Lack of flag defaults to a value of false. When enabling logging historical logs may be optionally requested. These are log messages generated by the NIC before the driver was loaded. The driver also sends a log version to support changing the logging format in the future. [SEND_LOGS_REQ] = { [SEND_LOGS] /* flag to request log reporting / [SEND_LOGS_HISTORY] / flag to request historical logs / [SEND_LOGS_VERSION] / u32 indicating the log format version / } Logs may be sent to the user either one at a time, or when historical logs are requested in bulk. Firmware may not send more than 14 messages in bulk to prevent flooding the mailbox. [LOG_MSG] = { [LOG_INDEX] / entry 0 - u64 index of log / [LOG_MSEC] / entry 0 - u32 timestamp of log / [LOG_MSG] / entry 0 - char log message up to 256 / [LOG_LENGTH] / u32 of remaining log items in arrays / [LOG_INDEX_ARRAY] = { [LOG_INDEX] / entry 1 - u64 index of log / [LOG_INDEX] / entry 2 - u64 index of log / ... } [LOG_MSEC_ARRAY] = { [LOG_MSEC] / entry 1 - u32 timestamp of log / [LOG_MSEC] / entry 2 - u32 timestamp of log / ... } [LOG_MSG_ARRAY] = { [LOG_MSG] / entry 1 - char log message up to 256 / [LOG_MSG] / entry 2 - char log message up to 256 */ ... } } Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-5-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Create ring buffer for firmware logs	Lee Trager
	When enabled, firmware may send logs messages which are specific to the device and not the host. Create a ring buffer to store these messages which are read by a user through DebugFS. Buffer access is protected by a spinlock. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-4-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Use FIELD_PREP to generate minimum firmware version	Lee Trager
	Create a new macro based on FIELD_PREP to generate easily readable minimum firmware version ints. This macro will prevent the mistake from the previous patch from happening again. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-3-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: fbnic: Fix incorrect minimum firmware version	Lee Trager
	The full minimum version is 0.10.6-0. The six is now correctly defined as patch and shifted appropriately. 0.10.6-0 is a preproduction version of firmware which was released over a year and a half ago. All production devices meet this requirement. Signed-off-by: Lee Trager <lee@trager.us> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250702192207.697368-2-lee@trager.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge branch 'mlx5-next' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-07-08 The following pull-request contains common mlx5 updates for your net-next tree. v2: https://lore.kernel.org/1751574385-24672-1-git-send-email-tariqt@nvidia.com * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Check device memory pointer before usage net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow net/mlx5: Add IFC bits for PCIe Congestion Event object net/mlx5: Small refactor for general object capabilities net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain ==================== Link: https://patch.msgid.link/1752002102-11316-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge tag 'pwm/for-6.16-rc6-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fixes from Uwe Kleine-König: "Two fixes for v6.16-rc6 The first patch fixes an embarrassing bug in the pwm core. I really wonder this wasn't found earlier since it's introduction in v6.11-rc1 as it greatly disturbs driving a PWM via sysfs. The second and last patch fixes a clock balance issue in an error path of the Mediatek PWM driver" * tag 'pwm/for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: mediatek: Ensure to disable clocks in error path pwm: Fix invalid state detection
2025-07-08	Merge tag 'modules-6.16-rc6.fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull modules fixes from Daniel Gomez: "This includes two fixes: one introduced in the current release cycle and another introduced back in v6.4-rc1. Additionally, as Petr and Luis mentioned in previous pull requests, add myself (Daniel Gomez) to the list of modules maintainers. The first was reported by Intel's kernel test robot, and it addresses a crash exposed by Sebastian's commit c50d295c37f2 ("rds: Use nested-BH locking for rds_page_remainder") by allowing relocations for the per-CPU section even if it lacks the SHF_ALLOC flag. Petr and Sebastian went down to the archive history (before Git) and found the commit that broke it at [1] / [2] ("Don't relocate non-allocated regions in modules."). The second fix, reported and fixed by Petr (with additional cleanup), resolves a memory leak by ensuring proper deallocation if module loading fails. We couldn't find a reproducer other than forcing it manually or leveraging eBPF. So, I tested it by enabling error injection in the codetag functions through the error path that produces the leak and made it fail until execmem is unable to allocate more memory" Link: https://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux-fullhistory.git/commit/?id=b3b91325f3c7 [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=1a6100caae [2] * tag 'modules-6.16-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: MAINTAINERS: update Daniel Gomez's role and email address module: Make sure relocations are applied to the per-CPU section module: Avoid unnecessary return value initialization in move_module() module: Fix memory deallocation on error path in move_module()
2025-07-08	rxrpc: Fix over large frame size warning	David Howells
	Under some circumstances, the compiler will emit the following warning for rxrpc_send_response(): net/rxrpc/output.c: In function 'rxrpc_send_response': net/rxrpc/output.c:974:1: warning: the frame size of 1160 bytes is larger than 1024 bytes This occurs because the local variables include a 16-element scatterlist array and a 16-element bio_vec array. It's probably not actually a problem as this function is only called by the rxrpc I/O thread function in a kernel thread and there won't be much on the stack before it. Fix this by overlaying the bio_vec array over the kvec array in the rxrpc_local struct. There is one of these per I/O thread and the kvec array is intended for pointing at bits of a packet to be transmitted, typically a DATA or an ACK packet. As packets for a local endpoint are only transmitted by its specific I/O thread, there can be no race, and so overlaying this bit of memory should be no problem. Fixes: 5800b1cf3fd8 ("rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506240423.E942yKJP-lkp@intel.com/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250707102435.2381045-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	Merge tag 'bitmap-for-6.16-rc6' of https://github.com/norov/linux	Linus Torvalds
	Pull bitops UAPI fix from Yury Norov: "Fix BITS_PER_LONG merge error Tomas' fix for __BITS_PER_LONG was effectively reverted by a wrong merge. Fix it and add the related files to MAINTAINERS" * tag 'bitmap-for-6.16-rc6' of https://github.com/norov/linux: MAINTAINERS: bitmap: add UAPI headers uapi: bitops: use UAPI-safe variant of BITS_PER_LONG again (2)
2025-07-08	Merge branch 'net-migrate-remaining-drivers-to-dedicated-_rxfh_context-ops'	Jakub Kicinski
	Jakub Kicinski says: ==================== net: migrate remaining drivers to dedicated _rxfh_context ops Around a year ago Ed added dedicated ops for managing RSS contexts. This significantly improved the clarity of the driver facing API. Migrate the remaining 3 drivers and remove the old way of muxing the RSS context operations via .set_rxfh(). v2: https://lore.kernel.org/20250702030606.1776293-1-kuba@kernel.org v1: https://lore.kernel.org/20250630160953.1093267-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250707184115.2285277-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: ethtool: reduce indent for _rxfh_context ops	Jakub Kicinski
	Now that we don't have the compat code we can reduce the indent a little. No functional changes. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250707184115.2285277-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	net: ethtool: remove the compat code for _rxfh_context ops	Jakub Kicinski
	All drivers are now converted to dedicated _rxfh_context ops. Remove the use of >set_rxfh() to manage additional contexts. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250707184115.2285277-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: mlx5: migrate to the *_rxfh_context ops	Jakub Kicinski
	Convert mlx5 to dedicated RXFH ops. This is a fairly shallow conversion, TBH, most of the driver code stays as is, but we let the core allocate the context ID for the driver. mlx5e_rx_res_rss_get_rxfh() and friends are made void, since core only calls the driver for context 0. The second call is right after context creation so it must exist (tm). Tested with drivers/net/hw/rss_ctx.py on MCX6. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250707184115.2285277-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: ice: drop the dead code related to rss_contexts	Jakub Kicinski
	ICE appears to have some odd form of rss_context use plumbed in for .get_rxfh. The .set_rxfh side does not support creating contexts, however, so this must be dead code. For at least a year now (since commit 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")) we have not been calling .get_rxfh with a non-zero rss_context. We just get the info from the RSS XArray under dev->ethtool. Remove what must be dead code in the driver, clear the support flags. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	eth: otx2: migrate to the *_rxfh_context ops	Jakub Kicinski
	otx2 only supports additional indirection tables (no separate keys etc.) so the conversion to dedicated callbacks and core-allocated context is mostly removing the code which stores the extra tables in the driver. Core already stores the indirection tables for additional contexts, and doesn't call .get for them. One subtle change here is that we'll now start with the table covering all queues, not directing all traffic to queue 0. This is what core expects if the user doesn't pass the initial indir table explicitly (there's a WARN_ON() in the core trying to make sure driver authors don't forget to populate ctx to defaults). Drivers implementing .create_rxfh_context don't have to set cap_rss_ctx_supported, so remove it. Tested-by: Geetha Sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	MAINTAINERS: update Daniel Gomez's role and email address	Daniel Gomez
	Update Daniel Gomez's modules reviewer role to maintainer. This is according to the plan [1][2][3] of scaling with more reviewers for modules (for the incoming Rust support [4]) and rotate [5] every 6 months. Acked-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/linux-modules/ZsPANzx4-5DrOl5m@bombadil.infradead.org [1] Link: https://lore.kernel.org/linux-modules/20240821174021.2371547-1-mcgrof@kernel.org [2] Link: https://lore.kernel.org/linux-modules/458901be-1da8-4987-9c72-5aa3da6db15e@suse.com [3] Link: https://lore.kernel.org/linux-modules/20250702-module-params-v3-v14-0-5b1cc32311af@kernel.org [4] Link: https://lore.kernel.org/linux-modules/Z3gDAnPlA3SZEbgl@bombadil.infradead.org [5] Acked-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
2025-07-08	module: Make sure relocations are applied to the per-CPU section	Sebastian Andrzej Siewior
	The per-CPU data section is handled differently than the other sections. The memory allocations requires a special __percpu pointer and then the section is copied into the view of each CPU. Therefore the SHF_ALLOC flag is removed to ensure move_module() skips it. Later, relocations are applied and apply_relocations() skips sections without SHF_ALLOC because they have not been copied. This also skips the per-CPU data section. The missing relocations result in a NULL pointer on x86-64 and very small values on x86-32. This results in a crash because it is not skipped like NULL pointer would and can't be dereferenced. Such an assignment happens during static per-CPU lock initialisation with lockdep enabled. Allow relocation processing for the per-CPU section even if SHF_ALLOC is missing. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202506041623.e45e4f7d-lkp@intel.com Fixes: 1a6100caae425 ("Don't relocate non-allocated regions in modules.") #v2.6.1-rc3 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20250610163328.URcsSUC1@linutronix.de Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250610163328.URcsSUC1@linutronix.de>
2025-07-08	module: Avoid unnecessary return value initialization in move_module()	Petr Pavlu
	All error conditions in move_module() set the return value by updating the ret variable. Therefore, it is not necessary to the initialize the variable when declaring it. Remove the unnecessary initialization. Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250618122730.51324-3-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250618122730.51324-3-petr.pavlu@suse.com>
2025-07-08	module: Fix memory deallocation on error path in move_module()	Petr Pavlu
	The function move_module() uses the variable t to track how many memory types it has allocated and consequently how many should be freed if an error occurs. The variable is initially set to 0 and is updated when a call to module_memory_alloc() fails. However, move_module() can fail for other reasons as well, in which case t remains set to 0 and no memory is freed. Fix the problem by initializing t to MOD_MEM_NUM_TYPES. Additionally, make the deallocation loop more robust by not relying on the mod_mem_type_t enum having a signed integer as its underlying type. Fixes: c7ee8aebf6c0 ("module: add stop-grap sanity check on module memcpy()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250618122730.51324-2-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250618122730.51324-2-petr.pavlu@suse.com>
2025-07-08	Merge tag 'libcrypto-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fix from Eric Biggers: "Fix an uninitialized variable in the s390 optimized SHA-1 and SHA-2. Note that my librarification changes also fix this by greatly simplifying how the s390 optimized SHA code is integrated. However, we need this separate fix for 6.16 and older versions" * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: crypto: s390/sha - Fix uninitialized variable in SHA-1 and SHA-2
2025-07-08	vhost/net: enable gso over UDP tunnel support.	Paolo Abeni
	Vhost net need to know the exact virtio net hdr size to be able to copy such header correctly. Teach it about the newly defined UDP tunnel-related option and update the hdr size computation accordingly. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	tun: enable gso over UDP tunnel support.	Paolo Abeni
	Add new tun features to represent the newly introduced virtio GSO over UDP tunnel offload. Allows detection and selection of such features via the existing TUNSETOFFLOAD ioctl and compute the expected virtio header size and tunnel header offset using the current netdev features, so that we can plug almost seamless the newly introduced virtio helpers to serialize the extended virtio header. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- v6 -> v7: - rebased v4 -> v5: - encapsulate the guest feature guessing in a tun helper - dropped irrelevant check on xdp buff headroom - do not remove unrelated black line - avoid line len > 80 char v3 -> v4: - virtio tnl-related fields are at fixed offset, cleanup the code accordingly. - use netdev features instead of flags bit to check for the configured offload - drop packet in case of enabled features/configured hdr size mismatch v2 -> v3: - cleaned-up uAPI comments - use explicit struct layout instead of raw buf.
2025-07-08	virtio_net: enable gso over UDP tunnel support.	Paolo Abeni
	If the related virtio feature is set, enable transmission and reception of gso over UDP tunnel packets. Most of the work is done by the previously introduced helper, just need to determine the UDP tunnel features inside the virtio_net_hdr and update accordingly the virtio net hdr size. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	net: implement virtio helpers to handle UDP GSO tunneling.	Paolo Abeni
	The virtio specification are introducing support for GSO over UDP tunnel. This patch brings in the needed defines and the additional virtio hdr parsing/building helpers. The UDP tunnel support uses additional fields in the virtio hdr, and such fields location can change depending on other negotiated features - specifically VIRTIO_NET_F_HASH_REPORT. Try to be as conservative as possible with the new field validation. Existing implementation for plain GSO offloads allow for invalid/ self-contradictory values of such fields. With GSO over UDP tunnel we can be more strict, with no need to deal with legacy implementation. Since the checksum-related field validation is asymmetric in the driver and in the device, introduce a separate helper to implement the new checks (to be used only on the driver side). Note that while the feature space exceeds the 64-bit boundaries, the guest offload space is fixed by the specification of the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command to a 64-bit size. Prior to the UDP tunnel GSO support, each guest offload bit corresponded to the feature bit with the same value and vice versa. Due to the limited 'guest offload' space, relevant features in the high 64 bits are 'mapped' to free bits in the lower range. That is simpler than defining a new command (and associated features) to exchange an extended guest offloads set. As a consequence, the uAPIs also specify the mapped guest offload value corresponding to the UDP tunnel GSO features. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> -- v4 -> v5: - avoid lines above 80 chars v3 -> v4: - fixed offset for UDP GSO tunnel, update accordingly the helpers - tried to clarified vlan_hlen semantic - virtio_net_chk_data_valid() -> virtio_net_handle_csum_offload() v2 -> v3: - add definitions for possible vnet hdr layouts with tunnel support v1 -> v2: - 'relay' -> 'rely' typo - less unclear comment WRT enforced inner GSO checks - inner header fields are allowed only with 'modern' virtio, thus are always le - clarified in the commit message the need for 'mapped features' defines - assume little_endian is true when UDP GSO is enabled. - fix inner proto type value
2025-07-08	virtio_net: add supports for extended offloads	Paolo Abeni
	The virtio_net driver needs it to implement GSO over UDP tunnel offload. The only missing piece is mapping them to/from the extended features. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	vhost-net: allow configuring extended features	Paolo Abeni
	Use the extended feature type for 'acked_features' and implement two new ioctls operation allowing the user-space to set/query an unbounded amount of features. The actual number of processed features is limited by VIRTIO_FEATURES_MAX and attempts to set features above such limit fail with EOPNOTSUPP. Note that: the legacy ioctls implicitly truncate the negotiated features to the lower 64 bits range and the 'acked_backend_features' field don't need conversion, as the only negotiated feature there is in the low 64 bit range. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	virtio_pci_modern: allow configuring extended features	Paolo Abeni
	The virtio specifications allows for up to 128 bits for the device features. Soon we are going to use some of the 'extended' bits features (above 64) for the virtio_net driver. Extend the virtio pci modern driver to support configuring the full virtio features range, replacing the unrolled loops reading and writing the features space with explicit one bounded to the actual features space size in word and implementing the get_extended_features callback. Note that in vp_finalize_features() we only need to cache the lower 64 features bits, to process the transport features. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	virtio: introduce extended features	Paolo Abeni
	The virtio specifications allows for up to 128 bits for the device features. Soon we are going to use some of the 'extended' bits features (above 64) for the virtio_net driver. Introduce extended features as a fixed size array of u64. To minimize the diffstat allows legacy driver to access the low 64 bits via a transparent union. Introduce an extended get_extended_features configuration callback that devices supporting the extended features range must implement in place of the traditional one. Note that legacy and transport features don't need any change, as they are always in the low 64 bit range. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-08	scripts/kernel_doc.py: properly handle VIRTIO_DECLARE_FEATURES	Paolo Abeni
	The mentioned macro introduce by the next patch will foul kdoc; fully expand the mentioned macro to avoid the issue. Signed-off-by: Paolo Abeni <pabeni@redhat.com>