Age | Commit message (Collapse) | Author |
|
While testing prior patch, I was able to trigger
an infinite loop in rt6_nlmsg_size() in the following place:
list_for_each_entry_rcu(sibling, &f6i->fib6_siblings,
fib6_siblings) {
rt6_nh_nlmsg_size(sibling->fib6_nh, &nexthop_len);
}
This is because fib6_del_route() and fib6_add_rt2node()
uses list_del_rcu(), which can confuse rcu readers,
because they might no longer see the head of the list.
Restart the loop if f6i->fib6_nsiblings is zero.
Fixes: d9ccb18f83ea ("ipv6: Fix soft lockups in fib6_select_path under high next hop churn")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250725140725.3626540-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_rt_notify() can be called under RCU protection only.
This means the route could be changed concurrently
and rt6_fill_node() could return -EMSGSIZE.
Re-size the skb when this happens and retry, removing
one WARN_ON() that syzbot was able to trigger:
WARNING: CPU: 3 PID: 6291 at net/ipv6/route.c:6342 inet6_rt_notify+0x475/0x4b0 net/ipv6/route.c:6342
Modules linked in:
CPU: 3 UID: 0 PID: 6291 Comm: syz.0.77 Not tainted 6.16.0-rc7-syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:inet6_rt_notify+0x475/0x4b0 net/ipv6/route.c:6342
Code: fc ff ff e8 6d 52 ea f7 e9 47 fc ff ff 48 8b 7c 24 08 4c 89 04 24 e8 5a 52 ea f7 4c 8b 04 24 e9 94 fd ff ff e8 9c fe 84 f7 90 <0f> 0b 90 e9 bd fd ff ff e8 6e 52 ea f7 e9 bb fb ff ff 48 89 df e8
RSP: 0018:ffffc900035cf1d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc900035cf540 RCX: ffffffff8a36e790
RDX: ffff88802f7e8000 RSI: ffffffff8a36e9d4 RDI: 0000000000000005
RBP: ffff88803c230f00 R08: 0000000000000005 R09: 00000000ffffffa6
R10: 00000000ffffffa6 R11: 0000000000000001 R12: 00000000ffffffa6
R13: 0000000000000900 R14: ffff888032ea4100 R15: 0000000000000000
FS: 00007fac7b89a6c0(0000) GS:ffff8880d6a20000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fac7b899f98 CR3: 0000000034b3f000 CR4: 0000000000352ef0
Call Trace:
<TASK>
ip6_route_mpath_notify+0xde/0x280 net/ipv6/route.c:5356
ip6_route_multipath_add+0x1181/0x1bd0 net/ipv6/route.c:5536
inet6_rtm_newroute+0xe4/0x1a0 net/ipv6/route.c:5647
rtnetlink_rcv_msg+0x95e/0xe90 net/core/rtnetlink.c:6944
netlink_rcv_skb+0x155/0x420 net/netlink/af_netlink.c:2552
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x58d/0x850 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:712 [inline]
__sock_sendmsg net/socket.c:727 [inline]
____sys_sendmsg+0xa95/0xc70 net/socket.c:2566
___sys_sendmsg+0x134/0x1d0 net/socket.c:2620
Fixes: 169fd62799e8 ("ipv6: Get rid of RTNL for SIOCADDRT and RTM_NEWROUTE.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://patch.msgid.link/20250725140725.3626540-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit ff3fbcdd4724 ("selftests: tc: Add generic erspan_opts matching support
for tc-flower") started triggering the following kmemleak warning:
unreferenced object 0xffff888015fb0e00 (size 512):
comm "softirq", pid 0, jiffies 4294679065
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 40 d2 85 9e ff ff ff ff ........@.......
41 69 59 9d ff ff ff ff 00 00 00 00 00 00 00 00 AiY.............
backtrace (crc 30b71e8b):
__kmalloc_noprof+0x359/0x460
metadata_dst_alloc+0x28/0x490
erspan_rcv+0x4f1/0x1160 [ip_gre]
gre_rcv+0x217/0x240 [ip_gre]
gre_rcv+0x1b8/0x400 [gre]
ip_protocol_deliver_rcu+0x31d/0x3a0
ip_local_deliver_finish+0x37d/0x620
ip_local_deliver+0x174/0x460
ip_rcv+0x52b/0x6b0
__netif_receive_skb_one_core+0x149/0x1a0
process_backlog+0x3c8/0x1390
__napi_poll.constprop.0+0xa1/0x390
net_rx_action+0x59b/0xe00
handle_softirqs+0x22b/0x630
do_softirq+0xb1/0xf0
__local_bh_enable_ip+0x115/0x150
vrf_ip6_input_dst unconditionally sets skb dst entry, add a call to
skb_dst_drop to drop any existing entry.
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Fixes: 9ff74384600a ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250725160043.350725-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jason Xing says:
====================
xsk: fix negative overflow issues in zerocopy xmit
Fix two negative overflow issues around {stmmac_xdp|igb}_xmit_zc().
====================
Link: https://patch.msgid.link/20250723142327.85187-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is no break time in the while() loop, so every time at the end of
igb_xmit_zc(), negative overflow of nb_pkts will occur, which renders
the return value always false. But theoretically, the result should be
set after calling xsk_tx_peek_release_desc_batch(). We can take
i40e_xmit_zc() as a good example.
Returning false means we're not done with transmission and we need one
more poll, which is exactly what igb_xmit_zc() always did before this
patch. After this patch, the return value depends on the nb_pkts value.
Two cases might happen then:
1. if (nb_pkts < budget), it means we process all the possible data, so
return true and no more necessary poll will be triggered because of
this.
2. if (nb_pkts == budget), it means we might have more data, so return
false to let another poll run again.
Fixes: f8e284a02afc ("igb: Add AF_XDP zero-copy Tx support")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20250723142327.85187-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A negative overflow can happen when the budget number of descs are
consumed. as long as the budget is decreased to zero, it will again go
into while (budget-- > 0) statement and get decreased by one, so the
overflow issue can happen. It will lead to returning true whereas the
expected value should be false.
In this case where all the budget is used up, it means zc function
should return false to let the poll run again because normally we
might have more data to process. Without this patch, zc function would
return true instead.
Fixes: 132c32ee5bc0 ("net: stmmac: Add TX via XDP zero-copy socket")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20250723142327.85187-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-07-25
The patch is by Stephane Grosjean and adds support the recent firmware
of USB CAN FD interfaces to the peak_usb driver.
* tag 'linux-can-fixes-for-6.16-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: peak_usb: fix USB FD devices potential malfunction
====================
Link: https://patch.msgid.link/20250725101619.4095105-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 21b688dabecb ("net: phy: micrel: Cable Diag feature for lan8814
phy") introduced cable_test support for the LAN8814 that reuses parts of
the KSZ886x logic and introduced the cable_diag_reg and pair_mask
parameters to account for differences between those chips.
However, it did not update the ksz8081_type struct, so those members are
now 0, causing no pairs to be tested in ksz886x_cable_test_get_status
and ksz886x_cable_test_wait_for_completion to poll the wrong register
for the affected PHYs (Basic Control/Reset, which is 0 in normal
operation) and exit immediately.
Fix this by setting both struct members accordingly.
Fixes: 21b688dabecb ("net: phy: micrel: Cable Diag feature for lan8814 phy")
Cc: stable@vger.kernel.org
Signed-off-by: Florian Larysch <fl@n621.de>
Link: https://patch.msgid.link/20250723222250.13960-1-fl@n621.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
kernel test robot reported null-ptr-deref in neigh_flush_dev(). [0]
The cited commit introduced per-netdev neighbour list and converted
neigh_flush_dev() to use it instead of the global hash table.
One thing we missed is that neigh_table_clear() calls neigh_ifdown()
with NULL dev.
Let's restore the hash table iteration.
Note that IPv6 module is no longer unloadable, so neigh_table_clear()
is called only when IPv6 fails to initialise, which is unlikely to
happen.
[0]:
IPv6: Attempt to unregister permanent protocol 136
IPv6: Attempt to unregister permanent protocol 17
Oops: general protection fault, probably for non-canonical address 0xdffffc00000001a0: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000d00-0x0000000000000d07]
CPU: 1 UID: 0 PID: 1 Comm: systemd Tainted: G T 6.12.0-rc6-01246-gf7f52738637f #1
Tainted: [T]=RANDSTRUCT
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:neigh_flush_dev.llvm.6395807810224103582+0x52/0x570
Code: c1 e8 03 42 8a 04 38 84 c0 0f 85 15 05 00 00 31 c0 41 83 3e 0a 0f 94 c0 48 8d 1c c3 48 81 c3 f8 0c 00 00 48 89 d8 48 c1 e8 03 <42> 80 3c 38 00 74 08 48 89 df e8 f7 49 93 fe 4c 8b 3b 4d 85 ff 0f
RSP: 0000:ffff88810026f408 EFLAGS: 00010206
RAX: 00000000000001a0 RBX: 0000000000000d00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffc0631640
RBP: ffff88810026f470 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffffc0625250 R14: ffffffffc0631640 R15: dffffc0000000000
FS: 00007f575cb83940(0000) GS:ffff8883aee00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f575db40008 CR3: 00000002bf936000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__neigh_ifdown.llvm.6395807810224103582+0x44/0x390
neigh_table_clear+0xb1/0x268
ndisc_cleanup+0x21/0x38 [ipv6]
init_module+0x2f5/0x468 [ipv6]
do_one_initcall+0x1ba/0x628
do_init_module+0x21a/0x530
load_module+0x2550/0x2ea0
__se_sys_finit_module+0x3d2/0x620
__x64_sys_finit_module+0x76/0x88
x64_sys_call+0x7ff/0xde8
do_syscall_64+0xfb/0x1e8
entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f575d6f2719
Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 06 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007fff82a2a268 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000557827b45310 RCX: 00007f575d6f2719
RDX: 0000000000000000 RSI: 00007f575d584efd RDI: 0000000000000004
RBP: 00007f575d584efd R08: 0000000000000000 R09: 0000557827b47b00
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000020000
R13: 0000000000000000 R14: 0000557827b470e0 R15: 00007f575dbb4270
</TASK>
Modules linked in: ipv6(+)
Fixes: f7f52738637f4 ("neighbour: Create netdev->neighbour association")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507200931.7a89ecd8-lkp@intel.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250723195443.448163-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When KSZ8863 support was first added to KSZ driver the RX drop MIB
counter was somehow defined as 0x105. The TX drop MIB counter
starts at 0x100 for port 1, 0x101 for port 2, and 0x102 for port 3, so
the RX drop MIB counter should start at 0x103 for port 1, 0x104 for
port 2, and 0x105 for port 3.
There are 5 ports for KSZ8895, so its RX drop MIB counter starts at
0x105.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250723030403.56878-1-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Gemalto Cinterion PLS83-W modem (cdc_ether) is emitting confusing link
up and down events when the WWAN interface is activated on the modem-side.
Interrupt URBs will in consecutive polls grab:
* Link Connected
* Link Disconnected
* Link Connected
Where the last Connected is then a stable link state.
When the system is under load this may cause the unlink_urbs() work in
__handle_link_change() to not complete before the next usbnet_link_change()
call turns the carrier on again, allowing rx_submit() to queue new SKBs.
In that event the URB queue is filled faster than it can drain, ending up
in a RCU stall:
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 33108 jiffies s: 201 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
Call trace:
arch_local_irq_enable+0x4/0x8
local_bh_enable+0x18/0x20
__netdev_alloc_skb+0x18c/0x1cc
rx_submit+0x68/0x1f8 [usbnet]
rx_alloc_submit+0x4c/0x74 [usbnet]
usbnet_bh+0x1d8/0x218 [usbnet]
usbnet_bh_tasklet+0x10/0x18 [usbnet]
tasklet_action_common+0xa8/0x110
tasklet_action+0x2c/0x34
handle_softirqs+0x2cc/0x3a0
__do_softirq+0x10/0x18
____do_softirq+0xc/0x14
call_on_irq_stack+0x24/0x34
do_softirq_own_stack+0x18/0x20
__irq_exit_rcu+0xa8/0xb8
irq_exit_rcu+0xc/0x30
el1_interrupt+0x34/0x48
el1h_64_irq_handler+0x14/0x1c
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x38/0x48
xhci_urb_dequeue+0x1ac/0x45c [xhci_hcd]
unlink1+0xd4/0xdc [usbcore]
usb_hcd_unlink_urb+0x70/0xb0 [usbcore]
usb_unlink_urb+0x24/0x44 [usbcore]
unlink_urbs.constprop.0.isra.0+0x64/0xa8 [usbnet]
__handle_link_change+0x34/0x70 [usbnet]
usbnet_deferred_kevent+0x1c0/0x320 [usbnet]
process_scheduled_works+0x2d0/0x48c
worker_thread+0x150/0x1dc
kthread+0xd8/0xe8
ret_from_fork+0x10/0x20
Get around the problem by delaying the carrier on to the scheduled work.
This needs a new flag to keep track of the necessary action.
The carrier ok check cannot be removed as it remains required for the
LINK_RESET event flow.
Fixes: 4b49f58fff00 ("usbnet: handle link change")
Cc: stable@vger.kernel.org
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Link: https://patch.msgid.link/20250723102526.1305339-1-john.ernberg@actia.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
mlx5e misc fixes 2025-07-23
This small patchset provides misc bug fixes from the team to the mlx5e
driver.
====================
Link: https://patch.msgid.link/1753256672-337784-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlx5e_reporter_rx_timeout() is currently invoked synchronously
in the driver's open error flow. This causes the thread holding
priv->state_lock to attempt acquiring the devlink lock, which
can result in a circular dependency with other devlink operations.
For example:
- Devlink health diagnose flow:
- __devlink_nl_pre_doit() acquires the devlink lock.
- devlink_nl_health_reporter_diagnose_doit() invokes the
driver's diagnose callback.
- mlx5e_rx_reporter_diagnose() then attempts to acquire
priv->state_lock.
- Driver open flow:
- mlx5e_open() acquires priv->state_lock.
- If an error occurs, devlink_health_reporter may be called,
attempting to acquire the devlink lock.
To prevent this circular locking scenario, defer the RX timeout
recovery by scheduling it via a workqueue. This ensures that the
recovery work acquires locks in a consistent order: first the
devlink lock, then priv->state_lock.
Additionally, make the recovery work acquire the netdev instance
lock to safely synchronize with the open/close channel flows,
similar to mlx5e_tx_timeout_work. Repeatedly attempt to acquire
the netdev instance lock until it is taken or the target RQ is no
longer active, as indicated by the MLX5E_STATE_CHANNELS_ACTIVE bit.
Fixes: 32c57fb26863 ("net/mlx5e: Report and recover from rx timeout")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hardware returns a unique identifier for a decrypted packet's xfrm
state, this state is looked up in an xarray. However, the state might
have been freed by the time of this lookup.
Currently, if the state is not found, only a counter is incremented.
The secpath (sp) extension on the skb is not removed, resulting in
sp->len becoming 0.
Subsequently, functions like __xfrm_policy_check() attempt to access
fields such as xfrm_input_state(skb)->xso.type (which dereferences
sp->xvec[sp->len - 1]) without first validating sp->len. This leads to
a crash when dereferencing an invalid state pointer.
This patch prevents the crash by explicitly removing the secpath
extension from the skb if the xfrm state is not found after hardware
decryption. This ensures downstream functions do not operate on a
zero-length secpath.
BUG: unable to handle page fault for address: ffffffff000002c8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 282e067 P4D 282e067 PUD 0
Oops: Oops: 0000 [#1] SMP
CPU: 12 UID: 0 PID: 0 Comm: swapper/12 Not tainted 6.15.0-rc7_for_upstream_min_debug_2025_05_27_22_44 #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:__xfrm_policy_check+0x61a/0xa30
Code: b6 77 7f 83 e6 02 74 14 4d 8b af d8 00 00 00 41 0f b6 45 05 c1 e0 03 48 98 49 01 c5 41 8b 45 00 83 e8 01 48 98 49 8b 44 c5 10 <0f> b6 80 c8 02 00 00 83 e0 0c 3c 04 0f 84 0c 02 00 00 31 ff 80 fa
RSP: 0018:ffff88885fb04918 EFLAGS: 00010297
RAX: ffffffff00000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000000
RBP: ffffffff8311af80 R08: 0000000000000020 R09: 00000000c2eda353
R10: ffff88812be2bbc8 R11: 000000001faab533 R12: ffff88885fb049c8
R13: ffff88812be2bbc8 R14: 0000000000000000 R15: ffff88811896ae00
FS: 0000000000000000(0000) GS:ffff8888dca82000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff000002c8 CR3: 0000000243050002 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
? try_to_wake_up+0x108/0x4c0
? udp4_lib_lookup2+0xbe/0x150
? udp_lib_lport_inuse+0x100/0x100
? __udp4_lib_lookup+0x2b0/0x410
__xfrm_policy_check2.constprop.0+0x11e/0x130
udp_queue_rcv_one_skb+0x1d/0x530
udp_unicast_rcv_skb+0x76/0x90
__udp4_lib_rcv+0xa64/0xe90
ip_protocol_deliver_rcu+0x20/0x130
ip_local_deliver_finish+0x75/0xa0
ip_local_deliver+0xc1/0xd0
? ip_protocol_deliver_rcu+0x130/0x130
ip_sublist_rcv+0x1f9/0x240
? ip_rcv_finish_core+0x430/0x430
ip_list_rcv+0xfc/0x130
__netif_receive_skb_list_core+0x181/0x1e0
netif_receive_skb_list_internal+0x200/0x360
? mlx5e_build_rx_skb+0x1bc/0xda0 [mlx5_core]
gro_receive_skb+0xfd/0x210
mlx5e_handle_rx_cqe_mpwrq+0x141/0x280 [mlx5_core]
mlx5e_poll_rx_cq+0xcc/0x8e0 [mlx5_core]
? mlx5e_handle_rx_dim+0x91/0xd0 [mlx5_core]
mlx5e_napi_poll+0x114/0xab0 [mlx5_core]
__napi_poll+0x25/0x170
net_rx_action+0x32d/0x3a0
? mlx5_eq_comp_int+0x8d/0x280 [mlx5_core]
? notifier_call_chain+0x33/0xa0
handle_softirqs+0xda/0x250
irq_exit_rcu+0x6d/0xc0
common_interrupt+0x81/0xa0
</IRQ>
Fixes: b2ac7541e377 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When updating the PBMC register, we read its current value,
modify desired fields, then write it back.
The port_buffer_size field within PBMC is Read-Only (RO).
If this RO field contains a non-zero value when read,
attempting to write it back will cause the entire PBMC
register update to fail.
This commit ensures port_buffer_size is explicitly cleared
to zero after reading the PBMC register but before writing
back the modified value.
This allows updates to other fields in the PBMC register to succeed.
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Yael Chemla <ychemla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The latest firmware versions of USB CAN FD interfaces export the EP numbers
to be used to dialog with the device via the "type" field of a response to
a vendor request structure, particularly when its value is greater than or
equal to 2.
Correct the driver's test of this field.
Fixes: 4f232482467a ("can: peak_usb: include support for a new MCU")
Signed-off-by: Stephane Grosjean <stephane.grosjean@hms-networks.com>
Link: https://patch.msgid.link/20250724081550.11694-1-stephane.grosjean@free.fr
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
[mkl: rephrase commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Daniel Zahka says:
====================
selftests: drv-net: tso: fix issues with tso selftest
There are a couple issues with the tso selftest.
- Features required for test cases are detected by searching the set
of active features at test start, so if a feature is supported by
hw, but disabled, the test will report that the feature under test
is not available and fail.
- The vxlan test cases do not use the correct ip link flags based on
the gso feature under test
- The non-tunneled tso6 test case is showing up with the wrong name.
With all patches applied test output is:
# Detected qstat for LSO wire-packets
TAP version 13
1..14
ok 1 tso.ipv4
# Testing with mangleid enabled
ok 2 tso.vxlan4_ipv4
ok 3 tso.vxlan4_ipv6
# Testing with mangleid enabled
ok 4 tso.vxlan_csum4_ipv4
ok 5 tso.vxlan_csum4_ipv6
# Testing with mangleid enabled
ok 6 tso.gre4_ipv4
ok 7 tso.gre4_ipv6
ok 8 tso.ipv6
# Testing with mangleid enabled
ok 9 tso.vxlan6_ipv4
ok 10 tso.vxlan6_ipv6
# Testing with mangleid enabled
ok 11 tso.vxlan_csum6_ipv4
ok 12 tso.vxlan_csum6_ipv6
# Testing with mangleid enabled
ok 13 tso.gre6_ipv4
ok 14 tso.gre6_ipv6
# Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0
====================
Link: https://patch.msgid.link/20250723184740.4075410-1-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The non-tunneled tso6 test case was showing up as:
ok 8 tso.ipv4
This is because of the way test_builder() uses the inner_ipver arg in
test naming, and how test_info is iterated over in main(). Given that
some tunnels not supported yet, e.g. ipip or sit, only support ipv4 or
ipv6 as the inner network protocol, I think the best fix here is to
call test_builder() in separate branches for tunneled and non-tunneled
tests, and to make supported inner l3 types an explicit attribute of
tunnel test cases.
# Detected qstat for LSO wire-packets
TAP version 13
1..14
ok 1 tso.ipv4
# Testing with mangleid enabled
ok 2 tso.vxlan4_ipv4
ok 3 tso.vxlan4_ipv6
# Testing with mangleid enabled
ok 4 tso.vxlan_csum4_ipv4
ok 5 tso.vxlan_csum4_ipv6
# Testing with mangleid enabled
ok 6 tso.gre4_ipv4
ok 7 tso.gre4_ipv6
ok 8 tso.ipv6
# Testing with mangleid enabled
ok 9 tso.vxlan6_ipv4
ok 10 tso.vxlan6_ipv6
# Testing with mangleid enabled
ok 11 tso.vxlan_csum6_ipv4
ok 12 tso.vxlan_csum6_ipv6
# Testing with mangleid enabled
ok 13 tso.gre6_ipv4
ok 14 tso.gre6_ipv6
# Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0
Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-4-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When vxlan is used with ipv6 as the outer network header, the correct
ip link parameters for acheiving the SKB_GSO_UDP_TUNNEL gso type is
"udp6zerocsumtx udp6zerocsumrx". Otherwise the gso type will be
SKB_GSO_UDP_TUNNEL_CSUM.
This bug was the reason for the second of the three possible
invocations of run_one_stream() invocations, so that can be deleted as
well. We only need to test with the feature off and on.
Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-3-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tso.py uses the active features at the time of test execution
as the set of available gso features to test. This means if a gso
feature is supported but toggled off at test start, the test will be
skipped with a "Device does not support {feature}" message.
Instead, we can enumerate the set of toggleable features by capturing
the driver's hw_features bitmap. To avoid configuration side-effects
from running the test, we also snapshot the wanted_features flag set
before making any feature changes, and then attempt to restore the
same set of wanted_features before test exit.
Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-2-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can and xfrm.
The TI regression notified last week is actually on our net-next tree,
it does not affect 6.16.
We are investigating a virtio regression which is quite hard to
reproduce - currently only our CI sporadically hits it. Hopefully it
should not be critical, and I'm not sure that an additional week would
be enough to solve it.
Current release - fix to a fix:
- sched: sch_qfq: avoid sleeping in atomic context in qfq_delete_class
Previous releases - regressions:
- xfrm:
- set transport header to fix UDP GRO handling
- delete x->tunnel as we delete x
- eth:
- mlx5: fix memory leak in cmd_exec()
- i40e: when removing VF MAC filters, avoid losing PF-set MAC
- gve: fix stuck TX queue for DQ queue format
Previous releases - always broken:
- can: fix NULL pointer deref of struct can_priv::do_set_mode
- eth:
- ice: fix a null pointer dereference in ice_copy_and_init_pkg()
- ism: fix concurrency management in ism_cmd()
- dpaa2: fix device reference count leak in MAC endpoint handling
- icssg-prueth: fix buffer allocation for ICSSG
Misc:
- selftests: mptcp: increase code coverage"
* tag 'net-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
net: hns3: default enable tx bounce buffer when smmu enabled
net: hns3: fixed vf get max channels bug
net: hns3: disable interrupt when ptp init failed
net: hns3: fix concurrent setting vlan filter issue
s390/ism: fix concurrency management in ism_cmd()
selftests: drv-net: wait for iperf client to stop sending
MAINTAINERS: Add in6.h to MAINTAINERS
selftests: netfilter: tone-down conntrack clash test
can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class
gve: Fix stuck TX queue for DQ queue format
net: appletalk: Fix use-after-free in AARP proxy probe
net: bcmasp: Restore programming of TX map vector register
selftests: mptcp: connect: also cover checksum
selftests: mptcp: connect: also cover alt modes
e1000e: ignore uninitialized checksum word on tgp
e1000e: disregard NVM checksum on tgp when valid checksum bit is not set
ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
i40e: When removing VF MAC filters, only check PF-set MAC
i40e: report VF tx_dropped with tx_errors instead of tx_discards
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2025-07-23
1) Premption fixes for xfrm_state_find.
From Sabrina Dubroca.
2) Initialize offload path also for SW IPsec GRO. This fixes a
performance regression on SW IPsec offload.
From Leon Romanovsky.
3) Fix IPsec UDP GRO for IKE packets.
From Tobias Brunner,
4) Fix transport header setting for IPcomp after decompressing.
From Fernando Fernandez Mancera.
5) Fix use-after-free when xfrmi_changelink tries to change
collect_md for a xfrm interface.
From Eyal Birger .
6) Delete the special IPcomp x->tunnel state along with the state x
to avoid refcount problems.
From Sabrina Dubroca.
Please pull or let me know if there are problems.
* tag 'ipsec-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
Revert "xfrm: destroy xfrm_state synchronously on net exit path"
xfrm: delete x->tunnel as we delete x
xfrm: interface: fix use-after-free after changing collect_md xfrm interface
xfrm: ipcomp: adjust transport header after decompressing
xfrm: Set transport header to fix UDP GRO handling
xfrm: always initialize offload path
xfrm: state: use a consistent pcpu_id in xfrm_state_find
xfrm: state: initialize state_ptrs earlier in xfrm_state_find
====================
Link: https://patch.msgid.link/20250723075417.3432644-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Jijie Shao says:
====================
There are some bugfix for the HNS3 ethernet driver
v1: https://lore.kernel.org/all/20250702130901.2879031-1-shaojijie@huawei.com/
====================
Link: https://patch.msgid.link/20250722125423.1270673-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The SMMU engine on HIP09 chip has a hardware issue.
SMMU pagetable prefetch features may prefetch and use a invalid PTE
even the PTE is valid at that time. This will cause the device trigger
fake pagefaults. The solution is to avoid prefetching by adding a
SYNC command when smmu mapping a iova. But the performance of nic has a
sharp drop. Then we do this workaround, always enable tx bounce buffer,
avoid mapping/unmapping on TX path.
This issue only affects HNS3, so we always enable
tx bounce buffer when smmu enabled to improve performance.
Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-5-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the queried maximum of vf channels is the maximum of channels
supported by each TC. However, the actual maximum of channels is
the maximum of channels supported by the device.
Fixes: 849e46077689 ("net: hns3: add ethtool_ops.get_channels support for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-4-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When ptp init failed, we'd better disable the interrupt and clear the
flag, to avoid early report interrupt at next probe.
Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-3-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The vport->req_vlan_fltr_en may be changed concurrently by function
hclge_sync_vlan_fltr_state() called in periodic work task and
function hclge_enable_vport_vlan_filter() called by user configuration.
It may cause the user configuration inoperative. Fixes it by protect
the vport->req_vlan_fltr by vport_lock.
Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722125423.1270673-2-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The s390x ISM device data sheet clearly states that only one
request-response sequence is allowable per ISM function at any point in
time. Unfortunately as of today the s390/ism driver in Linux does not
honor that requirement. This patch aims to rectify that.
This problem was discovered based on Aliaksei's bug report which states
that for certain workloads the ISM functions end up entering error state
(with PEC 2 as seen from the logs) after a while and as a consequence
connections handled by the respective function break, and for future
connection requests the ISM device is not considered -- given it is in a
dysfunctional state. During further debugging PEC 3A was observed as
well.
A kernel message like
[ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a
is a reliable indicator of the stated function entering error state
with PEC 2. Let me also point out that a kernel message like
[ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery
is a reliable indicator that the ISM function won't be auto-recovered
because the ISM driver currently lacks support for it.
On a technical level, without this synchronization, commands (inputs to
the FW) may be partially or fully overwritten (corrupted) by another CPU
trying to issue commands on the same function. There is hard evidence that
this can lead to DMB token values being used as DMB IOVAs, leading to
PEC 2 PCI events indicating invalid DMA. But this is only one of the
failure modes imaginable. In theory even completely losing one command
and executing another one twice and then trying to interpret the outputs
as if the command we intended to execute was actually executed and not
the other one is also possible. Frankly, I don't feel confident about
providing an exhaustive list of possible consequences.
Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory")
Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Pull drm fixes from Dave Airlie:
"This might just be part one, but I'm sending it a bit early as it has
two sets of reverts for regressions, one is all the gem/dma-buf
handling and another was a nouveau ioctl change.
Otherwise there is an amdgpu fix, nouveau fix and a scheduler fix.
If any other changes come in I'll follow up with another more usual
Fri/Sat MR.
gem:
- revert all the dma-buf/gem changes as there as lifetime issues
with them
nouveau:
- revert an ioctl change as it causes issues
- fix NULL ptr on fermi
bridge:
- remove extra semicolon
sched:
- remove hang causing optimisation
amdgpu:
- fix garbage in cleared vram after resume"
* tag 'drm-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/kernel:
drm/bridge: ti-sn65dsi86: Remove extra semicolon in ti_sn_bridge_probe()
Revert "drm/nouveau: check ioctl command codes better"
drm/nouveau/nvif: fix null ptr deref on pre-fermi boards
Revert "drm/gem-dma: Use dma_buf from GEM object instance"
Revert "drm/gem-shmem: Use dma_buf from GEM object instance"
Revert "drm/gem-framebuffer: Use dma_buf from GEM object instance"
Revert "drm/prime: Use dma_buf from GEM object instance"
Revert "drm/etnaviv: Use dma_buf from GEM object instance"
Revert "drm/vmwgfx: Use dma_buf from GEM object instance"
Revert "drm/virtio: Use dma_buf from GEM object instance"
drm/sched: Remove optimization that causes hang when killing dependent jobs
drm/amdgpu: Reset the clear flag in buddy during resume
|
|
A few packets may still be sent out during the termination of iperf
processes. These late packets cause failures in rss_ctx.py when they
arrive on queues expected to be empty.
Example failure observed:
Check failed 2 != 0 traffic on inactive queues (context 1):
[0, 0, 1, 1, 386385, 397196, 0, 0, 0, 0, ...]
Check failed 4 != 0 traffic on inactive queues (context 2):
[0, 0, 0, 0, 2, 2, 247152, 253013, 0, 0, ...]
Check failed 2 != 0 traffic on inactive queues (context 3):
[0, 0, 0, 0, 0, 0, 1, 1, 282434, 283070, ...]
To avoid such failures, wait until all client sockets for the requested
port are either closed or in the TIME_WAIT state.
Fixes: 847aa551fa78 ("selftests: drv-net: rss_ctx: factor out send traffic and check")
Signed-off-by: Nimrod Oren <noren@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722122655.3194442-1-noren@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
My CC-adding automation returned nothing on a future patch to the
include/linux/in6.h file, and I went looking for why. Add the missed
in6.h to MAINTAINERS.
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250722165645.work.047-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull kvm fix from Paolo Bonzini:
- Fix cleanup mistake (probably a cut-and-paste error) in a Xen
hypercall
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/xen: Fix cleanup logic in emulation of Xen schedop poll hypercalls
|
|
kvm_xen_schedop_poll does a kmalloc_array() when a VM polls the host
for more than one event channel potr (nr_ports > 1).
After the kmalloc_array(), the error paths need to go through the
"out" label, but the call to kvm_read_guest_virt() does not.
Fixes: 92c58965e965 ("KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly")
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Manuel Andreas <manuel.andreas@tum.de>
[Adjusted commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.16-rc8/final?:
- Revert all uses of drm_gem_object->dmabuf to
drm_gem_object->import_attach->dmabuf.
- Fix amdgpu returning BIOS cluttered VRAM after resume.
- Scheduler hang fix.
- Revert nouveau ioctl fix as it caused regressions.
- Fix null pointer deref in nouveau.
- Fix unnecessary semicolon in ti_sn_bridge_probe.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/72235afd-c849-49fe-9cc1-2b1781abdf08@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ufs fix from Al Viro:
"Fix regression in ufs options parsing"
* tag 'pull-ufs-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix the regression in ufs options parsing
|
|
A really dumb braino on rebasing and a dumber fuckup with managing #for-next
Fixes: b70cb459890b ("ufs: convert ufs to the new mount API")
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-07-22
The patch is by me and fixes a potential NULL pointer deref in the CAN
device driver infrastructure. It can be triggered from user space.
* tag 'linux-can-fixes-for-6.16-20250722' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode
====================
Link: https://patch.msgid.link/20250722110059.3664104-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The test is supposed to observe that the 'clash_resolve' stat counter
incremented (i.e., the code path was covered).
This check was incorrect, 'conntrack -S' needs to be called in the
revevant namespace, not the initial netns.
The clash resolution logic in conntrack is only exercised when multiple
packets with the same udp quadruple race. Depending on kernel config,
number of CPUs, scheduling policy etc. this might not trigger even
after several retries. Thus the script eventually returns SKIP if the
retry count is exceeded.
The udpclash tool with also exit with a failure if it did not observe
the expected number of replies.
In the script, make a note of this but do not fail anymore, just check if
the clash resolution logic triggered after all.
Remove the 'single-core' test: while unlikely, with preemptible kernel it
should be possible to also trigger clash resolution logic.
With this change the test will either SKIP or pass.
Hard error could be restored later once its clear whats going on, so
also dump 'conntrack -S' when some packets went missing to see if
conntrack dropped them on insert.
Fixes: 78a588363587 ("selftests: netfilter: add conntrack clash resolution test case")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250721223652.6956-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-07-21 (i40e, ice, e1000e)
For i40e:
Dennis Chen adjusts reporting of VF Tx dropped to a more appropriate
field.
Jamie Bainbridge fixes a check which can cause a PF set VF MAC address
to be lost.
For ice:
Haoxiang Li adds an error check in DDP load to prevent NULL pointer
dereference.
For e1000e:
Jacek Kowalski adds workarounds for issues surrounding Tiger Lake
platforms with uninitialized NVMs.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: ignore uninitialized checksum word on tgp
e1000e: disregard NVM checksum on tgp when valid checksum bit is not set
ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
i40e: When removing VF MAC filters, only check PF-set MAC
i40e: report VF tx_dropped with tx_errors instead of tx_discards
====================
Link: https://patch.msgid.link/20250721173733.2248057-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As reported by the kernel test robot, a recent patch introduced an
unnecessary semicolon. Remove it.
Fixes: 55e8ff842051 ("drm/bridge: ti-sn65dsi86: Add HPD for DisplayPort connector type")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506301704.0SBj6ply-lkp@intel.com/
Reviewed-by: Devarsh Thakkar <devarsht@ti.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20250714130631.1.I1cfae3222e344a3b3c770d079ee6b6f7f3b5d636@changeid
|
|
My previous patch ended up causing a regression for the
DRM_IOCTL_NOUVEAU_NVIF ioctl. The intention of my patch was to only
pass ioctl commands that have the correct dir/type/nr bits into the
nouveau_abi16_ioctl() function.
This turned out to be too strict, as userspace does use at least
write-only and write-read direction settings. Checking for both of these
still did not fix the issue, so the best we can do for the 6.16 release
is to revert back to what we've had since linux-3.16.
This version is still fragile, but at least it is known to work with
existing userspace. Fixing this properly requires a better understanding
of what commands are being passed from userspace in practice, and how
that relies on the undocumented (miss)behavior in nouveau_drm_ioctl().
Fixes: e5478166dffb ("drm/nouveau: check ioctl command codes better")
Reported-by: Satadru Pramanik <satadru@gmail.com>
Closes: https://lore.kernel.org/lkml/CAFrh3J85tsZRpOHQtKgNHUVnn=EG=QKBnZTRtWS8eWSc1K1xkA@mail.gmail.com/
Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Closes: https://lore.kernel.org/lkml/aH9n_QGMFx2ZbKlw@debian.local/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250722115830.2587297-1-arnd@kernel.org
[ Add Closes: tags, fix minor typo in commit message. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
can_priv::do_set_mode
Andrei Lalaev reported a NULL pointer deref when a CAN device is
restarted from Bus Off and the driver does not implement the struct
can_priv::do_set_mode callback.
There are 2 code path that call struct can_priv::do_set_mode:
- directly by a manual restart from the user space, via
can_changelink()
- delayed automatic restart after bus off (deactivated by default)
To prevent the NULL pointer deference, refuse a manual restart or
configure the automatic restart delay in can_changelink() and report
the error via extack to user space.
As an additional safety measure let can_restart() return an error if
can_priv::do_set_mode is not set instead of dereferencing it
unchecked.
Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
Closes: https://lore.kernel.org/all/20250714175520.307467-1-andrey.lalaev@gmail.com
Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
Link: https://patch.msgid.link/20250718-fix-nullptr-deref-do_set_mode-v1-1-0b520097bb96@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
qfq_delete_class
might_sleep could be trigger in the atomic context in qfq_delete_class.
qfq_destroy_class was moved into atomic context locked
by sch_tree_lock to avoid a race condition bug on
qfq_aggregate. However, might_sleep could be triggered by
qfq_destroy_class, which introduced sleeping in atomic context (path:
qfq_destroy_class->qdisc_put->__qdisc_destroy->lockdep_unregister_key
->might_sleep).
Considering the race is on the qfq_aggregate objects, keeping
qfq_rm_from_agg in the lock but moving the left part out can solve
this issue.
Fixes: 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on qfq_aggregate")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/4a04e0cc-a64b-44e7-9213-2880ed641d77@sabinyo.mountain
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/20250717230128.159766-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.
As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.
This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.
Cc: stable@vger.kernel.org
Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Co-developed-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250717192024.1820931-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The AARP proxy‐probe routine (aarp_proxy_probe_network) sends a probe,
releases the aarp_lock, sleeps, then re-acquires the lock. During that
window an expire timer thread (__aarp_expire_timer) can remove and
kfree() the same entry, leading to a use-after-free.
race condition:
cpu 0 | cpu 1
atalk_sendmsg() | atif_proxy_probe_device()
aarp_send_ddp() | aarp_proxy_probe_network()
mod_timer() | lock(aarp_lock) // LOCK!!
timeout around 200ms | alloc(aarp_entry)
and then call | proxies[hash] = aarp_entry
aarp_expire_timeout() | aarp_send_probe()
| unlock(aarp_lock) // UNLOCK!!
lock(aarp_lock) // LOCK!! | msleep(100);
__aarp_expire_timer(&proxies[ct]) |
free(aarp_entry) |
unlock(aarp_lock) // UNLOCK!! |
| lock(aarp_lock) // LOCK!!
| UAF aarp_entry !!
==================================================================
BUG: KASAN: slab-use-after-free in aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
Read of size 4 at addr ffff8880123aa360 by task repro/13278
CPU: 3 UID: 0 PID: 13278 Comm: repro Not tainted 6.15.2 #3 PREEMPT(full)
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0xc1/0x630 mm/kasan/report.c:521
kasan_report+0xca/0x100 mm/kasan/report.c:634
aarp_proxy_probe_network+0x560/0x630 net/appletalk/aarp.c:493
atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818
sock_do_ioctl+0xdc/0x260 net/socket.c:1190
sock_ioctl+0x239/0x6a0 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl fs/ioctl.c:892 [inline]
__x64_sys_ioctl+0x194/0x200 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcb/0x250 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Allocated:
aarp_alloc net/appletalk/aarp.c:382 [inline]
aarp_proxy_probe_network+0xd8/0x630 net/appletalk/aarp.c:468
atif_proxy_probe_device net/appletalk/ddp.c:332 [inline]
atif_ioctl+0xb58/0x16c0 net/appletalk/ddp.c:857
atalk_ioctl+0x198/0x2f0 net/appletalk/ddp.c:1818
Freed:
kfree+0x148/0x4d0 mm/slub.c:4841
__aarp_expire net/appletalk/aarp.c:90 [inline]
__aarp_expire_timer net/appletalk/aarp.c:261 [inline]
aarp_expire_timeout+0x480/0x6e0 net/appletalk/aarp.c:317
The buggy address belongs to the object at ffff8880123aa300
which belongs to the cache kmalloc-192 of size 192
The buggy address is located 96 bytes inside of
freed 192-byte region [ffff8880123aa300, ffff8880123aa3c0)
Memory state around the buggy address:
ffff8880123aa200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8880123aa280: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880123aa300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880123aa380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff8880123aa400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
Link: https://patch.msgid.link/20250717012843.880423-1-hxzene@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On ASP versions v2.x we need to program the TX map vector register to
properly exercise end-to-end flow control, otherwise the TX engine can
either lock-up, or cause the hardware calculated checksum to be
wrong/corrupted when multiple back to back packets are being submitted
for transmission. This register defaults to 0, which means no flow
control being applied.
Fixes: e9f31435ee7d ("net: bcmasp: Add support for asp-v3.0")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250718212242.3447751-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
selftests: mptcp: connect: cover alt modes
mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
make sure everything works as expected when using "mmap" and "sendfile"
modes instead of "poll", and with the MPTCP checksum support.
These modes should be validated, but they are not when the selftests are
executed via the kselftest helpers. It means that most CIs validating
these selftests, like NIPA for the net development trees and LKFT for
the stable ones, are not covering these modes.
To fix that, new test programs have been added, simply calling
mptcp_connect.sh with the right parameters.
The first patch can be backported up to v5.6, and the second one up to
v5.14.
v1: https://lore.kernel.org/20250714-net-mptcp-sft-connect-alt-v1-0-bf1c5abbe575@kernel.org
====================
Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-0-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The checksum mode has been added a while ago, but it is only validated
when manually launching mptcp_connect.sh with "-C".
The different CIs were then not validating these MPTCP Connect tests
with checksum enabled. To make sure they do, add a new test program
executing mptcp_connect.sh with the checksum mode.
Fixes: 94d66ba1d8e4 ("selftests: mptcp: enable checksum in mptcp_connect.sh")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-2-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The "mmap" and "sendfile" alternate modes for mptcp_connect.sh/.c are
available from the beginning, but only tested when mptcp_connect.sh is
manually launched with "-m mmap" or "-m sendfile", not via the
kselftests helpers.
The MPTCP CI was manually running "mptcp_connect.sh -m mmap", but not
"-m sendfile". Plus other CIs, especially the ones validating the stable
releases, were not validating these alternate modes.
To make sure these modes are validated by these CIs, add two new test
programs executing mptcp_connect.sh with the alternate modes.
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250715-net-mptcp-sft-connect-alt-v2-1-8230ddd82454@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As described by Vitaly Lifshits:
> Starting from Tiger Lake, LAN NVM is locked for writes by SW, so the
> driver cannot perform checksum validation and correction. This means
> that all NVM images must leave the factory with correct checksum and
> checksum valid bit set.
Unfortunately some systems have left the factory with an uninitialized
value of 0xFFFF at register address 0x3F (checksum word location).
So on Tiger Lake platform we ignore the computed checksum when such
condition is encountered.
Signed-off-by: Jacek Kowalski <jacek@jacekk.info>
Tested-by: Vlad URSU <vlad@ursu.me>
Fixes: 4051f68318ca9 ("e1000e: Do not take care about recovery NVM checksum")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|