Age | Commit message (Collapse) | Author |
|
pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec()
does, the reference should be put after ip6_mc_clear_src() return.
Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
net/mlx5e: Add support for PCIe congestion events
Dragos says:
PCIe congestion events are events generated by the firmware when the
device side has sustained PCIe inbound or outbound traffic above
certain thresholds. The high and low threshold are hysteresis thresholds
to prevent flapping: once the high threshold has been reached, a low
threshold event will be triggered only after the bandwidth usage went
below the low threshold.
This series adds support for receiving and exposing such events as
ethtool counters.
2 new pairs of counters are exposed: pci_bw_in/outbound_high/low. These
should help the user understand if the device PCI is under pressure.
Planned followup patches:
- Allow configuration of thresholds through devlink.
- Add ethtool counter for wakeups which did not result in any state
change.
====================
Link: https://patch.msgid.link/1752589821-145787-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement the PCIe Congestion Event notifier which triggers a work item
to query the PCIe Congestion Event object. The result of the congestion
state is reflected in the new ethtool stats:
* pci_bw_inbound_high: the device has crossed the high threshold for
inbound PCIe traffic.
* pci_bw_inbound_low: the device has crossed the low threshold for
inbound PCIe traffic
* pci_bw_outbound_high: the device has crossed the high threshold for
outbound PCIe traffic.
* pci_bw_outbound_low: the device has crossed the low threshold for
outbound PCIe traffic
The high and low thresholds are currently configured at 90% and 75%.
These are hysteresis thresholds which help to check if the
PCI bus on the device side is in a congested state.
If low + 1 = high then the device is in a congested state. If low == high
then the device is not in a congested state.
The counters are also documented.
A follow-up patch will make the thresholds configurable.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752589821-145787-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add initial infrastructure to create and destroy the PCIe Congestion
Event object if the object is supported.
The verb for the object creation function is "set" instead of
"create" because the function will accommodate the modify operation
as well in a subsequent patch.
The next patches will hook it up to the event handler and will add
actual functionality.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752589821-145787-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The netiucv driver creates TCP/IP interfaces over IUCV between Linux
guests on z/VM and other z/VM entities.
Rationale for removal:
- NETIUCV connections are only supported for compatibility with
earlier versions and not to be used for new network setups,
since at least Linux kernel 4.0.
- No known active users, use cases, or product dependencies
- The driver is no longer relevant for z/VM networking;
preferred methods include:
* Device pass-through (e.g., OSA, RoCE)
* z/VM Virtual Switch (VSWITCH)
The IUCV mechanism itself remains supported and is actively used
via AF_IUCV, hvc_iucv, and smsg_iucv.
Signed-off-by: Nagamani PV <nagamani@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250715074210.3999296-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ryan Wanner says:
====================
Expose REFCLK for RMII and enable RMII
This set allows the REFCLK property to be exposed as a dt-property to
properly reflect the correct RMII layout. RMII can take an external or
internal provided REFCLK, since this is not SoC dependent but board
dependent this must be exposed as a DT property for the macb driver.
This set also enables RMII mode for the SAMA7 SoCs gigabit mac.
v1: https://lore.kernel.org/cover.1750346271.git.Ryan.Wanner@microchip.com
====================
Link: https://patch.msgid.link/cover.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove USARIO_CLKEN flag since this is now a device tree argument and
not fixed to the SoC.
This will instead be selected by the "cdns,refclk-ext"
device tree property.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/1e7a8c324526f631f279925aa8a6aa937d55c796.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This macro enables the RMII mode bit in the USRIO register when RMII
mode is requested.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/6698836e4ee7df5f6bee181f0d2e38d4b8e4cec2.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The RMII and RGMII can both support internal or external provided
REFCLKs 50MHz and 125MHz respectively. Since this is dependent on
the board that the SoC is on this needs to be set via the device tree.
This property flag is checked in the MACB DT node so the REFCLK cap is
configured the correct way for the RMII or RGMII is configured on the
board.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/7f9b65896d6b7b48275bc527b72a16347f8ce10a.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
REFCLK can be provided by an external source so this should be exposed
by a DT property. The REFCLK is used for RMII and in some SoCs that use
this driver the RGMII 125MHz clk can also be provided by an external
source.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/d558467c4d5b27fb3135ffdead800b14cd9c6c0a.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Breno Leitao says:
====================
selftest: net: Add selftest for netpoll
I am submitting a new selftest for the netpoll subsystem specifically
targeting the case where the RX is polling in the TX path, which is
a case that we don't have any test in the tree today. This is done when
netpoll_poll_dev() called, and this test creates a scenario when that is
probably.
The test does the following:
1) Configuring a single RX/TX queue to increase contention on the
interface.
2) Generating background traffic to saturate the network, mimicking
real-world congestion.
3) Sending netconsole messages to trigger netpoll polling and monitor
its behavior.
4) Using dynamic netconsole targets via configfs, with the ability to
delete and recreate targets during the test.
5) Running bpftrace in parallel to verify that netpoll_poll_dev() is
called when expected. If it is called, then the test passes,
otherwise the test is marked as skipped.
In order to achieve it, I stole Jakub's bpftrace helper from [1], and
did some small changes that I found useful to use the helper.
So, this patchset basically contains:
1) The code stolen from Jakub
2) Improvements on bpftrace() helper
3) The selftest itself
Link: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/ [1]
====================
Link: https://patch.msgid.link/20250714-netpoll_test-v7-0-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a basic selftest for the netpoll polling mechanism, specifically
targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at
maximum speed, and netpoll needs to poll the NIC. This is achieved by:
1. Configuring a single RX/TX queue to create contention
2. Generating background traffic to saturate the interface
3. Sending netconsole messages to trigger netpoll polling
4. Using dynamic netconsole targets via configfs
5. Delete and create new netconsole targets after some messages
6. Start a bpftrace in parallel to make sure netpoll_poll_dev() is
called
7. If bpftrace exists and netpoll_poll_dev() was called, stop.
The test validates a critical netpoll code path by monitoring traffic
flow and ensuring netpoll_poll_dev() is called when the normal TX path
is blocked.
This addresses a gap in netpoll test coverage for a path that is
tricky for the network stack.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-3-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The '@' prefix in bpftrace map keys is specific to bpftrace and can be
safely removed when processing results. This patch modifies the bpftrace
utility to strip the '@' from map keys before storing them in the result
dictionary, making the keys more consistent with Python conventions.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-2-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bpftrace is very useful for low level driver testing. perf or trace-cmd
would also do for collecting data from tracepoints, but they require
much more post-processing.
Add a wrapper for running bpftrace and sanitizing its output.
bpftrace has JSON output, which is great, but it prints loose objects
and in a slightly inconvenient format. We have to read the objects
line by line, and while at it return them indexed by the map name.
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-1-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clarify that drivers must remove device-reserved metadata from the
data_meta area before passing frames to XDP programs.
Additionally, expand the explanation of how userspace and BPF programs
should coordinate the use of METADATA_SIZE, and add a detailed diagram
to illustrate pointer adjustments and metadata layout.
Also describe the requirements and constraints enforced by
bpf_xdp_adjust_meta().
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/r/20250716154846.3513575-1-yoong.siang.song@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice)
Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.
For ice:
Dave adds a NULL check for LAG netdev.
Michal corrects a pointer check in debugfs initialization.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: check correct pointer in fwlog debugfs
ice: add NULL check in eswitch lag check
ethernet: intel: fix building with large NR_CPUS
====================
Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.
This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.
Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use __skb_queue_purge() instead of re-implementing it. Note that it uses
kfree_skb_reason() instead of kfree_skb() internally, and pass
SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250715120709.3941510-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
`vsock_do_ioctl` returns -ENOIOCTLCMD if an ioctl support is not
implemented, like for SIOCINQ before commit f7c722659275 ("vsock: Add
support for SIOCINQ ioctl"). In net/socket.c, -ENOIOCTLCMD is re-mapped
to -ENOTTY for the user space. So, our test suite, without that commit
applied, is failing in this way:
34 - SOCK_STREAM ioctl(SIOCINQ) functionality...ioctl(21531): Inappropriate ioctl for device
Return false in vsock_ioctl_int() to skip the test in this case as well,
instead of failing.
Fixes: 53548d6bffac ("test/vsock: Add retry mechanism to ioctl wrapper")
Cc: niuxuewei.nxw@antgroup.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Link: https://patch.msgid.link/20250715093233.94108-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The CI reported a UaF in tcp_prune_ofo_queue():
BUG: KASAN: slab-use-after-free in tcp_prune_ofo_queue+0x55d/0x660
Read of size 4 at addr ffff8880134729d8 by task socat/20348
CPU: 0 UID: 0 PID: 20348 Comm: socat Not tainted 6.16.0-rc5-virtme #1 PREEMPT(full)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x82/0xd0
print_address_description.constprop.0+0x2c/0x400
print_report+0xb4/0x270
kasan_report+0xca/0x100
tcp_prune_ofo_queue+0x55d/0x660
tcp_try_rmem_schedule+0x855/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf73ef2337
Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffd4f924708 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf73ef2337
RDX: 0000000000002000 RSI: 0000555f11d1a000 RDI: 0000000000000008
RBP: 0000555f11d1a000 R08: 0000000000002000 R09: 0000000000000000
R10: 0000000000000040 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000002000 R14: 0000555ee1a44570 R15: 0000000000002000
</TASK>
Allocated by task 20348:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_slab_alloc+0x59/0x70
kmem_cache_alloc_node_noprof+0x110/0x340
__alloc_skb+0x213/0x2e0
tcp_collapse+0x43f/0xff0
tcp_try_rmem_schedule+0x6b9/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 20348:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x38/0x50
kmem_cache_free+0x149/0x330
tcp_prune_ofo_queue+0x211/0x660
tcp_try_rmem_schedule+0x855/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff888013472900
which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 216 bytes inside of
freed 232-byte region [ffff888013472900, ffff8880134729e8)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13472
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x80000000000040(head|node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290
raw: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000
head: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290
head: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000
head: 0080000000000001 ffffea00004d1c81 00000000ffffffff 00000000ffffffff
head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
^
ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Indeed tcp_prune_ofo_queue() is reusing the skb dropped a few lines
above. The caller wants to enqueue 'in_skb', lets check space vs the
latter.
Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: syzbot+865aca08c0533171bf6a@syzkaller.appspotmail.com
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/b78d2d9bdccca29021eed9a0e7097dd8dc00f485.1752567053.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.
For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.
This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.
So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.
v2:
- Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
(Tariq Toukan <tariqt@nvidia.com>)
- Improve commit-message (Gal Pressman <gal@nvidia.com>)
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit f5fda1a86884 ("selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt")
added this test recently, but it's failing with:
# tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
# script packet: 1.230105 . 1:1(0) ack 54001 win 0
# actual packet: 1.190101 . 1:1(0) ack 54001 win 0
It's unclear why the test expects the ack to be delayed.
Correct it.
Link: https://patch.msgid.link/20250715142849.959444-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to
verify that we have no conflict of input_xfrm with the RSS fields
options, there is no point in doing it if input_xfrm is not
supported/requested.
This is under the assumption that a driver that supports input_xfrm will
also support ->get_rxfh_fields(), so add a WARN_ON() to
ethtool_check_ops() to verify it, and remove the op NULL check.
This fixes the following error in mlx4_en, which doesn't support
getting/setting RXFH fields.
$ ethtool --set-rxfh-indir eth2 hfunc xor
Cannot set RX flow hash configuration: Operation not supported
Fixes: 72792461c8e8 ("net: ethtool: don't mux RXFH via rxnfc callbacks")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub reported that the rtnetlink test for the preferred lifetime of an
address has become quite flaky. The issue started appearing around the 6.16
merge window in May, and the test fails with:
FAIL: preferred_lft addresses remaining
The flakiness might be related to power-saving behavior, as address
expiration is handled by a "power-efficient" workqueue.
To address this, use slowwait to check more frequently whether the address
still exists. This reduces the likelihood of the system entering a low-power
state during the test, improving reliability.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250715043459.110523-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- fprobe-event: The @params variable was being used in an error path
without being initialized. The fix to return an error code.
* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Avoid using params uninitialized in parse_btf_arg()
|
|
without board ID
For GF variant of WCN6855 without board ID programmed
btusb_generate_qca_nvm_name() will chose wrong NVM
'qca/nvm_usb_00130201.bin' to download.
Fix by choosing right NVM 'qca/nvm_usb_00130201_gf.bin'.
Also simplify NVM choice logic of btusb_generate_qca_nvm_name().
Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller")
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The 'quirks' member already ran out of bits on some platforms some time
ago. Replace the integer member by a bitmap in order to have enough bits
in future. Replace raw bit operations by accessor macros.
Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING")
Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE")
Suggested-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Macro parameters should always be put into braces when accessing it.
Fixes: 4fc9857ab8c6 ("Bluetooth: hci_sync: Add check simultaneous roles support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The provided macro parameter is named 'dev' (rather than 'hdev', which
may be a variable on the stack where the macro is used).
Fixes: a9a830a676a9 ("Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZE")
Fixes: 6126ffabba6b ("Bluetooth: Introduce HCI_CONN_FLAG_DEVICE_PRIVACY device flag")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This replaces the usage of HCI_ERROR_REMOTE_USER_TERM, which as the name
suggest is to indicate a regular disconnection initiated by an user,
with HCI_ERROR_AUTH_FAILURE to indicate the session has timeout thus any
pairing shall be considered as failed.
Fixes: 1e91c29eb60c ("Bluetooth: Use hci_disconnect for immediate disconnection from SMP")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
If a command is received while a bonding is ongoing consider it a
pairing failure so the session is cleanup properly and the device is
disconnected immediately instead of continuing with other commands that
may result in the session to get stuck without ever completing such as
the case bellow:
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
SMP: Identity Information (0x08) len 16
Identity resolving key[16]: d7e08edef97d3e62cd2331f82d8073b0
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
SMP: Signing Information (0x0a) len 16
Signature key[16]: 1716c536f94e843a9aea8b13ffde477d
Bluetooth: hci0: unexpected SMP command 0x0a from XX:XX:XX:XX:XX:XX
> ACL Data RX: Handle 2048 flags 0x02 dlen 12
SMP: Identity Address Information (0x09) len 7
Address: XX:XX:XX:XX:XX:XX (Intel Corporate)
While accourding to core spec 6.1 the expected order is always BD_ADDR
first first then CSRK:
When using LE legacy pairing, the keys shall be distributed in the
following order:
LTK by the Peripheral
EDIV and Rand by the Peripheral
IRK by the Peripheral
BD_ADDR by the Peripheral
CSRK by the Peripheral
LTK by the Central
EDIV and Rand by the Central
IRK by the Central
BD_ADDR by the Central
CSRK by the Central
When using LE Secure Connections, the keys shall be distributed in the
following order:
IRK by the Peripheral
BD_ADDR by the Peripheral
CSRK by the Peripheral
IRK by the Central
BD_ADDR by the Central
CSRK by the Central
According to the Core 6.1 for commands used for key distribution "Key
Rejected" can be used:
'3.6.1. Key distribution and generation
A device may reject a distributed key by sending the Pairing Failed command
with the reason set to "Key Rejected".
Fixes: b28b4943660f ("Bluetooth: Add strict checks for allowed SMP PDUs")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
btintel_classify_pkt_type
Due to what seem to be a bug with variant version returned by some
firmwares the code may set hdev->classify_pkt_type with
btintel_classify_pkt_type when in fact the controller doesn't even
support ISO channels feature but may use the handle range expected from
a controllers that does causing the packets to be reclassified as ISO
causing several bugs.
To fix the above btintel_classify_pkt_type will attempt to check if the
controller really supports ISO channels and in case it doesn't don't
reclassify even if the handle range is considered to be ISO, this is
considered safer than trying to fix the specific controller/firmware
version as that could change over time and causing similar problems in
the future.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219553
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2100565
Link: https://github.com/StarLabsLtd/firmware/issues/180
Fixes: f25b7fd36cc3 ("Bluetooth: Add vendor-specific packet classification for ISO data")
Cc: stable@vger.kernel.org
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Sean Rhodes <sean@starlabs.systems>
|
|
random address
Currently, the connectable flag used by the setup of an extended
advertising instance drives whether we require privacy when trying to pass
a random address to the advertising parameters (Own Address).
If privacy is not required, then it automatically falls back to using the
controller's public address. This can cause problems when using controllers
that do not have a public address set, but instead use a static random
address.
e.g. Assume a BLE controller that does not have a public address set.
The controller upon powering is set with a random static address by default
by the kernel.
< HCI Command: LE Set Random Address (0x08|0x0005) plen 6
Address: E4:AF:26:D8:3E:3A (Static)
> HCI Event: Command Complete (0x0e) plen 4
LE Set Random Address (0x08|0x0005) ncmd 1
Status: Success (0x00)
Setting non-connectable extended advertisement parameters in bluetoothctl
mgmt
add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g 1
correctly sets Own address type as Random
< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
Own address type: Random (0x01)
Setting connectable extended advertisement parameters in bluetoothctl mgmt
add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g -c 1
mistakenly sets Own address type to Public (which causes to use Public
Address 00:00:00:00:00:00)
< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
Own address type: Public (0x00)
This causes either the controller to emit an Invalid Parameters error or to
mishandle the advertising.
This patch makes sure that we use the already set static random address
when requesting a connectable extended advertising when we don't require
privacy and our public address is not set (00:00:00:00:00:00).
Fixes: 3fe318ee72c5 ("Bluetooth: move hci_get_random_address() to hci_sync")
Signed-off-by: Alessandro Gasbarroni <alex.gasbarroni@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
syzbot reported null-ptr-deref in l2cap_sock_resume_cb(). [0]
l2cap_sock_resume_cb() has a similar problem that was fixed by commit
1bff51ea59a9 ("Bluetooth: fix use-after-free error in lock_sock_nested()").
Since both l2cap_sock_kill() and l2cap_sock_resume_cb() are executed
under l2cap_sock_resume_cb(), we can avoid the issue simply by checking
if chan->data is NULL.
Let's not access to the killed socket in l2cap_sock_resume_cb().
[0]:
BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:82 [inline]
BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
BUG: KASAN: null-ptr-deref in l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
Write of size 8 at addr 0000000000000570 by task kworker/u9:0/52
CPU: 1 UID: 0 PID: 52 Comm: kworker/u9:0 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Workqueue: hci0 hci_rx_work
Call trace:
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:501 (C)
__dump_stack+0x30/0x40 lib/dump_stack.c:94
dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
print_report+0x58/0x84 mm/kasan/report.c:524
kasan_report+0xb0/0x110 mm/kasan/report.c:634
check_region_inline mm/kasan/generic.c:-1 [inline]
kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
__kasan_check_write+0x20/0x30 mm/kasan/shadow.c:37
instrument_atomic_write include/linux/instrumented.h:82 [inline]
clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
l2cap_security_cfm+0x524/0xea0 net/bluetooth/l2cap_core.c:7357
hci_auth_cfm include/net/bluetooth/hci_core.h:2092 [inline]
hci_auth_complete_evt+0x2e8/0xa4c net/bluetooth/hci_event.c:3514
hci_event_func net/bluetooth/hci_event.c:7511 [inline]
hci_event_packet+0x650/0xe9c net/bluetooth/hci_event.c:7565
hci_rx_work+0x320/0xb18 net/bluetooth/hci_core.c:4070
process_one_work+0x7e8/0x155c kernel/workqueue.c:3238
process_scheduled_works kernel/workqueue.c:3321 [inline]
worker_thread+0x958/0xed8 kernel/workqueue.c:3402
kthread+0x5fc/0x75c kernel/kthread.c:464
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:847
Fixes: d97c899bde33 ("Bluetooth: Introduce L2CAP channel callback for resuming")
Reported-by: syzbot+e4d73b165c3892852d22@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/686c12bd.a70a0220.29fe6c.0b13.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The ovpn_netdev_write() function is responsible for injecting
decapsulated and decrypted packets back into the local network stack.
Prior to this patch, the skb could retain GSO metadata from the outer,
encrypted tunnel packet. This original GSO metadata, relevant to the
sender's transport context, becomes invalid and misleading for the
tunnel/data path once the inner packet is exposed.
Leaving this stale metadata intact causes internal GSO validation checks
further down the kernel's network stack (validate_xmit_skb()) to fail,
leading to packet drops. The reasons for these failures vary by
protocol, for example:
- for ICMP, no offload handler is registered;
- for TCP and UDP, the respective offload handlers return errors when
comparing skb->len to the outdated skb_shinfo(skb)->gso_size.
By calling skb_gso_reset(skb) we ensure the inner packet is presented to
gro_cells_receive() with a clean state, correctly indicating it is an
individual packet from the perspective of the local stack.
This change eliminates the "Driver has suspect GRO implementation, TCP
performance may be compromised" warning and improves overall TCP
performance by allowing GSO/GRO to function as intended on the
decapsulated traffic.
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/4
Tested-by: Gert Doering <gert@greenie.muc.de>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
Netlink ops do not expect all attributes to be always set, however
this condition is not explicitly coded any where, leading the user
to believe that all sent attributes are somewhat processed.
Fix this behaviour by introducing explicit checks.
For CMD_OVPN_PEER_GET and CMD_OVPN_KEY_GET directly open-code the
needed condition in the related ops handlers.
While for all other ops use attribute subsets in the ovpn.yaml spec file.
Fixes: b7a63391aa98 ("ovpn: add basic netlink support")
Reported-by: Ralf Lici <ralf@mandelbit.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/19
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
OpenVPN allows users to configure a FW mark on sockets used to
communicate with other peers. The mark is set by means of the
`SO_MARK` Linux socket option.
However, in the ovpn UDP code path, the socket's `sk_mark` value is
currently ignored and it is not propagated to outgoing `skbs`.
This commit ensures proper inheritance of the field by setting
`skb->mark` to `sk->sk_mark` before handing the `skb` to the network
stack for transmission.
Fixes: 08857b5ec5d9 ("ovpn: implement basic TX path (UDP)")
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31877.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
After a recent change in clang to strengthen uninitialized warnings [1],
it points out that in one of the error paths in parse_btf_arg(), params
is used uninitialized:
kernel/trace/trace_probe.c:660:19: warning: variable 'params' is uninitialized when used here [-Wuninitialized]
660 | return PTR_ERR(params);
| ^~~~~~
Match many other NO_BTF_ENTRY error cases and return -ENOENT, clearing
up the warning.
Link: https://lore.kernel.org/all/20250715-trace_probe-fix-const-uninit-warning-v1-1-98960f91dd04@kernel.org/
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2110
Fixes: d157d7694460 ("tracing/probes: Support BTF field access from $retval")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Matthieu Baerts says:
====================
mptcp: fix fallback-related races
This series contains 3 fixes somewhat related to various races we have
while handling fallback.
The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL -- where the window race is even
wider.
Address the issue introducing an additional spinlock to bundle together
all the relevant events, as per patch 1 and 2. These fixes can be
backported up to v5.19 and v5.15.
Note that mptcp_disconnect() unconditionally clears the fallback status
(zeroing msk->flags) but don't touch the `allows_infinite_fallback`
flag. Such issue is addressed in patch 3, and can be backported up to
v5.17.
====================
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-0-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mptcp_disconnect() clears the fallback bit unconditionally, without
touching the associated flags.
The bit clear is safe, as no fallback operation can race with that --
all subflow are already in TCP_CLOSE status thanks to the previous
FASTCLOSE -- but we need to consistently reset all the fallback related
status.
Also acquire the relevant lock, to avoid fouling static analyzers.
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-3-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have races similar to the one addressed by the previous patch between
subflow failing and additional subflow creation. They are just harder to
trigger.
The solution is similar. Use a separate flag to track the condition
'socket state prevent any additional subflow creation' protected by the
fallback lock.
The socket fallback makes such flag true, and also receiving or sending
an MP_FAIL option.
The field 'allow_infinite_fallback' is now always touched under the
relevant lock, we can drop the ONCE annotation on write.
Fixes: 478d770008b0 ("mptcp: send out MP_FAIL when data checksum fails")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-2-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzkaller reported the following splat:
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Modules linked in:
CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary)
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline]
RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00
RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45
RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001
RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000
FS: 00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432
tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975
tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166
tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925
tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363
ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:469 [inline]
ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975
__netif_receive_skb+0x1f/0x120 net/core/dev.c:6088
process_backlog+0x301/0x1360 net/core/dev.c:6440
__napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453
napi_poll net/core/dev.c:7517 [inline]
net_rx_action+0xb44/0x1010 net/core/dev.c:7644
handle_softirqs+0x1d0/0x770 kernel/softirq.c:579
do_softirq+0x3f/0x90 kernel/softirq.c:480
</IRQ>
<TASK>
__local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407
local_bh_enable include/linux/bottom_half.h:33 [inline]
inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524
mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985
mptcp_check_listen_stop net/mptcp/mib.h:118 [inline]
__mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000
mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066
inet_release+0xed/0x200 net/ipv4/af_inet.c:435
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487
__sock_release+0xb3/0x270 net/socket.c:649
sock_close+0x1c/0x30 net/socket.c:1439
__fput+0x402/0xb70 fs/file_table.c:465
task_work_run+0x150/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xd4/0xe0 kernel/entry/common.c:114
exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
do_syscall_64+0x245/0x360 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc92f8a36ad
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcf52802d8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 00007ffcf52803a8 RCX: 00007fc92f8a36ad
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 00007fc92fae7ba0 R08: 0000000000000001 R09: 0000002800000000
R10: 00007fc92f700000 R11: 0000000000000246 R12: 00007fc92fae5fac
R13: 00007fc92fae5fa0 R14: 0000000000026d00 R15: 0000000000026c51
</TASK>
irq event stamp: 4068
hardirqs last enabled at (4076): [<ffffffff81544816>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:344
hardirqs last disabled at (4085): [<ffffffff815447fb>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:342
softirqs last enabled at (3096): [<ffffffff840e1be0>] local_bh_enable include/linux/bottom_half.h:33 [inline]
softirqs last enabled at (3096): [<ffffffff840e1be0>] inet_csk_listen_stop+0x2c0/0x1070 net/ipv4/inet_connection_sock.c:1524
softirqs last disabled at (3097): [<ffffffff813b6b9f>] do_softirq+0x3f/0x90 kernel/softirq.c:480
Since we need to track the 'fallback is possible' condition and the
fallback status separately, there are a few possible races open between
the check and the actual fallback action.
Add a spinlock to protect the fallback related information and use it
close all the possible related races. While at it also remove the
too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect():
the field will be correctly cleared by subflow_finish_connect() if/when
the connection will complete successfully.
If fallback is not possible, as per RFC, reset the current subflow.
Since the fallback operation can now fail and return value should be
checked, rename the helper accordingly.
Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
Cc: stable@vger.kernel.org
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570
Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-1-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Multicast good packets received by PF rings that pass ethternet MAC
address filtering are counted for rtnl_link_stats64.multicast. The
counter is not cleared on read. Fix the duplicate counting on updating
statistics.
Fixes: 46b92e10d631 ("net: libwx: support hardware statistics")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/DA229A4F58B70E51+20250714015656.91772-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jiawen Wu says:
====================
Fix Rx fatal errors
There are some fatal errors on the Rx NAPI path, which can cause
the kernel to crash. Fix known issues and potential risks.
The part of the patches has been mentioned before[1].
[1]: https://lore.kernel.org/all/C8A23A11DB646E60+20250630094102.22265-1-jiawenwu@trustnetic.com/
v1: https://lore.kernel.org/20250709064025.19436-1-jiawenwu@trustnetic.com
====================
Link: https://patch.msgid.link/20250714024755.17512-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When device reset is triggered by feature changes such as toggling Rx
VLAN offload, wx->do_reset() is called to reinitialize Rx rings. The
hardware descriptor ring may retain stale values from previous sessions.
And only set the length to 0 in rx_desc[0] would result in building
malformed SKBs. Fix it to ensure a clean slate after device reset.
[ 549.186435] [ C16] ------------[ cut here ]------------
[ 549.186457] [ C16] kernel BUG at net/core/skbuff.c:2814!
[ 549.186468] [ C16] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 549.186472] [ C16] CPU: 16 UID: 0 PID: 0 Comm: swapper/16 Kdump: loaded Not tainted 6.16.0-rc4+ #23 PREEMPT(voluntary)
[ 549.186476] [ C16] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[ 549.186478] [ C16] RIP: 0010:__pskb_pull_tail+0x3ff/0x510
[ 549.186484] [ C16] Code: 06 f0 ff 4f 34 74 7b 4d 8b 8c 24 c8 00 00 00 45 8b 84 24 c0 00 00 00 e9 c8 fd ff ff 48 c7 44 24 08 00 00 00 00 e9 5e fe ff ff <0f> 0b 31 c0 e9 23 90 5b ff 41 f7 c6 ff 0f 00 00 75 bf 49 8b 06 a8
[ 549.186487] [ C16] RSP: 0018:ffffb391c0640d70 EFLAGS: 00010282
[ 549.186490] [ C16] RAX: 00000000fffffff2 RBX: ffff8fe7e4d40200 RCX: 00000000fffffff2
[ 549.186492] [ C16] RDX: ffff8fe7c3a4bf8e RSI: 0000000000000180 RDI: ffff8fe7c3a4bf40
[ 549.186494] [ C16] RBP: ffffb391c0640da8 R08: ffff8fe7c3a4c0c0 R09: 000000000000000e
[ 549.186496] [ C16] R10: ffffb391c0640d88 R11: 000000000000000e R12: ffff8fe7e4d40200
[ 549.186497] [ C16] R13: 00000000fffffff2 R14: ffff8fe7fa01a000 R15: 00000000fffffff2
[ 549.186499] [ C16] FS: 0000000000000000(0000) GS:ffff8fef5ae40000(0000) knlGS:0000000000000000
[ 549.186502] [ C16] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 549.186503] [ C16] CR2: 00007f77d81d6000 CR3: 000000051a032000 CR4: 0000000000750ef0
[ 549.186505] [ C16] PKRU: 55555554
[ 549.186507] [ C16] Call Trace:
[ 549.186510] [ C16] <IRQ>
[ 549.186513] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186517] [ C16] __skb_pad+0xc7/0xf0
[ 549.186523] [ C16] wx_clean_rx_irq+0x355/0x3b0 [libwx]
[ 549.186533] [ C16] wx_poll+0x92/0x120 [libwx]
[ 549.186540] [ C16] __napi_poll+0x28/0x190
[ 549.186544] [ C16] net_rx_action+0x301/0x3f0
[ 549.186548] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186551] [ C16] ? __raw_spin_lock_irqsave+0x1e/0x50
[ 549.186554] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186557] [ C16] ? wake_up_nohz_cpu+0x35/0x160
[ 549.186559] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186563] [ C16] handle_softirqs+0xf9/0x2c0
[ 549.186568] [ C16] __irq_exit_rcu+0xc7/0x130
[ 549.186572] [ C16] common_interrupt+0xb8/0xd0
[ 549.186576] [ C16] </IRQ>
[ 549.186577] [ C16] <TASK>
[ 549.186579] [ C16] asm_common_interrupt+0x22/0x40
[ 549.186582] [ C16] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[ 549.186585] [ C16] Code: 00 00 e8 11 0e 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 0d ed 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 549.186587] [ C16] RSP: 0018:ffffb391c0277e78 EFLAGS: 00000246
[ 549.186590] [ C16] RAX: ffff8fef5ae40000 RBX: 0000000000000003 RCX: 0000000000000000
[ 549.186591] [ C16] RDX: 0000007fde0faac5 RSI: ffffffff826e53f6 RDI: ffffffff826fa9b3
[ 549.186593] [ C16] RBP: ffff8fe7c3a20800 R08: 0000000000000002 R09: 0000000000000000
[ 549.186595] [ C16] R10: 0000000000000000 R11: 000000000000ffff R12: ffffffff82ed7a40
[ 549.186596] [ C16] R13: 0000007fde0faac5 R14: 0000000000000003 R15: 0000000000000000
[ 549.186601] [ C16] ? cpuidle_enter_state+0xb3/0x420
[ 549.186605] [ C16] cpuidle_enter+0x29/0x40
[ 549.186609] [ C16] cpuidle_idle_call+0xfd/0x170
[ 549.186613] [ C16] do_idle+0x7a/0xc0
[ 549.186616] [ C16] cpu_startup_entry+0x25/0x30
[ 549.186618] [ C16] start_secondary+0x117/0x140
[ 549.186623] [ C16] common_startup_64+0x13e/0x148
[ 549.186628] [ C16] </TASK>
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-4-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The wx_rx_buffer structure contained two DMA address fields: 'dma' and
'page_dma'. However, only 'page_dma' was actually initialized and used
to program the Rx descriptor. But 'dma' was uninitialized and used in
some paths.
This could lead to undefined behavior, including DMA errors or
use-after-free, if the uninitialized 'dma' was used. Althrough such
error has not yet occurred, it is worth fixing in the code.
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-3-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
page_pool_put_full_page() should only be invoked when freeing Rx buffers
or building a skb if the size is too short. At other times, the pages
need to be reused. So remove the redundant page put. In the original
code, double free pages cause kernel panic:
[ 876.949834] __irq_exit_rcu+0xc7/0x130
[ 876.949836] common_interrupt+0xb8/0xd0
[ 876.949838] </IRQ>
[ 876.949838] <TASK>
[ 876.949840] asm_common_interrupt+0x22/0x40
[ 876.949841] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[ 876.949843] Code: 00 00 e8 d1 1d 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 cd fc 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 876.949844] RSP: 0018:ffffaa7340267e78 EFLAGS: 00000246
[ 876.949845] RAX: ffff9e3f135be000 RBX: 0000000000000002 RCX: 0000000000000000
[ 876.949846] RDX: 000000cc2dc4cb7c RSI: ffffffff89ee49ae RDI: ffffffff89ef9f9e
[ 876.949847] RBP: ffff9e378f940800 R08: 0000000000000002 R09: 00000000000000ed
[ 876.949848] R10: 000000000000afc8 R11: ffff9e3e9e5a9b6c R12: ffffffff8a6d8580
[ 876.949849] R13: 000000cc2dc4cb7c R14: 0000000000000002 R15: 0000000000000000
[ 876.949852] ? cpuidle_enter_state+0xb3/0x420
[ 876.949855] cpuidle_enter+0x29/0x40
[ 876.949857] cpuidle_idle_call+0xfd/0x170
[ 876.949859] do_idle+0x7a/0xc0
[ 876.949861] cpu_startup_entry+0x25/0x30
[ 876.949862] start_secondary+0x117/0x140
[ 876.949864] common_startup_64+0x13e/0x148
[ 876.949867] </TASK>
[ 876.949868] ---[ end trace 0000000000000000 ]---
[ 876.949869] ------------[ cut here ]------------
[ 876.949870] list_del corruption, ffffead40445a348->next is NULL
[ 876.949873] WARNING: CPU: 14 PID: 0 at lib/list_debug.c:52 __list_del_entry_valid_or_report+0x67/0x120
[ 876.949875] Modules linked in: snd_hrtimer(E) bnep(E) binfmt_misc(E) amdgpu(E) squashfs(E) vfat(E) loop(E) fat(E) amd_atl(E) snd_hda_codec_realtek(E) intel_rapl_msr(E) snd_hda_codec_generic(E) intel_rapl_common(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) edac_mce_amd(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hda_core(E) amdxcp(E) kvm_amd(E) snd_hwdep(E) gpu_sched(E) drm_panel_backlight_quirks(E) cec(E) snd_pcm(E) drm_buddy(E) snd_seq_dummy(E) drm_ttm_helper(E) btusb(E) kvm(E) snd_seq_oss(E) btrtl(E) ttm(E) btintel(E) snd_seq_midi(E) btbcm(E) drm_exec(E) snd_seq_midi_event(E) i2c_algo_bit(E) snd_rawmidi(E) bluetooth(E) drm_suballoc_helper(E) irqbypass(E) snd_seq(E) ghash_clmulni_intel(E) sha512_ssse3(E) drm_display_helper(E) aesni_intel(E) snd_seq_device(E) rfkill(E) snd_timer(E) gf128mul(E) drm_client_lib(E) drm_kms_helper(E) snd(E) i2c_piix4(E) joydev(E) soundcore(E) wmi_bmof(E) ccp(E) k10temp(E) i2c_smbus(E) gpio_amdpt(E) i2c_designware_platform(E) gpio_generic(E) sg(E)
[ 876.949914] i2c_designware_core(E) sch_fq_codel(E) parport_pc(E) drm(E) ppdev(E) lp(E) parport(E) fuse(E) nfnetlink(E) ip_tables(E) ext4 crc16 mbcache jbd2 sd_mod sfp mdio_i2c i2c_core txgbe ahci ngbe pcs_xpcs libahci libwx r8169 phylink libata realtek ptp pps_core video wmi
[ 876.949933] CPU: 14 UID: 0 PID: 0 Comm: swapper/14 Kdump: loaded Tainted: G W E 6.16.0-rc2+ #20 PREEMPT(voluntary)
[ 876.949935] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[ 876.949936] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[ 876.949936] RIP: 0010:__list_del_entry_valid_or_report+0x67/0x120
[ 876.949938] Code: 00 00 00 48 39 7d 08 0f 85 a6 00 00 00 5b b8 01 00 00 00 5d 41 5c e9 73 0d 93 ff 48 89 fe 48 c7 c7 a0 31 e8 89 e8 59 7c b3 ff <0f> 0b 31 c0 5b 5d 41 5c e9 57 0d 93 ff 48 89 fe 48 c7 c7 c8 31 e8
[ 876.949940] RSP: 0018:ffffaa73405d0c60 EFLAGS: 00010282
[ 876.949941] RAX: 0000000000000000 RBX: ffffead40445a348 RCX: 0000000000000000
[ 876.949942] RDX: 0000000000000105 RSI: 0000000000000001 RDI: 00000000ffffffff
[ 876.949943] RBP: 0000000000000000 R08: 000000010006dfde R09: ffffffff8a47d150
[ 876.949944] R10: ffffffff8a47d150 R11: 0000000000000003 R12: dead000000000122
[ 876.949945] R13: ffff9e3e9e5af700 R14: ffffead40445a348 R15: ffff9e3e9e5af720
[ 876.949946] FS: 0000000000000000(0000) GS:ffff9e3f135be000(0000) knlGS:0000000000000000
[ 876.949947] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 876.949948] CR2: 00007fa58b480048 CR3: 0000000156724000 CR4: 0000000000750ef0
[ 876.949949] PKRU: 55555554
[ 876.949950] Call Trace:
[ 876.949951] <IRQ>
[ 876.949952] __rmqueue_pcplist+0x53/0x2c0
[ 876.949955] alloc_pages_bulk_noprof+0x2e0/0x660
[ 876.949958] __page_pool_alloc_pages_slow+0xa9/0x400
[ 876.949961] page_pool_alloc_pages+0xa/0x20
[ 876.949963] wx_alloc_rx_buffers+0xd7/0x110 [libwx]
[ 876.949967] wx_clean_rx_irq+0x262/0x430 [libwx]
[ 876.949971] wx_poll+0x92/0x130 [libwx]
[ 876.949975] __napi_poll+0x28/0x190
[ 876.949977] net_rx_action+0x301/0x3f0
[ 876.949980] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949981] ? profile_tick+0x30/0x70
[ 876.949983] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949984] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949986] ? timerqueue_add+0xa3/0xc0
[ 876.949988] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949989] ? __raise_softirq_irqoff+0x16/0x70
[ 876.949991] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949993] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949994] ? wx_msix_clean_rings+0x41/0x50 [libwx]
[ 876.949998] handle_softirqs+0xf9/0x2c0
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-2-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jijie Shao says:
====================
net: hns3: use seq_file for debugfs
Arnd reported that there are two build warning for on-stasck
buffer oversize. As Arnd's suggestion, using seq file way
to avoid the stack buffer or kmalloc buffer allocating.
v2: https://lore.kernel.org/20250711061725.225585-1-shaojijie@huawei.com
v1: https://lore.kernel.org/20250708130029.1310872-1-shaojijie@huawei.com
====================
Link: https://patch.msgid.link/20250714061037.2616413-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch use seq_file for the following nodes:
tx_bd_queue_*/rx_bd_queue_*
This patch is the last modification to debugfs file.
Unused functions and variables are removed together.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250714061037.2616413-11-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch use seq_file for the following nodes:
mng_tbl/loopback/interrupt_info/reset_info/imp_info/ncl_config/
mac_tnl_status/service_task_info/vlan_config/ptp_info
This patch is the last modification to debugfs file
of hclge layer. Unused functions and variables
are removed together.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250714061037.2616413-10-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|