Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next into main
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next:
Patch #1 to #11 to shrink memory consumption for transaction objects:
struct nft_trans_chain { /* size: 120 (-32), cachelines: 2, members: 10 */
struct nft_trans_elem { /* size: 72 (-40), cachelines: 2, members: 4 */
struct nft_trans_flowtable { /* size: 80 (-48), cachelines: 2, members: 5 */
struct nft_trans_obj { /* size: 72 (-40), cachelines: 2, members: 4 */
struct nft_trans_rule { /* size: 80 (-32), cachelines: 2, members: 6 */
struct nft_trans_set { /* size: 96 (-24), cachelines: 2, members: 8 */
struct nft_trans_table { /* size: 56 (-40), cachelines: 1, members: 2 */
struct nft_trans_elem can now be allocated from kmalloc-96 instead of
kmalloc-128 slab.
Series from Florian Westphal. For the record, I have mangled patch #1
to add nft_trans_container_*() and use if for every transaction object.
I have also added BUILD_BUG_ON to ensure struct nft_trans always comes
at the beginning of the container transaction object. And few minor
cleanups, any new bugs are of my own.
Patch #12 simplify check for SCTP GSO in IPVS, from Ismael Luceno.
Patch #13 nf_conncount key length remains in the u32 bound, from Yunjian Wang.
Patch #14 removes unnecessary check for CTA_TIMEOUT_L3PROTO when setting
default conntrack timeouts via nfnetlink_cttimeout API, from
Lin Ma.
Patch #15 updates NFT_SECMARK_CTX_MAXLEN to 4096, SELinux could use
larger secctx names than the existing 256 bytes length.
Patch #16 adds a selftest to exercise nfnetlink_queue listeners leaving
nfnetlink_queue, from Florian Westphal.
Patch #17 increases hitcount from 255 to 65535 in xt_recent, from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I don't see anything checking that TCP_METRICS_ATTR_SADDR_IPV4
is at least 4 bytes long, and the policy doesn't have an entry
for this attribute at all (neither does it for IPv6 but v6 is
manually validated).
Reviewed-by: Eric Dumazet <edumazet@google.com>
Fixes: 3e7013ddf55a ("tcp: metrics: Allow selective get/del of tcp-metrics based on src IP")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull NFS client fix from Trond Myklebust:
- One more SUNRPC fix for the NFSv4.x backchannel timeouts
* tag 'nfs-for-6.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Fix backchannel reply, again
|
|
There were several instances of the string "assocat" in the kernel, which
should have been spelled "associat", with the various endings of -ive,
-ed, -ion, and sometimes beginnging with dis-.
Add to the spelling dictionary the corrections so that future instances
will be caught by checkpatch, and fix the instances found.
Originally noticed by accident with a 'git grep socat'.
Link: https://lkml.kernel.org/r/20240612001247.356867-1-jesse.brandeburg@intel.com
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On 'ethtool -x' with rss_context != 0, instead of calling the driver to
read the RSS settings for the context, just get the settings from the
rss_ctx xarray, and return them to the user with no driver involvement.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/2d0190fa29638f307ea720f882ebd41f6f867694.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While this is not needed to serialise the ethtool entry points (which
are all under RTNL), drivers may have cause to asynchronously access
dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
do this safely without needing to take the RTNL.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/7f9c15eb7525bf87af62c275dde3a8570ee8bf0a.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently passed as NULL, but will allow drivers to report back errors
when ethnl support for these ops is added.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/6e0012347d175fdd1280363d7bfa76a2f2777e17.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new API to create/modify/remove RSS contexts, that passes in the
newly-chosen context ID (not as a pointer) rather than leaving the
driver to choose it on create. Also pass in the ctx, allowing drivers
to easily use its private data area to store their hardware-specific
state.
Keep the existing .set_rxfh API for now as a fallback, but deprecate it
for custom contexts (rss_context != 0).
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/45f1fe61df2163c091ec394c9f52000c8b16cc3b.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since drivers are still choosing the context IDs, we have to force the
XArray to use the ID they've chosen rather than picking one ourselves,
and handle the case where they give us an ID that's already in use.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/801f5faa4cec87c65b2c6e27fb220c944bce593a.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Each context stores the RXFH settings (indir, key, and hfunc) as well
as optionally some driver private data.
Delete any still-existing contexts at netdev unregister time.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/cbd1c402cec38f2e03124f2ab65b4ae4e08bd90d.1719502240.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
net_dev->ethtool is a pointer to new struct ethtool_netdev_state, which
currently contains only the wol_enabled field.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/293a562278371de7534ed1eb17531838ca090633.1719502239.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Today sending a UDP GSO packet from a TUN device results in an EIO error:
import fcntl, os, struct
from socket import *
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001
IFF_NO_PI = 0x1000
UDP_SEGMENT = 103
tun_fd = os.open("/dev/net/tun", os.O_RDWR)
ifr = struct.pack("16sH", b"tun0", IFF_TUN | IFF_NO_PI)
fcntl.ioctl(tun_fd, TUNSETIFF, ifr)
os.system("ip addr add 192.0.2.1/24 dev tun0")
os.system("ip link set dev tun0 up")
s = socket(AF_INET, SOCK_DGRAM)
s.setsockopt(SOL_UDP, UDP_SEGMENT, 1200)
s.sendto(b"x" * 3000, ("192.0.2.2", 9)) # EIO
This is due to a check in the udp stack if the egress device offers
checksum offload. While TUN/TAP devices, by default, don't advertise this
capability because it requires support from the TUN/TAP reader.
However, the GSO stack has a software fallback for checksum calculation,
which we can use. This way we don't force UDP_SEGMENT users to handle the
EIO error and implement a segmentation fallback.
Lift the restriction so that UDP_SEGMENT can be used with any egress
device. We also need to adjust the UDP GSO code to match the GSO stack
expectation about ip_summed field, as set in commit 8d63bee643f1 ("net:
avoid skb_warn_bad_offload false positives on UFO"). Otherwise we will hit
the bad offload check.
Users should, however, expect a potential performance impact when
batch-sending packets with UDP_SEGMENT without checksum offload on the
egress device. In such case the packet payload is read twice: first during
the sendmsg syscall when copying data from user memory, and then in the GSO
stack for checksum computation. This double memory read can be less
efficient than a regular sendmsg where the checksum is calculated during
the initial data copy from user memory.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240626-linux-udpgso-v2-1-422dfcbd6b48@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This fixes the following deadlock introduced by 39a92a55be13
("bluetooth/l2cap: sync sock recv cb and release")
============================================
WARNING: possible recursive locking detected
6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
--------------------------------------------
kworker/u5:0/35 is trying to acquire lock:
ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_sock_recv_cb+0x44/0x1e0
but task is already holding lock:
ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_get_chan_by_scid+0xaf/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&chan->lock#2/1);
lock(&chan->lock#2/1);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u5:0/35:
#0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
process_one_work+0x750/0x930
#1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
at: process_one_work+0x44e/0x930
#2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_get_chan_by_scid+0xaf/0xd0
To fix the original problem this introduces l2cap_chan_lock at
l2cap_conless_channel to ensure that l2cap_sock_recv_cb is called with
chan->lock held.
Fixes: 89e856e124f9 ("bluetooth/l2cap: sync sock recv cb and release")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Syzbot hit warning in hci_conn_del() caused by freeing handle that was
not allocated using ida allocator.
This is caused by handle bigger than HCI_CONN_HANDLE_MAX passed by
hci_le_big_sync_established_evt(), which makes code think it's unset
connection.
Add same check for handle upper bound as in hci_conn_set_handle() to
prevent warning.
Link: https://syzkaller.appspot.com/bug?extid=b2545b087a01a7319474
Reported-by: syzbot+b2545b087a01a7319474@syzkaller.appspotmail.com
Fixes: 181a42edddf5 ("Bluetooth: Make handle of hci_conn be unique")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This fixes the following Smatch static checker warning:
net/bluetooth/iso.c:1364 iso_sock_recvmsg()
error: we previously assumed 'pi->conn->hcon' could be null (line 1359)
net/bluetooth/iso.c
1347 static int iso_sock_recvmsg(struct socket *sock, struct msghdr *msg,
1348 size_t len, int flags)
1349 {
1350 struct sock *sk = sock->sk;
1351 struct iso_pinfo *pi = iso_pi(sk);
1352
1353 BT_DBG("sk %p", sk);
1354
1355 if (test_and_clear_bit(BT_SK_DEFER_SETUP,
&bt_sk(sk)->flags)) {
1356 lock_sock(sk);
1357 switch (sk->sk_state) {
1358 case BT_CONNECT2:
1359 if (pi->conn->hcon &&
^^^^^^^^^^^^^^ If ->hcon is NULL
1360 test_bit(HCI_CONN_PA_SYNC,
&pi->conn->hcon->flags)) {
1361 iso_conn_big_sync(sk);
1362 sk->sk_state = BT_LISTEN;
1363 } else {
--> 1364 iso_conn_defer_accept(pi->conn->hcon);
^^^^^^^^^^^^^^
then we're toast
1365 sk->sk_state = BT_CONFIG;
1366 }
1367 release_sock(sk);
1368 return 0;
1369 case BT_CONNECTED:
1370 if (test_bit(BT_SK_PA_SYNC,
Fixes: fbdc4bc47268 ("Bluetooth: ISO: Use defer setup to separate PA sync and BIG sync")
Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.
CPU0 CPU1
---- ----
sock_close hci_rx_work
l2cap_sock_release hci_acldata_packet
l2cap_sock_kill l2cap_recv_frame
sk_free l2cap_conless_channel
l2cap_sock_recv_cb
If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.
Add a chan mutex in the rx callback of the sock to achieve synchronization between
the sock release and recv cb.
Sock is dead, so set chan data to NULL, avoid others use invalid sock pointer.
Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
hci_le_big_sync_established_evt is necessary to filter out cases where the
handle value is belonging to ida id range, otherwise ida will be erroneously
released in hci_conn_cleanup.
Fixes: 181a42edddf5 ("Bluetooth: Make handle of hci_conn be unique")
Reported-by: syzbot+b2545b087a01a7319474@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b2545b087a01a7319474
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
syzbot is reporting that calling hci_release_dev() from hci_error_reset()
due to hci_dev_put() from hci_error_reset() can cause deadlock at
destroy_workqueue(), for hci_error_reset() is called from
hdev->req_workqueue which destroy_workqueue() needs to flush.
We need to make sure that hdev->{rx_work,cmd_work,tx_work} which are
queued into hdev->workqueue and hdev->{power_on,error_reset} which are
queued into hdev->req_workqueue are no longer running by the moment
destroy_workqueue(hdev->workqueue);
destroy_workqueue(hdev->req_workqueue);
are called from hci_release_dev().
Call cancel_work_sync() on these work items from hci_unregister_dev()
as soon as hdev->list is removed from hci_dev_list.
Reported-by: syzbot <syzbot+da0a9c9721e36db712e8@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=da0a9c9721e36db712e8
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
qos->ucast interval reffers to the SDU interval, and should not
be set to the interval value reported by the LE CIS Established
event since the latter reffers to the ISO interval. These two
interval are not the same thing:
BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 6, Part G
Isochronous interval:
The time between two consecutive BIS or CIS events (designated
ISO_Interval in the Link Layer)
SDU interval:
The nominal time between two consecutive SDUs that are sent or
received by the upper layer.
So this instead uses the following formula from the spec to calculate
the resulting SDU interface:
BLUETOOTH CORE SPECIFICATION Version 5.4 | Vol 6, Part G
page 3075:
Transport_Latency_C_To_P = CIG_Sync_Delay + (FT_C_To_P) ×
ISO_Interval + SDU_Interval_C_To_P
Transport_Latency_P_To_C = CIG_Sync_Delay + (FT_P_To_C) ×
ISO_Interval + SDU_Interval_P_To_C
Link: https://github.com/bluez/bluez/issues/823
Fixes: 2be22f1941d5 ("Bluetooth: hci_event: Fix parsing of CIS Established Event")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Some Broadcom controllers found on Apple Silicon machines abuse the
reserved bits inside the PHY fields of LE Extended Advertising Report
events for additional flags. Add a quirk to drop these and correctly
extract the Primary/Secondary_PHY field.
The following excerpt from a btmon trace shows a report received with
"Reserved" for "Primary PHY" on a 4388 controller:
> HCI Event: LE Meta Event (0x3e) plen 26
LE Extended Advertising Report (0x0d)
Num reports: 1
Entry 0
Event type: 0x2515
Props: 0x0015
Connectable
Directed
Use legacy advertising PDUs
Data status: Complete
Reserved (0x2500)
Legacy PDU Type: Reserved (0x2515)
Address type: Random (0x01)
Address: 00:00:00:00:00:00 (Static)
Primary PHY: Reserved
Secondary PHY: No packets
SID: no ADI field (0xff)
TX power: 127 dBm
RSSI: -60 dBm (0xc4)
Periodic advertising interval: 0.00 msec (0x0000)
Direct address type: Public (0x00)
Direct address: 00:00:00:00:00:00 (Apple, Inc.)
Data length: 0x00
Cc: stable@vger.kernel.org
Fixes: 2e7ed5f5e69b ("Bluetooth: hci_sync: Use advertised PHYs on hci_le_ext_create_conn_sync")
Reported-by: Janne Grunau <j@jannau.net>
Closes: https://lore.kernel.org/all/Zjz0atzRhFykROM9@robin
Tested-by: Janne Grunau <j@jannau.net>
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Due to a late review, revert and re-fix a recent crasher fix
* tag 'nfsd-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
Revert "nfsd: fix oops when reading pool_stats before server is started"
nfsd: initialise nfsd_info.mutex early.
|
|
Support tracking of up to 65535 packets per table entry instead of just
255 to better facilitate longer term tracking or higher throughput
scenarios.
Note how this aligns sizes of struct recent_entry's 'nstamps' and
'index' fields when 'nstamps' was larger before. This is unnecessary as
the value of 'nstamps' grows along with that of 'index' after being
initialized to 1 (see recent_entry_update()). Its value will thus never
exceed that of 'index' and therefore does not need to provide space for
larger values.
Requested-by: Fabio <pedretti.fabio@gmail.com>
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1745
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan into main
|
|
Add the ability to flash the modules' firmware by implementing the
interface between the user space and the kernel.
Example from a succeeding implementation:
# ethtool --flash-module-firmware swp40 file test.bin
Transceiver module firmware flashing started for device swp40
Transceiver module firmware flashing in progress for device swp40
Progress: 99%
Transceiver module firmware flashing completed for device swp40
In addition, add infrastructure that allows modules to set socket-specific
private data. This ensures that when a socket is closed from user space
during the flashing process, the right socket halts sending notifications
to user space until the work item is completed.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to the CMIS standard, the firmware update process is done using
a CDB commands sequence.
Implement a work that will be triggered from the module layer in the
next patch the will initiate and execute all the CDB commands in order, to
eventually complete the firmware update process.
This flashing process includes, writing the firmware image, running the new
firmware image and committing it after testing, so that it will run upon
reset.
This work will also notify user space about the progress of the firmware
update process.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
CDB (Command Data Block Message Communication) reads and writes are
performed on memory map pages 9Fh-AFh according to the CMIS standard,
section 8.20 of revision 5.2.
Page 9Fh is used to specify the CDB command to be executed and also
provides an area for a local payload (LPL).
According to the CMIS standard, the firmware update process is done using
a CDB commands sequence that will be implemented in the next patch.
The kernel interface that will implement the firmware update using CDB
command will include 2 layers that will be added under ethtool:
* The upper layer that will be triggered from the module layer, is
cmis_fw_update.
* The lower one is cmis_cdb.
In the future there might be more operations to implement using CDB
commands. Therefore, the idea is to keep the CDB interface clean and the
cmis_fw_update specific to the CDB commands handling it.
These two layers will communicate using the API the consists of three
functions:
- struct ethtool_cmis_cdb *
ethtool_cmis_cdb_init(struct net_device *dev,
struct ethtool_module_fw_flash_params *params);
- void ethtool_cmis_cdb_fini(struct ethtool_cmis_cdb *cdb);
- int ethtool_cmis_cdb_execute_cmd(struct net_device *dev,
struct ethtool_cmis_cdb_cmd_args *args);
Add the CDB layer to support initializing, finishing and executing CDB
commands:
* The initialization process will include creating of an ethtool_cmis_cdb
instance, querying the module CDB support, entering and validating the
password from user space (CMD 0x0000) and querying the module features
(CMD 0x0040).
* The finishing API will simply free the ethtool_cmis_cdb instance.
* The executing process will write the CDB command to EEPROM using
set_module_eeprom_by_page() that was presented earlier, and will
process the reply from EEPROM.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some operations cannot be performed during the firmware flashing
process.
For example:
- Port must be down during the whole flashing process to avoid packet loss
while committing reset for example.
- Writing to EEPROM interrupts the flashing process, so operations like
ethtool dump, module reset, get and set power mode should be vetoed.
- Split port firmware flashing should be vetoed.
In order to veto those scenarios, add a flag in 'struct net_device' that
indicates when a firmware flash is taking place on the module and use it
to prevent interruptions during the process.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add progress notifications ability to user space while flashing modules'
firmware by implementing the interface between the user space and the
kernel.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In some production workloads we noticed that connections could
sometimes close extremely prematurely with ETIMEDOUT after
transmitting only 1 TLP and RTO retransmission (when we would normally
expect roughly tcp_retries2 = TCP_RETR2 = 15 RTOs before a connection
closes with ETIMEDOUT).
From tracing we determined that these workloads can suffer from a
scenario where in fast recovery, after some retransmits, a DSACK undo
can happen at a point where the scoreboard is totally clear (we have
retrans_out == sacked_out == lost_out == 0). In such cases, calling
tcp_try_keep_open() means that we do not execute any code path that
clears tp->retrans_stamp to 0. That means that tp->retrans_stamp can
remain erroneously set to the start time of the undone fast recovery,
even after the fast recovery is undone. If minutes or hours elapse,
and then a TLP/RTO/RTO sequence occurs, then the start_ts value in
retransmits_timed_out() (which is from tp->retrans_stamp) will be
erroneously ancient (left over from the fast recovery undone via
DSACKs). Thus this ancient tp->retrans_stamp value can cause the
connection to die very prematurely with ETIMEDOUT via
tcp_write_err().
The fix: we change DSACK undo in fast recovery (TCP_CA_Recovery) to
call tcp_try_to_open() instead of tcp_try_keep_open(). This ensures
that if no retransmits are in flight at the time of DSACK undo in fast
recovery then we properly zero retrans_stamp. Note that calling
tcp_try_to_open() is more consistent with other loss recovery
behavior, since normal fast recovery (CA_Recovery) and RTO recovery
(CA_Loss) both normally end when tp->snd_una meets or exceeds
tp->high_seq and then in tcp_fastretrans_alert() the "default" switch
case executes tcp_try_to_open(). Also note that by inspection this
change to call tcp_try_to_open() implies at least one other nice bug
fix, where now an ECE-marked DSACK that causes an undo will properly
invoke tcp_enter_cwr() rather than ignoring the ECE mark.
Fixes: c7d9d6a185a7 ("tcp: undo on DSACK during recovery")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This flag is annoying because it puts a lot of logic into mac80211
that could just as well be in the driver (only iwlmvm uses it) and
the implementation is also broken for MLO.
Remove the flag in favour of calling drv_mgd_prepare_tx() without
any conditions even for the deauth-while-assoc case. The drivers
that implement it can take the appropriate actions, which for the
only user of DEAUTH_NEED_MGD_TX_PREP (iwlmvm) is a bit more tricky
than the implementation in mac80211 is anyway, and all others have
no need and can just exit if info->was_assoc is set.
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240627132527.94924bcc9c9e.I328a219e45f2e2724cd52e75bb9feee3bf21a463@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The beacon processing should be fully done in the context of the link.
This also resolves a bug with CQM handling with MLO as in such a case
the RSSI thresholds configuration is maintained in the link context and
not in the interface context.
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Link: https://patch.msgid.link/20240627104600.bb2f0f697881.I675b6a8a186b717f3eef79113c27361fd1a7622c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This is now unused in nl80211, so there's no need to set
the pointer/length.
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240627104411.f8ac49a8cfe0.Ic01f775903f0acd10b9ba77eb39d4ed50e709173@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When a key is requested by userspace, there's really no need
to include the key data, the sequence counter is really what
userspace needs in this case. The fact that it's included is
just a historic quirk.
Remove the key data.
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240627104411.b6a4f097e4ea.I7e6cc976cb9e8a80ef25a3351330f313373b4578@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This reverts commit 934c29999b57b835d65442da6f741d5e27f3b584.
This triggered a usercopy BUG() in systems with HIGHMEM, reported
by the test robot in:
https://lore.kernel.org/oe-lkp/202406161539.b5ff7b20-oliver.sang@intel.com
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://patch.msgid.link/20240626070153.759257-1-sagi@grimberg.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Highlights this time are:
- cfg80211/nl80211:
* improvements for 6 GHz regulatory flexibility
- mac80211:
* use generic netdev stats
* multi-link improvements/fixes
- brcmfmac:
* MFP support (to enable WPA3)
- wilc1000:
* suspend/resume improvements
- iwlwifi:
* remove support for older FW for new devices
* fast resume (keeping the device configured)
- wl18xx:
* support newer firmware versions
* tag 'wireless-next-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (100 commits)
wifi: brcmfmac: of: Support interrupts-extended
wifi: brcmsmac: advertise MFP_CAPABLE to enable WPA3
net: rfkill: Correct return value in invalid parameter case
wifi: mac80211: fix NULL dereference at band check in starting tx ba session
wifi: iwlwifi: mvm: fix rs.h kernel-doc
wifi: iwlwifi: fw: api: datapath: fix kernel-doc
wifi: iwlwifi: fix remaining mistagged kernel-doc comments
wifi: iwlwifi: fix prototype mismatch kernel-doc warnings
wifi: iwlwifi: fix kernel-doc in iwl-fh.h
wifi: iwlwifi: fix kernel-doc in iwl-trans.h
wifi: iwlwifi: pcie: fix kernel-doc
wifi: iwlwifi: dvm: fix kernel-doc warnings
wifi: iwlwifi: mvm: don't log error for failed UATS table read
wifi: iwlwifi: trans: make bad state warnings
wifi: iwlwifi: fw: api: fix some kernel-doc
wifi: iwlwifi: mvm: remove init_dbg module parameter
wifi: iwlwifi: update the BA notification API
wifi: iwlwifi: mvm: always unblock EMLSR on ROC end
wifi: iwlwifi: mvm: use IWL_FW_CHECK for link ID check
wifi: iwlwifi: mvm: don't flush BSSes on restart with MLD API
...
====================
Link: https://patch.msgid.link/20240627114135.28507-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
e3f02f32a050 ("ionic: fix kernel panic due to multi-buffer handling")
d9c04209990b ("ionic: Mark error paths in the data path as unlikely")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Just a few changes:
- maintainers: Larry Finger sadly passed away
- maintainers: ath trees are in their group now
- TXQ FQ quantum configuration fix
- TI wl driver: work around stuck FW in AP mode
- mac80211: disable softirqs in some new code
needing that
* tag 'wireless-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
MAINTAINERS: wifi: update ath.git location
MAINTAINERS: Remembering Larry Finger
wifi: mac80211: disable softirqs for queued frame handling
wifi: cfg80211: restrict NL80211_ATTR_TXQ_QUANTUM values
wifi: wlcore: fix wlcore AP mode
====================
Link: https://patch.msgid.link/20240627083627.15312-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can, bpf and netfilter.
There are a bunch of regressions addressed here, but hopefully nothing
spectacular. We are still waiting the driver fix from Intel, mentioned
by Jakub in the previous networking pull.
Current release - regressions:
- core: add softirq safety to netdev_rename_lock
- tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed
TFO
- batman-adv: fix RCU race at module unload time
Previous releases - regressions:
- openvswitch: get related ct labels from its master if it is not
confirmed
- eth: bonding: fix incorrect software timestamping report
- eth: mlxsw: fix memory corruptions on spectrum-4 systems
- eth: ionic: use dev_consume_skb_any outside of napi
Previous releases - always broken:
- netfilter: fully validate NFT_DATA_VALUE on store to data registers
- unix: several fixes for OoB data
- tcp: fix race for duplicate reqsk on identical SYN
- bpf:
- fix may_goto with negative offset
- fix the corner case with may_goto and jump to the 1st insn
- fix overrunning reservations in ringbuf
- can:
- j1939: recover socket queue on CAN bus error during BAM
transmission
- mcp251xfd: fix infinite loop when xmit fails
- dsa: microchip: monitor potential faults in half-duplex mode
- eth: vxlan: pull inner IP header in vxlan_xmit_one()
- eth: ionic: fix kernel panic due to multi-buffer handling
Misc:
- selftest: unix tests refactor and a lot of new cases added"
* tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
net: mana: Fix possible double free in error handling path
selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c.
af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head.
selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c
selftest: af_unix: Check SIGURG after every send() in msg_oob.c
selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c
af_unix: Don't stop recv() at consumed ex-OOB skb.
selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c.
af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.
af_unix: Stop recv(MSG_PEEK) at consumed OOB skb.
selftest: af_unix: Add msg_oob.c.
selftest: af_unix: Remove test_unix_oob.c.
tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
net: usb: qmi_wwan: add Telit FN912 compositions
tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFO
ionic: use dev_consume_skb_any outside of napi
net: dsa: microchip: fix wrong register write when masking interrupt
Fix race for duplicate reqsk on identical SYN
ibmvnic: Add tx check to prevent skb leak
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains two Netfilter fixes for net:
Patch #1 fixes CONFIG_SYSCTL=n for a patch coming in the previous PR
to move the sysctl toggle to enable SRv6 netfilter hooks from
nf_conntrack to the core, from Jianguo Wu.
Patch #2 fixes a possible pointer leak to userspace due to insufficient
validation of NFT_DATA_VALUE.
Linus found this pointer leak to userspace via zdi-disclosures@ and
forwarded the notice to Netfilter maintainers, he appears as reporter
because whoever found this issue never approached Netfilter
maintainers neither via security@ nor in private.
netfilter pull request 24-06-27
* tag 'nf-24-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
netfilter: fix undefined reference to 'netfilter_lwtunnel_*' when CONFIG_SYSCTL=n
====================
Link: https://patch.msgid.link/20240626233845.151197-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Even if OOB data is recv()ed, ioctl(SIOCATMARK) must return 1 when the
OOB skb is at the head of the receive queue and no new OOB data is queued.
Without fix:
# RUN msg_oob.no_peek.oob ...
# msg_oob.c:305:oob:Expected answ[0] (0) == oob_head (1)
# oob: Test terminated by assertion
# FAIL msg_oob.no_peek.oob
not ok 2 msg_oob.no_peek.oob
With fix:
# RUN msg_oob.no_peek.oob ...
# OK msg_oob.no_peek.oob
ok 2 msg_oob.no_peek.oob
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, recv() is stopped at a consumed OOB skb even if a new
OOB skb is queued and we can ignore the old OOB skb.
>>> from socket import *
>>> c1, c2 = socket(AF_UNIX, SOCK_STREAM)
>>> c1.send(b'hellowor', MSG_OOB)
8
>>> c2.recv(1, MSG_OOB) # consume OOB data stays at middle of recvq.
b'r'
>>> c1.send(b'ld', MSG_OOB)
2
>>> c2.recv(10) # recv() stops at the old consumed OOB
b'hellowo' # should be 'hellowol'
manage_oob() should not stop recv() at the old consumed OOB skb if
there is a new OOB data queued.
Note that TCP behaviour is apparently wrong in this test case because
we can recv() the same OOB data twice.
Without fix:
# RUN msg_oob.no_peek.ex_oob_ahead_break ...
# msg_oob.c:138:ex_oob_ahead_break:AF_UNIX :hellowo
# msg_oob.c:139:ex_oob_ahead_break:Expected:hellowol
# msg_oob.c:141:ex_oob_ahead_break:Expected ret[0] (7) == expected_len (8)
# ex_oob_ahead_break: Test terminated by assertion
# FAIL msg_oob.no_peek.ex_oob_ahead_break
not ok 11 msg_oob.no_peek.ex_oob_ahead_break
With fix:
# RUN msg_oob.no_peek.ex_oob_ahead_break ...
# msg_oob.c:146:ex_oob_ahead_break:AF_UNIX :hellowol
# msg_oob.c:147:ex_oob_ahead_break:TCP :helloworl
# OK msg_oob.no_peek.ex_oob_ahead_break
ok 11 msg_oob.no_peek.ex_oob_ahead_break
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Let's say a socket send()s "hello" with MSG_OOB and "world" without flags,
>>> from socket import *
>>> c1, c2 = socketpair(AF_UNIX)
>>> c1.send(b'hello', MSG_OOB)
5
>>> c1.send(b'world')
5
and its peer recv()s "hell" and "o".
>>> c2.recv(10)
b'hell'
>>> c2.recv(1, MSG_OOB)
b'o'
Now the consumed OOB skb stays at the head of recvq to return a correct
value for ioctl(SIOCATMARK), which is broken now and fixed by a later
patch.
Then, if peer issues recv() with MSG_DONTWAIT, manage_oob() returns NULL,
so recv() ends up with -EAGAIN.
>>> c2.setblocking(False) # This causes -EAGAIN even with available data
>>> c2.recv(5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
BlockingIOError: [Errno 11] Resource temporarily unavailable
However, next recv() will return the following available data, "world".
>>> c2.recv(5)
b'world'
When the consumed OOB skb is at the head of the queue, we need to fetch
the next skb to fix the weird behaviour.
Note that the issue does not happen without MSG_DONTWAIT because we can
retry after manage_oob().
This patch also adds a test case that covers the issue.
Without fix:
# RUN msg_oob.no_peek.ex_oob_break ...
# msg_oob.c:134:ex_oob_break:AF_UNIX :Resource temporarily unavailable
# msg_oob.c:135:ex_oob_break:Expected:ld
# msg_oob.c:137:ex_oob_break:Expected ret[0] (-1) == expected_len (2)
# ex_oob_break: Test terminated by assertion
# FAIL msg_oob.no_peek.ex_oob_break
not ok 8 msg_oob.no_peek.ex_oob_break
With fix:
# RUN msg_oob.no_peek.ex_oob_break ...
# OK msg_oob.no_peek.ex_oob_break
ok 8 msg_oob.no_peek.ex_oob_break
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
After consuming OOB data, recv() reading the preceding data must break at
the OOB skb regardless of MSG_PEEK.
Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is
not compliant with TCP.
>>> from socket import *
>>> c1, c2 = socketpair(AF_UNIX)
>>> c1.send(b'hello', MSG_OOB)
5
>>> c1.send(b'world')
5
>>> c2.recv(1, MSG_OOB)
b'o'
>>> c2.recv(9, MSG_PEEK) # This should return b'hell'
b'hellworld' # even with enough buffer.
Let's fix it by returning NULL for consumed skb and unlinking it only if
MSG_PEEK is not specified.
This patch also adds test cases that add recv(MSG_PEEK) before each recv().
Without fix:
# RUN msg_oob.peek.oob_ahead_break ...
# msg_oob.c:134:oob_ahead_break:AF_UNIX :hellworld
# msg_oob.c:135:oob_ahead_break:Expected:hell
# msg_oob.c:137:oob_ahead_break:Expected ret[0] (9) == expected_len (4)
# oob_ahead_break: Test terminated by assertion
# FAIL msg_oob.peek.oob_ahead_break
not ok 13 msg_oob.peek.oob_ahead_break
With fix:
# RUN msg_oob.peek.oob_ahead_break ...
# OK msg_oob.peek.oob_ahead_break
ok 13 msg_oob.peek.oob_ahead_break
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
register store validation for NFT_DATA_VALUE is conditional, however,
the datatype is always either NFT_DATA_VALUE or NFT_DATA_VERDICT. This
only requires a new helper function to infer the register type from the
set datatype so this conditional check can be removed. Otherwise,
pointer to chain object can be leaked through the registers.
Fixes: 96518518cc41 ("netfilter: add nftables")
Reported-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add the ability to send out RFC-3948 NAT keepalives from the xfrm stack.
To use, Userspace sets an XFRM_NAT_KEEPALIVE_INTERVAL integer property when
creating XFRM outbound states which denotes the number of seconds between
keepalive messages.
Keepalive messages are sent from a per net delayed work which iterates over
the xfrm states. The logic is guarded by the xfrm state spinlock due to the
xfrm state walk iterator.
Possible future enhancements:
- Adding counters to keep track of sent keepalives.
- deduplicate NAT keepalives between states sharing the same nat keepalive
parameters.
- provisioning hardware offloads for devices capable of implementing this.
- revise xfrm state list to use an rcu list in order to avoid running this
under spinlock.
Suggested-by: Paul Wouters <paul.wouters@aiven.io>
Tested-by: Paul Wouters <paul.wouters@aiven.io>
Tested-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
rfkill_set_hw_state_reason() does not return current combined
block state when its parameter @reason is invalid, that is
wrong according to its comments, fix it by correcting the
value returned.
Also reformat the WARN while at it.
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://patch.msgid.link/1718287476-28227-1-git-send-email-quic_zijuhu@quicinc.com
[edit/reformat commit message, remove unneeded variable]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In MLD connection, link_data/link_conf are dynamically allocated. They
don't point to vif->bss_conf. So, there will be no chanreq assigned to
vif->bss_conf and then the chan will be NULL. Tweak the code to check
ht_supported/vht_supported/has_he/has_eht on sta deflink.
Crash log (with rtw89 version under MLO development):
[ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 9890.526102] #PF: supervisor read access in kernel mode
[ 9890.526105] #PF: error_code(0x0000) - not-present page
[ 9890.526109] PGD 0 P4D 0
[ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI
[ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G OE 6.9.0 #1
[ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018
[ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core]
[ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
[ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3
All code
========
0: f7 e8 imul %eax
2: d5 (bad)
3: 93 xchg %eax,%ebx
4: 3e ea ds (bad)
6: 48 83 c4 28 add $0x28,%rsp
a: 89 d8 mov %ebx,%eax
c: 5b pop %rbx
d: 41 5c pop %r12
f: 41 5d pop %r13
11: 41 5e pop %r14
13: 41 5f pop %r15
15: 5d pop %rbp
16: c3 retq
17: cc int3
18: cc int3
19: cc int3
1a: cc int3
1b: 49 8b 84 24 e0 f1 ff mov -0xe20(%r12),%rax
22: ff
23: 48 8b 80 90 1b 00 00 mov 0x1b90(%rax),%rax
2a:* 83 38 03 cmpl $0x3,(%rax) <-- trapping instruction
2d: 0f 84 37 fe ff ff je 0xfffffffffffffe6a
33: bb ea ff ff ff mov $0xffffffea,%ebx
38: eb cc jmp 0x6
3a: 49 rex.WB
3b: 8b .byte 0x8b
3c: 84 24 10 test %ah,(%rax,%rdx,1)
3f: f3 repz
Code starting with the faulting instruction
===========================================
0: 83 38 03 cmpl $0x3,(%rax)
3: 0f 84 37 fe ff ff je 0xfffffffffffffe40
9: bb ea ff ff ff mov $0xffffffea,%ebx
e: eb cc jmp 0xffffffffffffffdc
10: 49 rex.WB
11: 8b .byte 0x8b
12: 84 24 10 test %ah,(%rax,%rdx,1)
15: f3 repz
[ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246
[ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8
[ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685
[ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873
[ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70
[ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000
[ 9890.526313] FS: 0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000
[ 9890.526316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0
[ 9890.526321] Call Trace:
[ 9890.526324] <TASK>
[ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479)
[ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713)
[ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3))
[ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Link: https://patch.msgid.link/20240617115217.22344-1-kevin_yang@realtek.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This will allow the low level driver to take different actions for
different flows.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240618192529.739036208b6e.Ie18a2fe8e02bf2717549d39420b350cfdaf3d317@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
As noticed by syzbot, calling ieee80211_handle_queued_frames()
(and actually handling frames there) requires softirqs to be
disabled, since we call into the RX code. Fix that in the case
of cleaning up frames left over during shutdown.
Fixes: 177c6ae9725d ("wifi: mac80211: handle tasklet frames before stopping")
Reported-by: syzbot+1d516edf1e74469ba5d3@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20240626091559.cd6f08105a6e.I74778610a5ff2cf8680964698131099d2960352a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
syzbot is able to trigger softlockups, setting NL80211_ATTR_TXQ_QUANTUM
to 2^31.
We had a similar issue in sch_fq, fixed with commit
d9e15a273306 ("pkt_sched: fq: do not accept silly TCA_FQ_QUANTUM")
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/1:0:24]
Modules linked in:
irq event stamp: 131135
hardirqs last enabled at (131134): [<ffff80008ae8778c>] __exit_to_kernel_mode arch/arm64/kernel/entry-common.c:85 [inline]
hardirqs last enabled at (131134): [<ffff80008ae8778c>] exit_to_kernel_mode+0xdc/0x10c arch/arm64/kernel/entry-common.c:95
hardirqs last disabled at (131135): [<ffff80008ae85378>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (131135): [<ffff80008ae85378>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last enabled at (125892): [<ffff80008907e82c>] neigh_hh_init net/core/neighbour.c:1538 [inline]
softirqs last enabled at (125892): [<ffff80008907e82c>] neigh_resolve_output+0x268/0x658 net/core/neighbour.c:1553
softirqs last disabled at (125896): [<ffff80008904166c>] local_bh_disable+0x10/0x34 include/linux/bottom_half.h:19
CPU: 1 PID: 24 Comm: kworker/1:0 Not tainted 6.9.0-rc7-syzkaller-gfda5695d692c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: mld mld_ifc_work
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __list_del include/linux/list.h:195 [inline]
pc : __list_del_entry include/linux/list.h:218 [inline]
pc : list_move_tail include/linux/list.h:310 [inline]
pc : fq_tin_dequeue include/net/fq_impl.h:112 [inline]
pc : ieee80211_tx_dequeue+0x6b8/0x3b4c net/mac80211/tx.c:3854
lr : __list_del_entry include/linux/list.h:218 [inline]
lr : list_move_tail include/linux/list.h:310 [inline]
lr : fq_tin_dequeue include/net/fq_impl.h:112 [inline]
lr : ieee80211_tx_dequeue+0x67c/0x3b4c net/mac80211/tx.c:3854
sp : ffff800093d36700
x29: ffff800093d36a60 x28: ffff800093d36960 x27: dfff800000000000
x26: ffff0000d800ad50 x25: ffff0000d800abe0 x24: ffff0000d800abf0
x23: ffff0000e0032468 x22: ffff0000e00324d4 x21: ffff0000d800abf0
x20: ffff0000d800abf8 x19: ffff0000d800abf0 x18: ffff800093d363c0
x17: 000000000000d476 x16: ffff8000805519dc x15: ffff7000127a6cc8
x14: 1ffff000127a6cc8 x13: 0000000000000004 x12: ffffffffffffffff
x11: ffff7000127a6cc8 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff80009287aa08 x4 : 0000000000000008 x3 : ffff80008034c7fc
x2 : ffff0000e0032468 x1 : 00000000da0e46b8 x0 : ffff0000e0032470
Call trace:
__list_del include/linux/list.h:195 [inline]
__list_del_entry include/linux/list.h:218 [inline]
list_move_tail include/linux/list.h:310 [inline]
fq_tin_dequeue include/net/fq_impl.h:112 [inline]
ieee80211_tx_dequeue+0x6b8/0x3b4c net/mac80211/tx.c:3854
wake_tx_push_queue net/mac80211/util.c:294 [inline]
ieee80211_handle_wake_tx_queue+0x118/0x274 net/mac80211/util.c:315
drv_wake_tx_queue net/mac80211/driver-ops.h:1350 [inline]
schedule_and_wake_txq net/mac80211/driver-ops.h:1357 [inline]
ieee80211_queue_skb+0x18e8/0x2244 net/mac80211/tx.c:1664
ieee80211_tx+0x260/0x400 net/mac80211/tx.c:1966
ieee80211_xmit+0x278/0x354 net/mac80211/tx.c:2062
__ieee80211_subif_start_xmit+0xab8/0x122c net/mac80211/tx.c:4338
ieee80211_subif_start_xmit+0xe0/0x438 net/mac80211/tx.c:4532
__netdev_start_xmit include/linux/netdevice.h:4903 [inline]
netdev_start_xmit include/linux/netdevice.h:4917 [inline]
xmit_one net/core/dev.c:3531 [inline]
dev_hard_start_xmit+0x27c/0x938 net/core/dev.c:3547
__dev_queue_xmit+0x1678/0x33fc net/core/dev.c:4341
dev_queue_xmit include/linux/netdevice.h:3091 [inline]
neigh_resolve_output+0x558/0x658 net/core/neighbour.c:1563
neigh_output include/net/neighbour.h:542 [inline]
ip6_finish_output2+0x104c/0x1ee8 net/ipv6/ip6_output.c:137
ip6_finish_output+0x428/0x7a0 net/ipv6/ip6_output.c:222
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip6_output+0x270/0x594 net/ipv6/ip6_output.c:243
dst_output include/net/dst.h:450 [inline]
NF_HOOK+0x160/0x4f0 include/linux/netfilter.h:314
mld_sendpack+0x7b4/0x10f4 net/ipv6/mcast.c:1818
mld_send_cr net/ipv6/mcast.c:2119 [inline]
mld_ifc_work+0x840/0xd0c net/ipv6/mcast.c:2650
process_one_work+0x7b8/0x15d4 kernel/workqueue.c:3267
process_scheduled_works kernel/workqueue.c:3348 [inline]
worker_thread+0x938/0xef4 kernel/workqueue.c:3429
kthread+0x288/0x310 kernel/kthread.c:388
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240615160800.250667-1-edumazet@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|