| Age | Commit message (Collapse) | Author |
|
Move struct device from ism_dev and smc_lo_dev to dibs_dev, and define a
corresponding release function. Free ism_dev in ism_remove() and smc_lo_dev
in smc_lo_dev_remove().
Replace smcd->ops->get_dev(smcd) by using dibs->dev directly.
An alternative design would be to embed dibs_dev as a field in ism_dev and
do the same for other dibs device driver specific structs. However that
would have the disadvantage that each dibs device driver needs to allocate
dibs_dev and each dibs device driver needs a different device release
function. The advantage would be that ism_dev and other device driver
specific structs would be covered by device reference counts.
Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
Co-developed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Link: https://patch.msgid.link/20250918110500.1731261-9-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Move the device add() and remove() functions from ism_client to
dibs_client_ops and call add_dev()/del_dev() for ism devices and
dibs_loopback devices. dibs_client_ops->add_dev() = smcd_register_dev() for
the smc_dibs_client. This is the first step to handle ism and loopback
devices alike (as dibs devices) in the smc dibs client.
Define dibs_dev->ops and move smcd_ops->get_chid to
dibs_dev_ops->get_fabric_id() for ism and loopback devices. See below for
why this needs to be in the same patch as dibs_client_ops->add_dev().
The following changes contain intermediate steps, that will be obsoleted by
follow-on patches, once more functionality has been moved to dibs:
Use different smcd_ops and max_dmbs for ism and loopback. Follow-on patches
will change SMC-D to directly use dibs_ops instead of smcd_ops.
In smcd_register_dev() it is now necessary to identify a dibs_loopback
device before smcd_dev and smcd_ops->get_chid() are available. So provide
dibs_dev_ops->get_fabric_id() in this patch and evaluate it in
smc_ism_is_loopback().
Call smc_loopback_init() in smcd_register_dev() and call
smc_loopback_exit() in smcd_unregister_dev() to handle the functionality
that is still in smc_loopback. Follow-on patches will move all smc_loopback
code to dibs_loopback.
In smcd_[un]register_dev() use only ism device name, this will be replaced
by dibs device name by a follow-on patch.
End of changes with intermediate parts.
Allocate an smcd event workqueue for all dibs devices, although
dibs_loopback does not generate events.
Use kernel memory instead of devres memory for smcd_dev and smcd->conn.
Since commit a72178cfe855 ("net/smc: Fix dependency of SMC on ISM") an ism
device and its driver can have a longer lifetime than the smc module, so
smc should not rely on devres to free its resources [1]. It is now the
responsibility of the smc client to free smcd and smcd->conn for all dibs
devices, ism devices as well as loopback. Call client->ops->del_dev() for
all existing dibs devices in dibs_unregister_client(), so all device
related structures can be freed in the client.
When dibs_unregister_client() is called in the context of smc_exit() or
smc_core_reboot_event(), these functions have already called
smc_lgrs_shutdown() which calls smc_smcd_terminate_all(smcd) and sets
going_away. This is done a second time in smcd_unregister_dev(). This is
analogous to how smcr is handled in these functions, by calling first
smc_lgrs_shutdown() and then smc_ib_unregister_client() >
smc_ib_remove_dev(), so leave it that way. It may be worth investigating,
whether smc_lgrs_shutdown() is still required or useful.
Remove CONFIG_SMC_LO. CONFIG_DIBS_LO now controls whether a dibs loopback
device exists or not.
Link: https://www.kernel.org/doc/Documentation/driver-model/devres.txt [1]
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Link: https://patch.msgid.link/20250918110500.1731261-8-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Formally register smc as dibs client. Functionality will be moved by
follow-on patches from ism_client to dibs_client until eventually
ism_client can be removed.
As DIBS is only a shim layer without any dependencies, we can depend SMC
on DIBS without adding indirect dependencies. A follow-on patch will
remove dependency of SMC on ISM.
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Link: https://patch.msgid.link/20250918110500.1731261-5-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Create the file structure for a 'DIBS - Direct Internal Buffer Sharing'
shim layer that will provide generic functionality and declarations for
dibs device drivers and dibs clients.
Following patches will add functionality.
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://patch.msgid.link/20250918110500.1731261-4-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Before this patch there was the following assumption in
smc_loopback.c>smc_lo_move_data():
sf (signalling flag) == 0 : data is already in an attached target dmb
sf == 1 : data is not yet in the target dmb
This is true for the 2 callers in smc client
smcd_cdc_msg_send() : sf=1
smcd_tx_rdma_writes() : sf=0
but should not be a general assumption.
Add a bool to struct smc_buf_desc to indicate whether an SMC-D sndbuf_desc
is an attached buffer. Don't call move_data() for attached send_buffers,
because it is not necessary.
Move the data in smc_lo_move_data() if len != 0 and signal when requested.
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Link: https://patch.msgid.link/20250918110500.1731261-3-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
smcd_buf_free() calls smc_ism_unregister_dmb(lgr->smcd, buf_desc) and
then unconditionally frees buf_desc.
Remove the cleaning up of fields of buf_desc in
smc_ism_unregister_dmb(), because it is not helpful.
This removes the only usage of ISM_ERROR from the smc module. So move it
to drivers/s390/net/ism.h.
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Link: https://patch.msgid.link/20250918110500.1731261-2-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Today, once an inet_bind_bucket enters a state where fastreuse >= 0 or
fastreuseport >= 0 after a socket is explicitly bound to a port, it remains
in that state until all sockets are removed and the bucket is destroyed.
In this state, the bucket is skipped during ephemeral port selection in
connect(). For applications using a reduced ephemeral port
range (IP_LOCAL_PORT_RANGE socket option), this can cause faster port
exhaustion since blocked buckets are excluded from reuse.
The reason the bucket state isn't updated on port release is unclear.
Possibly a performance trade-off to avoid scanning bucket owners, or just
an oversight.
Fix it by recalculating the bucket state when a socket releases a port. To
limit overhead, each inet_bind2_bucket stores its own (fastreuse,
fastreuseport) state. On port release, only the relevant port-addr bucket
is scanned, and the overall state is derived from these.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250917-update-bind-bucket-state-on-unhash-v5-1-57168b661b47@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As hinted in commit 501a90c94510 ("inet: protect against too small mtu
values."), net_device->mtu is vulnerable to race conditions if it is
written and read without holding the RTNL.
At the moment, all the writes are done while the interface is down,
either in the devices' probe() function or in can_changelink(). So
there are no such issues yet. But upcoming changes will allow to
modify the MTU while the CAN XL devices are up.
In preparation to the introduction of CAN XL, annotate all the
net_device->mtu accesses which are not yet guarded by the RTNL with a
READ_ONCE().
Note that all the write accesses are already either guarded by the
RTNL or are already annotated and thus need no changes.
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20250923-can-fix-mtu-v3-1-581bde113f52@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The original code used nl80211_chan_width_to_mhz(), which returns the width in MHz.
However, the expected unit is KHz.
Fixes: 510dba80ed66 ("wifi: cfg80211: add helper for checking if a chandef is valid on a radio")
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Link: https://patch.msgid.link/df54294e6c4ed0f3ceff6e818b710478ddfc62c0.1758579480.git.Ryder%20Lee%20ryder.lee@mediatek.com/
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
synflood_warned had to be u32 for xchg(), but ensuring
atomicity is not really needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tp->tcp_clean_acked is fetched in tx path when snd_una is updated.
This field thus belongs to tcp_sock_read_tx group.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tcp_ack() writes this field, it belongs to tcp_sock_write_txrx.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Maintaining the CACHELINE_ASSERT_GROUP_SIZE() uses
for struct tcp_sock has been painful.
This had little benefit, so remove them.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk->sk_sndbuf is read-mostly in tx path, so move it from
sock_write_tx group to more appropriate sock_read_tx.
sk->sk_err_soft was not identified previously, but
is used from tcp_ack().
Move it to sock_write_tx group for better cache locality.
Also change tcp_ack() to clear sk->sk_err_soft only if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk_uid and sk_protocol are read from inet6_csk_route_socket()
for each TCP transmit.
Also read from udpv6_sendmsg(), udp_sendmsg() and others.
Move them to sock_read_tx for better cache locality.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250919204856.2977245-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag at the network subsystem, to explicitly
request the use of the per-CPU behavior. Both flags coexist for one release
cycle to allow callers to transition their calls.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
All existing users have been updated accordingly.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-4-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-3-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-2-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2025-09-22
1) Fix 0 assignment for SPIs. 0 is not a valid SPI,
it means no SPI assigned.
2) Fix offloading for inter address family tunnels.
* tag 'ipsec-2025-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: fix offloading of cross-family tunnels
xfrm: xfrm_alloc_spi shouldn't use 0 as SPI
====================
Link: https://patch.msgid.link/20250922073512.62703-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When working on a fix modifying mptcp_check_data_fin(), I noticed the
returned value was no longer used.
It looks like it was used for 3 days, between commit 7ed90803a213
("mptcp: send explicit ack on delayed ack_seq incr") and commit
ea4ca586b16f ("mptcp: refine MPTCP-level ack scheduling").
This returned value can be safely removed.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-6-a97a5d561a8b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that such info is in the 'flags' attribute, it is time to deprecate
the dedicated 'server-side' attribute.
It will be removed in a few versions.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-3-a97a5d561a8b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the 'flags' attribute is used, it seems interesting to add one
flag for 'server-side', a boolean value.
This is duplicating the info from the dedicated 'server-side' attribute,
but it will be deprecated in the next commit, and removed in a few
versions.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-2-a97a5d561a8b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This attribute is a boolean. No need to add it to set it to 'false'.
Indeed, the default value when this attribute is not set is naturally
'false'. A few bytes can then be saved by not adding this attribute if
the connection is not on the server side.
This prepares the future deprecation of its attribute, in favour of a
new flag.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-1-a97a5d561a8b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet_unhash() checks sk_unhashed() twice at the entry and after locking
ehash/lhash bucket.
The former was somehow added redundantly by commit 4f9bf2a2f5aa ("tcp:
Don't acquire inet_listen_hashbucket::lock with disabled BH.").
inet_unhash() is called for the full socket from 4 places, and it is
always under lock_sock() or the socket is not yet published to other
threads:
1. __sk_prot_rehash()
-> called from inet_sk_reselect_saddr(), which has
lockdep_sock_is_held()
2. sk_common_release()
-> called when inet_create() or inet6_create() fail, then the
socket is not yet published
3. tcp_set_state()
-> calls tcp_call_bpf_2arg(), and tcp_call_bpf() has
sock_owned_by_me()
4. inet_ctl_sock_create()
-> creates a kernel socket and unhashes it immediately, but TCP
socket is not hashed in sock_create_kern() (only SOCK_RAW is)
So we do not need to check sk_unhashed() twice before/after ehash/lhash
lock in inet_unhash().
Let's remove the 2nd one.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250919083706.1863217-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet_hash() and inet6_hash() are exactly the same.
Also, we do not need to export inet6_hash().
Let's consolidate the two into __inet_hash() and rename it to inet_hash().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250919083706.1863217-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__inet_hash() is called from inet_hash() and inet6_hash with osk NULL.
Let's remove the 2nd arg from __inet_hash().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250919083706.1863217-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This attemps to fix possible UAFs caused by struct mgmt_pending being
freed while still being processed like in the following trace, in order
to fix mgmt_pending_valid is introduce and use to check if the
mgmt_pending hasn't been removed from the pending list, on the complete
callbacks it is used to check and in addtion remove the cmd from the list
while holding mgmt_pending_lock to avoid TOCTOU problems since if the cmd
is left on the list it can still be accessed and freed.
BUG: KASAN: slab-use-after-free in mgmt_add_adv_patterns_monitor_sync+0x35/0x50 net/bluetooth/mgmt.c:5223
Read of size 8 at addr ffff8880709d4dc0 by task kworker/u11:0/55
CPU: 0 UID: 0 PID: 55 Comm: kworker/u11:0 Not tainted 6.16.4 #2 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
mgmt_add_adv_patterns_monitor_sync+0x35/0x50 net/bluetooth/mgmt.c:5223
hci_cmd_sync_work+0x210/0x3a0 net/bluetooth/hci_sync.c:332
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xade/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x711/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16.4/arch/x86/entry/entry_64.S:245
</TASK>
Allocated by task 12210:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4364
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1039 [inline]
mgmt_pending_new+0x65/0x1e0 net/bluetooth/mgmt_util.c:269
mgmt_pending_add+0x35/0x140 net/bluetooth/mgmt_util.c:296
__add_adv_patterns_monitor+0x130/0x200 net/bluetooth/mgmt.c:5247
add_adv_patterns_monitor+0x214/0x360 net/bluetooth/mgmt.c:5364
hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719
hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg+0x219/0x270 net/socket.c:729
sock_write_iter+0x258/0x330 net/socket.c:1133
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x5c9/0xb30 fs/read_write.c:686
ksys_write+0x145/0x250 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 12221:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2381 [inline]
slab_free mm/slub.c:4648 [inline]
kfree+0x18e/0x440 mm/slub.c:4847
mgmt_pending_free net/bluetooth/mgmt_util.c:311 [inline]
mgmt_pending_foreach+0x30d/0x380 net/bluetooth/mgmt_util.c:257
__mgmt_power_off+0x169/0x350 net/bluetooth/mgmt.c:9444
hci_dev_close_sync+0x754/0x1330 net/bluetooth/hci_sync.c:5290
hci_dev_do_close net/bluetooth/hci_core.c:501 [inline]
hci_dev_close+0x108/0x200 net/bluetooth/hci_core.c:526
sock_do_ioctl+0xd9/0x300 net/socket.c:1192
sock_ioctl+0x576/0x790 net/socket.c:1313
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: cf75ad8b41d2 ("Bluetooth: hci_sync: Convert MGMT_SET_POWERED")
Fixes: 2bd1b237616b ("Bluetooth: hci_sync: Convert MGMT_OP_SET_DISCOVERABLE to use cmd_sync")
Fixes: f056a65783cc ("Bluetooth: hci_sync: Convert MGMT_OP_SET_CONNECTABLE to use cmd_sync")
Fixes: 3244845c6307 ("Bluetooth: hci_sync: Convert MGMT_OP_SSP")
Fixes: d81a494c43df ("Bluetooth: hci_sync: Convert MGMT_OP_SET_LE")
Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh")
Fixes: 6f6ff38a1e14 ("Bluetooth: hci_sync: Convert MGMT_OP_SET_LOCAL_NAME")
Fixes: 71efbb08b538 ("Bluetooth: hci_sync: Convert MGMT_OP_SET_PHY_CONFIGURATION")
Fixes: b747a83690c8 ("Bluetooth: hci_sync: Refactor add Adv Monitor")
Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY")
Fixes: 26ac4c56f03f ("Bluetooth: hci_sync: Convert MGMT_OP_SET_ADVERTISING")
Reported-by: cen zhang <zzzccc427@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Simply derive the ns operations from the namespace type.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
These were used by S1G for older chandef representation, but
are no longer needed. Clean them up, even if we can't drop
them from the userspace API entirely.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This fixes the following UFA in hci_acl_create_conn_sync where a
connection still pending is command submission (conn->state == BT_OPEN)
maybe freed, also since this also can happen with the likes of
hci_le_create_conn_sync fix it as well:
BUG: KASAN: slab-use-after-free in hci_acl_create_conn_sync+0x5ef/0x790 net/bluetooth/hci_sync.c:6861
Write of size 2 at addr ffff88805ffcc038 by task kworker/u11:2/9541
CPU: 1 UID: 0 PID: 9541 Comm: kworker/u11:2 Not tainted 6.16.0-rc7 #3 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Workqueue: hci3 hci_cmd_sync_work
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x230 mm/kasan/report.c:480
kasan_report+0x118/0x150 mm/kasan/report.c:593
hci_acl_create_conn_sync+0x5ef/0x790 net/bluetooth/hci_sync.c:6861
hci_cmd_sync_work+0x210/0x3a0 net/bluetooth/hci_sync.c:332
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
</TASK>
Allocated by task 123736:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1039 [inline]
__hci_conn_add+0x233/0x1b30 net/bluetooth/hci_conn.c:939
hci_conn_add_unset net/bluetooth/hci_conn.c:1051 [inline]
hci_connect_acl+0x16c/0x4e0 net/bluetooth/hci_conn.c:1634
pair_device+0x418/0xa70 net/bluetooth/mgmt.c:3556
hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719
hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839
sock_sendmsg_nosec net/socket.c:712 [inline]
__sock_sendmsg+0x219/0x270 net/socket.c:727
sock_write_iter+0x258/0x330 net/socket.c:1131
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x54b/0xa90 fs/read_write.c:686
ksys_write+0x145/0x250 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 103680:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2381 [inline]
slab_free mm/slub.c:4643 [inline]
kfree+0x18e/0x440 mm/slub.c:4842
device_release+0x9c/0x1c0
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x22b/0x480 lib/kobject.c:737
hci_conn_cleanup net/bluetooth/hci_conn.c:175 [inline]
hci_conn_del+0x8ff/0xcb0 net/bluetooth/hci_conn.c:1173
hci_conn_complete_evt+0x3c7/0x1040 net/bluetooth/hci_event.c:3199
hci_event_func net/bluetooth/hci_event.c:7477 [inline]
hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531
hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
Last potentially related work creation:
kasan_save_stack+0x3e/0x60 mm/kasan/common.c:47
kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:548
insert_work+0x3d/0x330 kernel/workqueue.c:2183
__queue_work+0xbd9/0xfe0 kernel/workqueue.c:2345
queue_delayed_work_on+0x18b/0x280 kernel/workqueue.c:2561
pairing_complete+0x1e7/0x2b0 net/bluetooth/mgmt.c:3451
pairing_complete_cb+0x1ac/0x230 net/bluetooth/mgmt.c:3487
hci_connect_cfm include/net/bluetooth/hci_core.h:2064 [inline]
hci_conn_failed+0x24d/0x310 net/bluetooth/hci_conn.c:1275
hci_conn_complete_evt+0x3c7/0x1040 net/bluetooth/hci_event.c:3199
hci_event_func net/bluetooth/hci_event.c:7477 [inline]
hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531
hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
Fixes: aef2aa4fa98e ("Bluetooth: hci_event: Fix creating hci_conn object on error status")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This fixes the following UAF caused by not properly locking hdev when
processing HCI_EV_NUM_COMP_PKTS:
BUG: KASAN: slab-use-after-free in hci_conn_tx_dequeue+0x1be/0x220 net/bluetooth/hci_conn.c:3036
Read of size 4 at addr ffff8880740f0940 by task kworker/u11:0/54
CPU: 1 UID: 0 PID: 54 Comm: kworker/u11:0 Not tainted 6.16.0-rc7 #3 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Workqueue: hci1 hci_rx_work
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x230 mm/kasan/report.c:480
kasan_report+0x118/0x150 mm/kasan/report.c:593
hci_conn_tx_dequeue+0x1be/0x220 net/bluetooth/hci_conn.c:3036
hci_num_comp_pkts_evt+0x1c8/0xa50 net/bluetooth/hci_event.c:4404
hci_event_func net/bluetooth/hci_event.c:7477 [inline]
hci_event_packet+0x7e0/0x1200 net/bluetooth/hci_event.c:7531
hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
</TASK>
Allocated by task 54:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
kasan_kmalloc include/linux/kasan.h:260 [inline]
__kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1039 [inline]
__hci_conn_add+0x233/0x1b30 net/bluetooth/hci_conn.c:939
le_conn_complete_evt+0x3d6/0x1220 net/bluetooth/hci_event.c:5628
hci_le_enh_conn_complete_evt+0x189/0x470 net/bluetooth/hci_event.c:5794
hci_event_func net/bluetooth/hci_event.c:7474 [inline]
hci_event_packet+0x78c/0x1200 net/bluetooth/hci_event.c:7531
hci_rx_work+0x46a/0xe80 net/bluetooth/hci_core.c:4070
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
Freed by task 9572:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2381 [inline]
slab_free mm/slub.c:4643 [inline]
kfree+0x18e/0x440 mm/slub.c:4842
device_release+0x9c/0x1c0
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x22b/0x480 lib/kobject.c:737
hci_conn_cleanup net/bluetooth/hci_conn.c:175 [inline]
hci_conn_del+0x8ff/0xcb0 net/bluetooth/hci_conn.c:1173
hci_abort_conn_sync+0x5d1/0xdf0 net/bluetooth/hci_sync.c:5689
hci_cmd_sync_work+0x210/0x3a0 net/bluetooth/hci_sync.c:332
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3321
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 home/kwqcheii/source/fuzzing/kernel/kasan/linux-6.16-rc7/arch/x86/entry/entry_64.S:245
Fixes: 134f4b39df7b ("Bluetooth: add support for skb TX SND/COMPLETION timestamping")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
hci_resume_advertising_sync is suppose to resume all instance paused by
hci_pause_advertising_sync, this logic is used for procedures are only
allowed when not advertising, but instance 0x00 was not being
re-enabled.
Fixes: ad383c2c65a5 ("Bluetooth: hci_sync: Enable advertising when LL privacy is enabled")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Replace synchronize_rcu() with synchronize_net() in __netpoll_free().
synchronize_net() is RTNL-aware and will use the more efficient
synchronize_rcu_expedited() when called under RTNL lock, avoiding
the potentially expensive synchronize_rcu() in RTNL critical sections.
Since __netpoll_free() is called with RTNL held (as indicated by
ASSERT_RTNL()), this change improves performance by reducing the
time spent in the RTNL critical section.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250918-netpoll_jv-v1-2-67d50eeb2c26@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The netpoll_info structure contains an useless pointer back to its
associated netpoll. This field is never used, and the assignment in
__netpoll_setup() is does not comtemplate multiple instances, as
reported by Jay[1].
Drop both the member and its initialization to simplify the structure.
Link: https://lore.kernel.org/all/2930648.1757463506@famine/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250918-netpoll_jv-v1-1-67d50eeb2c26@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This converts the only path not returning drop reasons in
ip_rcv_finish_core.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250915091958.15382-4-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of setting the drop reason to SKB_DROP_REASON_NOT_SPECIFIED
early and having to reset it each time it is overridden by a function
returned value, just set the drop reason to the expected value before
returning from ip_rcv_finish_core.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://patch.msgid.link/20250915091958.15382-3-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
udp_v4_early_demux already returns drop reasons as it either returns 0
or ip_mc_validate_source, which itself returns drop reasons. Its return
value is also already used as a drop reason itself.
Makes this explicit by making it return drop reasons.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250915091958.15382-2-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Various network interface types make use of needed_{head,tail}room values
to efficiently reserve buffer space for additional encapsulation headers,
such as VXLAN, Geneve, IPSec, etc. However, it is not currently possible
to query these values in a generic way.
Introduce ability to query the needed_{head,tail}room values of a network
device via rtnetlink, such that applications that may wish to use these
values can do so.
For example, Cilium agent iterates over present devices based on user config
(direct routing, vxlan, geneve, wireguard etc.) and in future will configure
netkit in order to expose the needed_{head,tail}room into K8s pods. See
b9ed315d3c4c ("netkit: Allow for configuring needed_{head,tail}room").
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alasdair McWilliam <alasdair@mcwilliam.dev>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250917095543.14039-1-alasdair@mcwilliam.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
psp_dev_rcv() decapsulates psp headers from a received frame. This
will make any csum complete computed by the device inaccurate. Rather
than attempt to patch up skb->csum in psp_dev_rcv() just make it clear
to callers what they can expect regarding checksum complete.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250918212723.17495-1-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace two calls to kfree_skb_reason() with sk_skb_reason_drop().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Zahka <daniel.zahka@gmail.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250918132007.325299-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
smc_lo_register_dmb() allocates DMB buffers with kzalloc(), which are
later passed to get_page() in smc_rx_splice(). Since kmalloc memory is
not page-backed, this triggers WARN_ON_ONCE() in get_page() and prevents
holding a refcount on the buffer. This can lead to use-after-free if
the memory is released before splice_to_pipe() completes.
Use folio_alloc() instead, ensuring DMBs are page-backed and safe for
get_page().
WARNING: CPU: 18 PID: 12152 at ./include/linux/mm.h:1330 smc_rx_splice+0xaf8/0xe20 [smc]
CPU: 18 UID: 0 PID: 12152 Comm: smcapp Kdump: loaded Not tainted 6.17.0-rc3-11705-g9cf4672ecfee #10 NONE
Hardware name: IBM 3931 A01 704 (z/VM 7.4.0)
Krnl PSW : 0704e00180000000 000793161032696c (smc_rx_splice+0xafc/0xe20 [smc])
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000000 001cee80007d3001 00077400000000f8 0000000000000005
0000000000000001 001cee80007d3006 0007740000001000 001c000000000000
000000009b0c99e0 0000000000001000 001c0000000000f8 001c000000000000
000003ffcc6f7c88 0007740003e98000 0007931600000005 000792969b2ff7b8
Krnl Code: 0007931610326960: af000000 mc 0,0
0007931610326964: a7f4ff43 brc 15,00079316103267ea
#0007931610326968: af000000 mc 0,0
>000793161032696c: a7f4ff3f brc 15,00079316103267ea
0007931610326970: e320f1000004 lg %r2,256(%r15)
0007931610326976: c0e53fd1b5f5 brasl %r14,000793168fd5d560
000793161032697c: a7f4fbb5 brc 15,00079316103260e6
0007931610326980: b904002b lgr %r2,%r11
Call Trace:
smc_rx_splice+0xafc/0xe20 [smc]
smc_rx_splice+0x756/0xe20 [smc])
smc_rx_recvmsg+0xa74/0xe00 [smc]
smc_splice_read+0x1ce/0x3b0 [smc]
sock_splice_read+0xa2/0xf0
do_splice_read+0x198/0x240
splice_file_to_pipe+0x7e/0x110
do_splice+0x59e/0xde0
__do_splice+0x11a/0x2d0
__s390x_sys_splice+0x140/0x1f0
__do_syscall+0x122/0x280
system_call+0x6e/0x90
Last Breaking-Event-Address:
smc_rx_splice+0x960/0xe20 [smc]
---[ end trace 0000000000000000 ]---
Fixes: f7a22071dbf3 ("net/smc: implement DMB-related operations of loopback-ism")
Reviewed-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Signed-off-by: Sidraya Jayagond <sidraya@linux.ibm.com>
Link: https://patch.msgid.link/20250917184220.801066-1-sidraya@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
struct raw_sock has several holes. Reorder the fields to save 8 bytes.
Statistics before:
$ pahole --class_name=raw_sock net/can/raw.o
struct raw_sock {
struct sock sk __attribute__((__aligned__(8))); /* 0 776 */
/* XXX last struct has 1 bit hole */
/* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
int ifindex; /* 776 4 */
/* XXX 4 bytes hole, try to pack */
struct net_device * dev; /* 784 8 */
netdevice_tracker dev_tracker; /* 792 0 */
struct list_head notifier; /* 792 16 */
unsigned int bound:1; /* 808: 0 4 */
unsigned int loopback:1; /* 808: 1 4 */
unsigned int recv_own_msgs:1; /* 808: 2 4 */
unsigned int fd_frames:1; /* 808: 3 4 */
unsigned int xl_frames:1; /* 808: 4 4 */
unsigned int join_filters:1; /* 808: 5 4 */
/* XXX 2 bits hole, try to pack */
/* Bitfield combined with next fields */
struct can_raw_vcid_options raw_vcid_opts; /* 809 4 */
/* XXX 3 bytes hole, try to pack */
canid_t tx_vcid_shifted; /* 816 4 */
canid_t rx_vcid_shifted; /* 820 4 */
canid_t rx_vcid_mask_shifted; /* 824 4 */
int count; /* 828 4 */
/* --- cacheline 13 boundary (832 bytes) --- */
struct can_filter dfilter; /* 832 8 */
struct can_filter * filter; /* 840 8 */
can_err_mask_t err_mask; /* 848 4 */
/* XXX 4 bytes hole, try to pack */
struct uniqframe * uniq; /* 856 8 */
/* size: 864, cachelines: 14, members: 20 */
/* sum members: 852, holes: 3, sum holes: 11 */
/* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */
/* member types with bit holes: 1, total: 1 */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
...and after:
$ pahole --class_name=raw_sock net/can/raw.o
struct raw_sock {
struct sock sk __attribute__((__aligned__(8))); /* 0 776 */
/* XXX last struct has 1 bit hole */
/* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
struct net_device * dev; /* 776 8 */
netdevice_tracker dev_tracker; /* 784 0 */
struct list_head notifier; /* 784 16 */
int ifindex; /* 800 4 */
unsigned int bound:1; /* 804: 0 4 */
unsigned int loopback:1; /* 804: 1 4 */
unsigned int recv_own_msgs:1; /* 804: 2 4 */
unsigned int fd_frames:1; /* 804: 3 4 */
unsigned int xl_frames:1; /* 804: 4 4 */
unsigned int join_filters:1; /* 804: 5 4 */
/* XXX 2 bits hole, try to pack */
/* Bitfield combined with next fields */
struct can_raw_vcid_options raw_vcid_opts; /* 805 4 */
/* XXX 3 bytes hole, try to pack */
canid_t tx_vcid_shifted; /* 812 4 */
canid_t rx_vcid_shifted; /* 816 4 */
canid_t rx_vcid_mask_shifted; /* 820 4 */
can_err_mask_t err_mask; /* 824 4 */
int count; /* 828 4 */
/* --- cacheline 13 boundary (832 bytes) --- */
struct can_filter dfilter; /* 832 8 */
struct can_filter * filter; /* 840 8 */
struct uniqframe * uniq; /* 848 8 */
/* size: 856, cachelines: 14, members: 20 */
/* sum members: 852, holes: 1, sum holes: 3 */
/* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */
/* member types with bit holes: 1, total: 1 */
/* forced alignments: 1 */
/* last cacheline: 24 bytes */
} __attribute__((__aligned__(8)));
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20250917-can-raw-repack-v2-3-395e8b3a4437@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The bound, loopback, recv_own_msgs, fd_frames, xl_frames and
join_filters fields of struct raw_sock just need to store one bit of
information.
Declare all those members as a bitfields of type unsigned int and
width one bit.
Add a temporary variable to raw_setsockopt() and raw_getsockopt() to
make the conversion between the stored bits and the socket interface.
This reduces the size of struct raw_sock by sixteen bytes.
Statistics before:
$ pahole --class_name=raw_sock net/can/raw.o
struct raw_sock {
struct sock sk __attribute__((__aligned__(8))); /* 0 776 */
/* XXX last struct has 1 bit hole */
/* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
int bound; /* 776 4 */
int ifindex; /* 780 4 */
struct net_device * dev; /* 784 8 */
netdevice_tracker dev_tracker; /* 792 0 */
struct list_head notifier; /* 792 16 */
int loopback; /* 808 4 */
int recv_own_msgs; /* 812 4 */
int fd_frames; /* 816 4 */
int xl_frames; /* 820 4 */
struct can_raw_vcid_options raw_vcid_opts; /* 824 4 */
canid_t tx_vcid_shifted; /* 828 4 */
/* --- cacheline 13 boundary (832 bytes) --- */
canid_t rx_vcid_shifted; /* 832 4 */
canid_t rx_vcid_mask_shifted; /* 836 4 */
int join_filters; /* 840 4 */
int count; /* 844 4 */
struct can_filter dfilter; /* 848 8 */
struct can_filter * filter; /* 856 8 */
can_err_mask_t err_mask; /* 864 4 */
/* XXX 4 bytes hole, try to pack */
struct uniqframe * uniq; /* 872 8 */
/* size: 880, cachelines: 14, members: 20 */
/* sum members: 876, holes: 1, sum holes: 4 */
/* member types with bit holes: 1, total: 1 */
/* forced alignments: 1 */
/* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
...and after:
$ pahole --class_name=raw_sock net/can/raw.o
struct raw_sock {
struct sock sk __attribute__((__aligned__(8))); /* 0 776 */
/* XXX last struct has 1 bit hole */
/* --- cacheline 12 boundary (768 bytes) was 8 bytes ago --- */
int ifindex; /* 776 4 */
/* XXX 4 bytes hole, try to pack */
struct net_device * dev; /* 784 8 */
netdevice_tracker dev_tracker; /* 792 0 */
struct list_head notifier; /* 792 16 */
unsigned int bound:1; /* 808: 0 4 */
unsigned int loopback:1; /* 808: 1 4 */
unsigned int recv_own_msgs:1; /* 808: 2 4 */
unsigned int fd_frames:1; /* 808: 3 4 */
unsigned int xl_frames:1; /* 808: 4 4 */
unsigned int join_filters:1; /* 808: 5 4 */
/* XXX 2 bits hole, try to pack */
/* Bitfield combined with next fields */
struct can_raw_vcid_options raw_vcid_opts; /* 809 4 */
/* XXX 3 bytes hole, try to pack */
canid_t tx_vcid_shifted; /* 816 4 */
canid_t rx_vcid_shifted; /* 820 4 */
canid_t rx_vcid_mask_shifted; /* 824 4 */
int count; /* 828 4 */
/* --- cacheline 13 boundary (832 bytes) --- */
struct can_filter dfilter; /* 832 8 */
struct can_filter * filter; /* 840 8 */
can_err_mask_t err_mask; /* 848 4 */
/* XXX 4 bytes hole, try to pack */
struct uniqframe * uniq; /* 856 8 */
/* size: 864, cachelines: 14, members: 20 */
/* sum members: 852, holes: 3, sum holes: 11 */
/* sum bitfield members: 6 bits, bit holes: 1, sum bit holes: 2 bits */
/* member types with bit holes: 1, total: 1 */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20250917-can-raw-repack-v2-2-395e8b3a4437@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
struct uniqframe has one hole. Reorder the fields to save 8 bytes.
Statistics before:
$ pahole --class_name=uniqframe net/can/raw.o
struct uniqframe {
int skbcnt; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
const struct sk_buff * skb; /* 8 8 */
unsigned int join_rx_count; /* 16 4 */
/* size: 24, cachelines: 1, members: 3 */
/* sum members: 16, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 24 bytes */
};
...and after:
$ pahole --class_name=uniqframe net/can/raw.o
struct uniqframe {
const struct sk_buff * skb; /* 0 8 */
int skbcnt; /* 8 4 */
unsigned int join_rx_count; /* 12 4 */
/* size: 16, cachelines: 1, members: 3 */
/* last cacheline: 16 bytes */
};
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20250917-can-raw-repack-v2-1-395e8b3a4437@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Don't directly acess the namespace count. There's even a dedicated
helper for this.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Don't directly acess the namespace count. There's even a dedicated
helper for this.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Don't directly acess the namespace count. There's even a dedicated
helper for this.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
And drop ns_free_inum(). Anything common that can be wasted centrally
should be wasted in the new common helper.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
When a first MPTCP connection gets successfully established after a
blackhole period, 'active_disable_times' was supposed to be reset when
this connection was done via any non-loopback interfaces.
Unfortunately, the opposite condition was checked: only reset when the
connection was established via a loopback interface. Fixing this by
simply looking at the opposite.
This is similar to what is done with TCP FastOpen, see
tcp_fastopen_active_disable_ofo_check().
This patch is a follow-up of a previous discussion linked to commit
893c49a78d9f ("mptcp: Use __sk_dst_get() and dst_dev_rcu() in
mptcp_active_enable()."), see [1].
Fixes: 27069e7cb3d1 ("mptcp: disable active MPTCP in case of blackhole")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/4209a283-8822-47bd-95b7-87e96d9b7ea3@kernel.org [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250918-net-next-mptcp-blackhole-reset-loopback-v1-1-bf5818326639@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use __sk_dst_get() and dst_dev_rcu(), because dst->dev could
be changed under us.
Fixes: 6b46ca260e22 ("net: psp: add socket security association code")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Tested-by: Daniel Zahka <daniel.zahka@gmail.com>
Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250918115238.237475-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|