Age | Commit message (Collapse) | Author |
|
A LED trigger's activate() callback gets called when the LED trigger
gets activated for a specific LED, so that the trigger code can ensure
the LED state matches the current state of the trigger condition
(LED_FULL when HCI_UP is set in this case).
led_trigger_event() is intended for trigger condition state changes and
iterates over _all_ LEDs which are controlled by this trigger changing
the brightness of each of them.
In the activate() case only the brightness of the LED which is being
activated needs to change and that LED is passed as an argument to
activate(), switch to led_set_brightness() to only change the brightness
of the LED being activated.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Since kzalloc already zeroes the allocated memory, the subsequent
memset call is unnecessary. This patch removes the redundant memset to
clean up the code and enhance efficiency.
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
To update with the latest fixes.
|
|
In commit 48b6190a0042 ("net/smc: Limit SMC visits when handshake workqueue congested"),
we introduce a mechanism to put constraint on SMC connections visit
according to the pressure of SMC handshake process.
At that time, we believed that controlling the feature through netlink
was sufficient. However, most people have realized now that netlink is
not convenient in container scenarios, and sysctl is a more suitable
approach.
In addition, since commit 462791bbfa35 ("net/smc: add sysctl interface for SMC")
had introcuded smc_sysctl_net_init(), it is reasonable for us to
initialize limit_smc_hs in it instead of initializing it in
smc_pnet_net_int().
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Jan Karcher <jaka@linux.ibm.com>
Link: https://patch.msgid.link/1725590135-5631-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs into slab/for-6.12/kmem_cache_args
Merge prerequisities from the vfs git tree for the following series that
introduces kmem_cache_args. The vfs.file branch includes the addition of
kmem_cache_create_rcu() which was needed in vfs for the filp cache
optimization. The following series refactors this code.
|
|
At the moment, the slab objects are charged to the memcg at the
allocation time. However there are cases where slab objects are
allocated at the time where the right target memcg to charge it to is
not known. One such case is the network sockets for the incoming
connection which are allocated in the softirq context.
Couple hundred thousand connections are very normal on large loaded
server and almost all of those sockets underlying those connections get
allocated in the softirq context and thus not charged to any memcg.
However later at the accept() time we know the right target memcg to
charge. Let's add new API to charge already allocated objects, so we can
have better accounting of the memory usage.
To measure the performance impact of this change, tcp_crr is used from
the neper [1] performance suite. Basically it is a network ping pong
test with new connection for each ping pong.
The server and the client are run inside 3 level of cgroup hierarchy
using the following commands:
Server:
$ tcp_crr -6
Client:
$ tcp_crr -6 -c -H ${server_ip}
If the client and server run on different machines with 50 GBPS NIC,
there is no visible impact of the change.
For the same machine experiment with v6.11-rc5 as base.
base (throughput) with-patch
tcp_crr 14545 (+- 80) 14463 (+- 56)
It seems like the performance impact is within the noise.
Link: https://github.com/google/neper [1]
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com> # net
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|
|
The grc must be initialize first. There can be a condition where if
fou is NULL, goto out will be executed and grc would be used
uninitialized.
Fixes: 7e4196935069 ("fou: Fix null-ptr-deref in GRO.")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240906102839.202798-1-usama.anjum@collabora.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported use-after-free in unix_stream_recv_urg(). [0]
The scenario is
1. send(MSG_OOB)
2. recv(MSG_OOB)
-> The consumed OOB remains in recv queue
3. send(MSG_OOB)
4. recv()
-> manage_oob() returns the next skb of the consumed OOB
-> This is also OOB, but unix_sk(sk)->oob_skb is not cleared
5. recv(MSG_OOB)
-> unix_sk(sk)->oob_skb is used but already freed
The recent commit 8594d9b85c07 ("af_unix: Don't call skb_get() for OOB
skb.") uncovered the issue.
If the OOB skb is consumed and the next skb is peeked in manage_oob(),
we still need to check if the skb is OOB.
Let's do so by falling back to the following checks in manage_oob()
and add the test case in selftest.
Note that we need to add a similar check for SIOCATMARK.
[0]:
BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
Read of size 4 at addr ffff8880326abcc4 by task syz-executor178/5235
CPU: 0 UID: 0 PID: 5235 Comm: syz-executor178 Not tainted 6.11.0-rc5-syzkaller-00742-gfbdaffe41adc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
unix_stream_recv_urg+0x1df/0x320 net/unix/af_unix.c:2640
unix_stream_read_generic+0x2456/0x2520 net/unix/af_unix.c:2778
unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
sock_recvmsg_nosec net/socket.c:1046 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1068
____sys_recvmsg+0x1db/0x470 net/socket.c:2816
___sys_recvmsg net/socket.c:2858 [inline]
__sys_recvmsg+0x2f0/0x3e0 net/socket.c:2888
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5360d6b4e9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff29b3a458 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007fff29b3a638 RCX: 00007f5360d6b4e9
RDX: 0000000000002001 RSI: 0000000020000640 RDI: 0000000000000003
RBP: 00007f5360dde610 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fff29b3a628 R14: 0000000000000001 R15: 0000000000000001
</TASK>
Allocated by task 5235:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3988 [inline]
slab_alloc_node mm/slub.c:4037 [inline]
kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4080
__alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
alloc_skb include/linux/skbuff.h:1320 [inline]
alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6528
sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2815
sock_alloc_send_skb include/net/sock.h:1778 [inline]
queue_oob+0x108/0x680 net/unix/af_unix.c:2198
unix_stream_sendmsg+0xd24/0xf80 net/unix/af_unix.c:2351
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 5235:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2252 [inline]
slab_free mm/slub.c:4473 [inline]
kmem_cache_free+0x145/0x350 mm/slub.c:4548
unix_stream_read_generic+0x1ef6/0x2520 net/unix/af_unix.c:2917
unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
sock_recvmsg_nosec net/socket.c:1046 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1068
__sys_recvfrom+0x256/0x3e0 net/socket.c:2255
__do_sys_recvfrom net/socket.c:2273 [inline]
__se_sys_recvfrom net/socket.c:2269 [inline]
__x64_sys_recvfrom+0xde/0x100 net/socket.c:2269
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff8880326abc80
which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 68 bytes inside of
freed 240-byte region [ffff8880326abc80, ffff8880326abd70)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x326ab
ksm flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xfdffffff(slab)
raw: 00fff00000000000 ffff88801eaee780 ffffea0000b7dc80 dead000000000003
raw: 0000000000000000 00000000800c000c 00000001fdffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 4686, tgid 4686 (udevadm), ts 32357469485, free_ts 28829011109
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
prep_new_page mm/page_alloc.c:1501 [inline]
get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
__alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2321
allocate_slab+0x5a/0x2f0 mm/slub.c:2484
new_slab mm/slub.c:2537 [inline]
___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
__slab_alloc+0x58/0xa0 mm/slub.c:3813
__slab_alloc_node mm/slub.c:3866 [inline]
slab_alloc_node mm/slub.c:4025 [inline]
kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4080
__alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
alloc_skb include/linux/skbuff.h:1320 [inline]
alloc_uevent_skb+0x74/0x230 lib/kobject_uevent.c:289
uevent_net_broadcast_untagged lib/kobject_uevent.c:326 [inline]
kobject_uevent_net_broadcast+0x2fd/0x580 lib/kobject_uevent.c:410
kobject_uevent_env+0x57d/0x8e0 lib/kobject_uevent.c:608
kobject_synth_uevent+0x4ef/0xae0 lib/kobject_uevent.c:207
uevent_store+0x4b/0x70 drivers/base/bus.c:633
kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xa72/0xc90 fs/read_write.c:590
page last free pid 1 tgid 1 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1094 [inline]
free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
kasan_depopulate_vmalloc_pte+0x74/0x90 mm/kasan/shadow.c:408
apply_to_pte_range mm/memory.c:2797 [inline]
apply_to_pmd_range mm/memory.c:2841 [inline]
apply_to_pud_range mm/memory.c:2877 [inline]
apply_to_p4d_range mm/memory.c:2913 [inline]
__apply_to_page_range+0x8a8/0xe50 mm/memory.c:2947
kasan_release_vmalloc+0x9a/0xb0 mm/kasan/shadow.c:525
purge_vmap_node+0x3e3/0x770 mm/vmalloc.c:2208
__purge_vmap_area_lazy+0x708/0xae0 mm/vmalloc.c:2290
_vm_unmap_aliases+0x79d/0x840 mm/vmalloc.c:2885
change_page_attr_set_clr+0x2fe/0xdb0 arch/x86/mm/pat/set_memory.c:1881
change_page_attr_set arch/x86/mm/pat/set_memory.c:1922 [inline]
set_memory_nx+0xf2/0x130 arch/x86/mm/pat/set_memory.c:2110
free_init_pages arch/x86/mm/init.c:924 [inline]
free_kernel_image_pages arch/x86/mm/init.c:943 [inline]
free_initmem+0x79/0x110 arch/x86/mm/init.c:970
kernel_init+0x31/0x2b0 init/main.c:1476
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
ffff8880326abb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880326abc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff8880326abc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880326abd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
ffff8880326abd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: 93c99f21db36 ("af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.")
Reported-by: syzbot+8811381d455e3e9ec788@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8811381d455e3e9ec788
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When OOB skb has been already consumed, manage_oob() returns the next
skb if exists. In such a case, we need to fall back to the else branch
below.
Then, we want to keep holding spin_lock(&sk->sk_receive_queue.lock).
Let's move it out of if-else branch and add lightweight check before
spin_lock() for major use cases without OOB skb.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When OOB skb has been already consumed, manage_oob() returns the next
skb if exists. In such a case, we need to fall back to the else branch
below.
Then, we need to keep two skbs and free them later with consume_skb()
and kfree_skb().
Let's rename unlinked_skb accordingly.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a prep for the later fix.
No functional change intended.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
dev_pick_tx_cpu_id() has been introduced with two users by
commit a4ea8a3dacc3 ("net: Add generic ndo_select_queue functions").
The use in AF_PACKET has been removed in 2019 by
commit b71b5837f871 ("packet: rework packet_pick_tx_queue() to use common code selection")
The other user was a Netlogic XLP driver, removed in 2021 by
commit 47ac6f567c28 ("staging: Remove Netlogic XLP network driver").
It's relatively unlikely that any modern driver will need an
.ndo_select_queue implementation which picks purely based on CPU ID
and skips XPS, delete dev_pick_tx_cpu_id()
Found by code inspection.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240906161059.715546-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clang warns (or errors with CONFIG_WERROR):
net/xfrm/xfrm_policy.c:1286:8: error: variable 'dir' is uninitialized when used here [-Werror,-Wuninitialized]
1286 | if ((dir & XFRM_POLICY_MASK) == XFRM_POLICY_OUT) {
| ^~~
net/xfrm/xfrm_policy.c:1257:9: note: initialize the variable 'dir' to silence this warning
1257 | int dir;
| ^
| = 0
1 error generated.
A recent refactoring removed some assignments to dir because
xfrm_policy_is_dead_or_sk() has a dir assignment in it. However, dir is
used elsewhere in xfrm_hash_rebuild(), including within loops where it
needs to be reloaded for each policy. Restore the assignments before the
first use of dir to fix the warning and ensure dir is properly
initialized throughout the function.
Fixes: 08c2182cf0b4 ("xfrm: policy: use recently added helper in more places")
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Julian Wiedmann says:
> + if (!xfrm_pol_hold_rcu(ret))
Coverity spotted that ^^^ needs a s/ret/pol fix-up:
> CID 1599386: Null pointer dereferences (FORWARD_NULL)
> Passing null pointer "ret" to "xfrm_pol_hold_rcu", which dereferences it.
Ditch the bogus 'ret' variable.
Fixes: 563d5ca93e88 ("xfrm: switch migrate to xfrm_policy_lookup_bytype")
Reported-by: Julian Wiedmann <jwiedmann.dev@gmail.com>
Closes: https://lore.kernel.org/netdev/06dc2499-c095-4bd4-aee3-a1d0e3ec87c4@gmail.com/
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable holds the full DS field.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Note that callers of udp_tunnel_dst_lookup() pass the entire DS field in
the 'tos' argument.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling nf_route() which eventually
calls ip_route_output_key() so that in the future it could perform the
FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable includes the full DS field. Either the one
specified as part of the tunnel parameters or the one inherited from the
inner packet.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable includes the full DS field. Either the one
specified via the tunnel key or the one inherited from the inner packet.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask the upper DSCP bits when calling ip_route_output_gre() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unmask upper DSCP bits when calling ip_route_output() so that in the
future it could perform the FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since '__dev_queue_xmit()' should be called with interrupts enabled,
the following backtrace:
ieee80211_do_stop()
...
spin_lock_irqsave(&local->queue_stop_reason_lock, flags)
...
ieee80211_free_txskb()
ieee80211_report_used_skb()
ieee80211_report_ack_skb()
cfg80211_mgmt_tx_status_ext()
nl80211_frame_tx_status()
genlmsg_multicast_netns()
genlmsg_multicast_netns_filtered()
nlmsg_multicast_filtered()
netlink_broadcast_filtered()
do_one_broadcast()
netlink_broadcast_deliver()
__netlink_sendskb()
netlink_deliver_tap()
__netlink_deliver_tap_skb()
dev_queue_xmit()
__dev_queue_xmit() ; with IRQS disabled
...
spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags)
issues the warning (as reported by syzbot reproducer):
WARNING: CPU: 2 PID: 5128 at kernel/softirq.c:362 __local_bh_enable_ip+0xc3/0x120
Fix this by implementing a two-phase skb reclamation in
'ieee80211_do_stop()', where actual work is performed
outside of a section with interrupts disabled.
Fixes: 5061b0c2b906 ("mac80211: cooperate more with network namespaces")
Reported-by: syzbot+1a3986bbd3169c307819@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1a3986bbd3169c307819
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20240906123151.351647-1-dmantipov@yandex.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This reverts commit e7cd191f83fd899c233dfbe7dc6d96ef703dcbbd.
While supporting xfrm interfaces in the packet offload API
is needed, this patch does not do the right thing. There are
more things to do to really support xfrm interfaces, so revert
it for now.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Although not reproduced in practice, these two cases may be
considered by UBSAN as off-by-one errors. So fix them in the
same way as in commit a26a5107bc52 ("wifi: cfg80211: fix UBSAN
noise in cfg80211_wext_siwscan()").
Fixes: 807f8a8c3004 ("cfg80211/nl80211: add support for scheduled scans")
Fixes: 5ba63533bbf6 ("cfg80211: fix alignment problem in scan request")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20240909090806.1091956-1-dmantipov@yandex.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Device class has two namespace relevant fields which are associated by
the following usage:
struct class {
...
const struct kobj_ns_type_operations *ns_type;
const void *(*namespace)(const struct device *dev);
...
}
if (dev->class && dev->class->ns_type)
dev->class->namespace(dev);
The usage looks weird since it checks @ns_type but calls namespace()
it is found for all existing class definitions that the other filed is
also assigned once one is assigned in current kernel tree, so fix this
weird usage by checking @namespace to call namespace().
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot found a new splat [1].
Instead of adding yet another spin_lock_bh(&hsr->seqnr_lock) /
spin_unlock_bh(&hsr->seqnr_lock) pair, remove seqnr_lock
and use atomic_t for hsr->sequence_nr and hsr->sup_sequence_nr.
This also avoid a race in hsr_fill_info().
Also remove interlink_sequence_nr which is unused.
[1]
WARNING: CPU: 1 PID: 9723 at net/hsr/hsr_forward.c:602 handle_std_frame+0x247/0x2c0 net/hsr/hsr_forward.c:602
Modules linked in:
CPU: 1 UID: 0 PID: 9723 Comm: syz.0.1657 Not tainted 6.11.0-rc6-syzkaller-00026-g88fac17500f4 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:handle_std_frame+0x247/0x2c0 net/hsr/hsr_forward.c:602
Code: 49 8d bd b0 01 00 00 be ff ff ff ff e8 e2 58 25 00 31 ff 89 c5 89 c6 e8 47 53 a8 f6 85 ed 0f 85 5a ff ff ff e8 fa 50 a8 f6 90 <0f> 0b 90 e9 4c ff ff ff e8 cc e7 06 f7 e9 8f fe ff ff e8 52 e8 06
RSP: 0018:ffffc90000598598 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffc90000598670 RCX: ffffffff8ae2c919
RDX: ffff888024e94880 RSI: ffffffff8ae2c926 RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
R13: ffff8880627a8cc0 R14: 0000000000000000 R15: ffff888012b03c3a
FS: 0000000000000000(0000) GS:ffff88802b700000(0063) knlGS:00000000f5696b40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000020010000 CR3: 00000000768b4000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
hsr_fill_frame_info+0x2c8/0x360 net/hsr/hsr_forward.c:630
fill_frame_info net/hsr/hsr_forward.c:700 [inline]
hsr_forward_skb+0x7df/0x25c0 net/hsr/hsr_forward.c:715
hsr_handle_frame+0x603/0x850 net/hsr/hsr_slave.c:70
__netif_receive_skb_core.constprop.0+0xa3d/0x4330 net/core/dev.c:5555
__netif_receive_skb_list_core+0x357/0x950 net/core/dev.c:5737
__netif_receive_skb_list net/core/dev.c:5804 [inline]
netif_receive_skb_list_internal+0x753/0xda0 net/core/dev.c:5896
gro_normal_list include/net/gro.h:515 [inline]
gro_normal_list include/net/gro.h:511 [inline]
napi_complete_done+0x23f/0x9a0 net/core/dev.c:6247
gro_cell_poll+0x162/0x210 net/core/gro_cells.c:66
__napi_poll.constprop.0+0xb7/0x550 net/core/dev.c:6772
napi_poll net/core/dev.c:6841 [inline]
net_rx_action+0xa92/0x1010 net/core/dev.c:6963
handle_softirqs+0x216/0x8f0 kernel/softirq.c:554
do_softirq kernel/softirq.c:455 [inline]
do_softirq+0xb2/0xf0 kernel/softirq.c:442
</IRQ>
<TASK>
Fixes: 06afd2c31d33 ("hsr: Synchronize sending frames to have always incremented outgoing seq nr.")
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need the USB fixes in here as well, and this also resolves the merge
conflict in:
drivers/usb/typec/ucsi/ucsi.c
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There are several comments all over the place, which uses a wrong singular
form of jiffies.
Replace 'jiffie' by 'jiffy'. No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
|
|
According to Vinicius (and carefully looking through the whole
https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
once again), txtime branch of 'taprio_change()' is not going to
race against 'advance_sched()'. But using 'rcu_replace_pointer()'
in the former may be a good idea as well.
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
Patch #1 adds ctnetlink support for kernel side filtering for
deletions, from Changliang Wu.
Patch #2 updates nft_counter support to Use u64_stats_t,
from Sebastian Andrzej Siewior.
Patch #3 uses kmemdup_array() in all xtables frontends,
from Yan Zhen.
Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead
opencoded casting, from Shen Lichuan.
Patch #5 removes unused argument in nftables .validate interface,
from Florian Westphal.
Patch #6 is a oneliner to correct a typo in nftables kdoc,
from Simon Horman.
Patch #7 fixes missing kdoc in nftables, also from Simon.
Patch #8 updates nftables to handle timeout less than CONFIG_HZ.
Patch #9 rejects element expiration if timeout is zero,
otherwise it is silently ignored.
Patch #10 disallows element expiration larger than timeout.
Patch #11 removes unnecessary READ_ONCE annotation while mutex is held.
Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset.
Patch #13 annotates data-races around element expiration.
Patch #14 allocates timeout and expiration in one single set element
extension, they are tighly couple, no reason to keep them
separated anymore.
Patch #15 updates nftables to interpret zero timeout element as never
times out. Note that it is already possible to declare sets
with elements that never time out but this generalizes to all
kind of set with timeouts.
Patch #16 supports for element timeout and expiration updates.
* tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: set element timeout update support
netfilter: nf_tables: zero timeout means element never times out
netfilter: nf_tables: consolidate timeout extension for elements
netfilter: nf_tables: annotate data-races around element expiration
netfilter: nft_dynset: annotate data-races around set timeout
netfilter: nf_tables: remove annotation to access set timeout while holding lock
netfilter: nf_tables: reject expiration higher than timeout
netfilter: nf_tables: reject element expiration with no timeout
netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
netfilter: nf_tables: Add missing Kernel doc
netfilter: nf_tables: Correct spelling in nf_tables.h
netfilter: nf_tables: drop unused 3rd argument from validate callback ops
netfilter: conntrack: Convert to use ERR_CAST()
netfilter: Use kmemdup_array instead of kmemdup for multiple allocation
netfilter: nft_counter: Use u64_stats_t for statistic.
netfilter: ctnetlink: support CTA_FILTER for flush
====================
Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
netpoll_srcu is currently used from netpoll_poll_disable() and
__netpoll_cleanup()
Both functions run under RTNL, using netpoll_srcu adds confusion
and no additional protection.
Moreover the synchronize_srcu() call in __netpoll_cleanup() is
performed before clearing np->dev->npinfo, which violates RCU rules.
After this patch, netpoll_poll_disable() and netpoll_poll_enable()
simply use rtnl_dereference().
This saves a big chunk of memory (more than 192KB on platforms
with 512 cpus)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240905084909.2082486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When asynchronous encryption is used KTLS sends out the final data at
proto->close time. This becomes problematic when the task calling
close() receives a signal. In this case it can happen that
tcp_sendmsg_locked() called at close time returns -ERESTARTSYS and the
final data is not sent.
The described situation happens when KTLS is used in conjunction with
io_uring, as io_uring uses task_work_add() to add work to the current
userspace task. A discussion of the problem along with a reproducer can
be found in [1] and [2]
Fix this by waiting for the asynchronous encryption to be completed on
the final message. With this there is no data left to be sent at close
time.
[1] https://lore.kernel.org/all/20231010141932.GD3114228@pengutronix.de/
[2] https://lore.kernel.org/all/20240315100159.3898944-1-s.hauer@pengutronix.de/
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://patch.msgid.link/20240904-ktls-wait-async-v1-1-a62892833110@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-5-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240904093243.3345012-4-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-2-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently DFS works under assumption there could be only one channel
context in the hardware. Hence, drivers just calls the function
ieee80211_radar_detected() passing the hardware structure. However, with
MLO, this obviously will not work since number of channel contexts will be
more than one and hence drivers would need to pass the channel information
as well on which the radar is detected.
Also, when radar is detected in one of the links, other link's CAC should
not be cancelled.
Hence, in order to support DFS with MLO, do the following changes -
* Add channel context conf pointer as an argument to the function
ieee80211_radar_detected(). During MLO, drivers would have to pass on
which channel context conf radar is detected. Otherwise, drivers could
just pass NULL.
* ieee80211_radar_detected() will iterate over all channel contexts
present and
* if channel context conf is passed, only mark that as radar
detected
* if NULL is passed, then mark all channel contexts as radar
detected
* Then as usual, schedule the radar detected work.
* In the worker, go over all the contexts again and for all such context
which is marked with radar detected, cancel the ongoing CAC by calling
ieee80211_dfs_cac_cancel() and then notify cfg80211 via
cfg80211_radar_event().
* To cancel the CAC, pass the channel context as well where radar is
detected to ieee80211_dfs_cac_cancel(). This ensures that CAC is
canceled only on the links using the provided context, leaving other
links unaffected.
This would also help in scenarios where there is split phy 5 GHz radio,
which is capable of DFS channels in both lower and upper band. In this
case, simultaneous radars can be detected.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-9-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Now that all APIs have support to handle DFS per link, use proper link ID
instead of 0.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-8-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In order to support DFS with MLO, handle the link ID now passed from
cfg80211, adjust the code to do everything per link and call the
notifications to cfg80211 correctly.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-7-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Currently, during starting a radar detection, no link id information is
parsed and passed down. In order to support starting radar detection
during Multi Link Operation, it is required to pass link id as well.
Add changes to first parse and then pass link id in the start radar
detection path.
Additionally, update notification APIs to allow drivers/mac80211 to
pass the link ID.
However, everything is handled at link 0 only until all API's are ready to
handle it per link.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-6-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
A few members related to DFS handling are currently under per wireless
device data structure. However, in order to support DFS with MLO, there is
a need to have them on a per-link manner.
Hence, as a preliminary step, move members cac_started, cac_start_time
and cac_time_ms to be on a per-link basis.
Since currently, link ID is not known at all places, use default value of
0 for now.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-5-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
rdev_end_cac trace event is linked with wiphy_netdev_evt event class.
There is no option to pass link ID currently to wiphy_netdev_evt class.
A subsequent change would pass link ID to rdev_end_cac event and hence
it can no longer derive the event class from wiphy_netdev_evt.
Therefore, unlink rdev_end_cac event from wiphy_netdev_evt and define it's
own independent trace event. Link ID would be passed in subsequent change.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-4-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
After locks rework [1], ieee80211_start_radar_detection() function is no
longer acquiring any lock as such explicitly. Hence, it is not unlocking
anything as well. However, label "out_unlock" is still used which creates
confusion. Also, now there is no need of goto label as such.
Get rid of the goto logic and use direct return statements.
[1]: https://lore.kernel.org/all/20230828135928.b1c6efffe9ad.I4aec875e25abc9ef0b5ad1e70b5747fd483fbd3c@changeid/
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-3-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This reverts commit ce9e660ef32e ("wifi: mac80211: move radar detect work to sdata").
To enable radar detection with MLO, it’s essential to handle it on a
per-link basis. This is because when using MLO, multiple links may already
be active and beaconing. In this scenario, another link should be able to
initiate a radar detection. Also, if underlying links are associated with
different hardware devices but grouped together for MLO, they could
potentially start radar detection simultaneously. Therefore, it makes
sense to manage radar detection settings separately for each link by moving
them back to a per-link data structure.
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
Link: https://patch.msgid.link/20240906064426.2101315-2-quic_adisi@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|