Age | Commit message (Collapse) | Author |
|
Since the RDMA Read I/O state is now contained in the recv_ctxt,
svc_rdma_copy_inline_range() can use that recv_ctxt to derive the
read_info rather than the other way around. This removes another
usage of the ri_readctxt field, enabling its removal in a
subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the RDMA Read I/O state is now contained in the recv_ctxt,
svc_rdma_build_read_data_item() can use that recv_ctxt to derive
that information rather than the other way around. This removes
another usage of the ri_readctxt field, enabling its removal in a
subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the RDMA Read I/O state is now contained in the recv_ctxt,
svc_rdma_build_read_chunk_range() can use that recv_ctxt to derive
that information rather than the other way around. This removes
another usage of the ri_readctxt field, enabling its removal in a
subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the RDMA Read I/O state is now contained in the recv_ctxt,
svc_rdma_build_read_chunk() can use that recv_ctxt to derive that
information rather than the other way around. This removes another
usage of the ri_readctxt field, enabling its removal in a
subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the RDMA Read I/O state is now contained in the recv_ctxt,
svc_rdma_build_read_segment() can use the recv_ctxt to derive that
information rather than the other way around. This removes one usage
of the ri_readctxt field, enabling its removal in a subsequent
patch.
At the same time, the use of ri_rqst can similarly be replaced with
a passed-in function parameter.
Start with build_read_segment() because it is a common utility
function at the bottom of the Read chunk path.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Further clean up: move the starting byte offset field into
svc_rdma_recv_ctxt.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Further clean up: move the page index field into svc_rdma_recv_ctxt.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the request's svc_rdma_recv_ctxt will stay around for the
duration of the RDMA Read operation, the contents of struct
svc_rdma_read_info can reside in the request's svc_rdma_recv_ctxt
rather than being allocated separately. This will eventually save a
call to kmalloc() in a hot path.
Start this clean-up by moving the Read chunk's svc_rdma_chunk_ctxt.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Prepare for nestling these into the send and recv ctxts so they
no longer have to be allocated dynamically.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
In every instance, the pointer address in that field is now
available by other means.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma
field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma
field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma
field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma
field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Enable the removal of the svc_rdma_chunk_ctxt::cc_rdma field in a
subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
SG_CHUNK_SIZE is 128, making struct svc_rdma_rw_ctxt + the first
SGL array more than 4200 bytes in length, pushing the memory
allocation well into order 1.
Even so, the RDMA rw core doesn't seem to use more than max_send_sge
entries in that array (typically 32 or less), so that is all wasted
space.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
A send/recv_ctxt already records transport-related information
in the cq.id, thus there is no need to record the IP addresses of
the transport endpoints.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Update the DMA error flow tracepoints to report the completion ID of
the failing context. This ties the wait/failure to a particular
operation or request, which is more useful than knowing only the
failing transport.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Update the Send Queue's error flow tracepoints to report the
completion ID of the waiting or failing context. This ties the
wait/failure to a particular operation or request, which is a little
more useful than knowing only the transport that is about to close.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
De-duplicate some code, making it easier to add new tracepoints that
report only a completion ID.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Two svcrdma-related transport locks can become quite contended.
Collate their use and make them easy to find in /proc/lock_stat for
better observability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
There's no need to protect llist_entry() with a spin lock.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
DMA unmapping can take quite some time, so it should not be handled
in a single-threaded completion handler. Defer releasing write_info
structs to the recently-added workqueue.
With this patch, DMA unmapping can be handled in parallel, and it
does not cause head-of-queue blocking of Write completions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
DMA unmapping can take quite some time, so it should not be handled
in a single-threaded completion handler. Defer releasing send_ctxts
to the recently-added workqueue.
With this patch, DMA unmapping can be handled in parallel, and it
does not cause head-of-queue blocking of Send completions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
To handle work in the background, set up an UNBOUND workqueue for
svcrdma. Subsequent patches will make use of it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The original reason for allocating svc_rdma_recv_ctxt objects during
Receive completion was to ensure the objects were allocated on the
NUMA node closest to the underlying IB device.
Since commit c5d68d25bd6b ("svcrdma: Clean up allocation of
svc_rdma_recv_ctxt"), however, the device's favored node is
explicitly passed to the memory allocator.
To enable switching Receive completion to soft IRQ context, move
memory allocation out of completion handling, since it can be
costly, and it can sleep.
A limited number of objects is now allocated at "accept" time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The svc_rdma_recv_ctxt free list uses a lockless list to avoid the
need for a spin lock in the fast path. llist_del_first(), which is
used by svc_rdma_recv_ctxt_get(), requires serialization, however,
when there are multiple list producers that are unserialized.
I mistakenly thought there was only one caller of
svc_rdma_recv_ctxt_get() (svc_rdma_refresh_recvs()), thus explicit
serialization would not be necessary. But there is another caller:
svc_rdma_bc_sendto(), and these two are not serialized against each
other. I haven't seen ill effects that I could directly ascribe to
a lack of serialization. It's just an observation based on code
audit.
When DMA-mapping before sending a Reply, the passed-in struct
svc_rdma_recv_ctxt is used only for its write and reply PCLs. These
are currently always empty in the backchannel case. So, instead of
passing a full svc_rdma_recv_ctxt object to
svc_rdma_map_reply_msg(), let's pass in just the Write and Reply
PCLs.
This change makes it unnecessary for the backchannel to acquire a
dummy svc_rdma_recv_ctxt object when sending an RPC Call. The need
for svc_rdma_recv_ctxt free list serialization is now completely
avoided.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
This flag is no longer used.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
NFSD will use this new API to determine whether nfsd_splice_read is
safe to use. This avoids the need to add a dependency to NFSD for
CONFIG_SUNRPC_GSS.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
syzbot pointed out [1] that NEXTHDR_FRAGMENT handling is broken.
Reading frag_off can only be done if we pulled enough bytes
to skb->head. Currently we might access garbage.
[1]
BUG: KMSAN: uninit-value in ip6_tnl_parse_tlv_enc_lim+0x94f/0xbb0
ip6_tnl_parse_tlv_enc_lim+0x94f/0xbb0
ipxip6_tnl_xmit net/ipv6/ip6_tunnel.c:1326 [inline]
ip6_tnl_start_xmit+0xab2/0x1a70 net/ipv6/ip6_tunnel.c:1432
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564
__dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x569/0x660 net/core/neighbour.c:1592
neigh_output include/net/neighbour.h:542 [inline]
ip6_finish_output2+0x23a9/0x2b30 net/ipv6/ip6_output.c:137
ip6_finish_output+0x855/0x12b0 net/ipv6/ip6_output.c:222
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip6_output+0x323/0x610 net/ipv6/ip6_output.c:243
dst_output include/net/dst.h:451 [inline]
ip6_local_out+0xe9/0x140 net/ipv6/output_core.c:155
ip6_send_skb net/ipv6/ip6_output.c:1952 [inline]
ip6_push_pending_frames+0x1f9/0x560 net/ipv6/ip6_output.c:1972
rawv6_push_pending_frames+0xbe8/0xdf0 net/ipv6/raw.c:582
rawv6_sendmsg+0x2b66/0x2e70 net/ipv6/raw.c:920
inet_sendmsg+0x105/0x190 net/ipv4/af_inet.c:847
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at:
slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
slab_alloc_node mm/slub.c:3478 [inline]
__kmem_cache_alloc_node+0x5c9/0x970 mm/slub.c:3517
__do_kmalloc_node mm/slab_common.c:1006 [inline]
__kmalloc_node_track_caller+0x118/0x3c0 mm/slab_common.c:1027
kmalloc_reserve+0x249/0x4a0 net/core/skbuff.c:582
pskb_expand_head+0x226/0x1a00 net/core/skbuff.c:2098
__pskb_pull_tail+0x13b/0x2310 net/core/skbuff.c:2655
pskb_may_pull_reason include/linux/skbuff.h:2673 [inline]
pskb_may_pull include/linux/skbuff.h:2681 [inline]
ip6_tnl_parse_tlv_enc_lim+0x901/0xbb0 net/ipv6/ip6_tunnel.c:408
ipxip6_tnl_xmit net/ipv6/ip6_tunnel.c:1326 [inline]
ip6_tnl_start_xmit+0xab2/0x1a70 net/ipv6/ip6_tunnel.c:1432
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564
__dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x569/0x660 net/core/neighbour.c:1592
neigh_output include/net/neighbour.h:542 [inline]
ip6_finish_output2+0x23a9/0x2b30 net/ipv6/ip6_output.c:137
ip6_finish_output+0x855/0x12b0 net/ipv6/ip6_output.c:222
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip6_output+0x323/0x610 net/ipv6/ip6_output.c:243
dst_output include/net/dst.h:451 [inline]
ip6_local_out+0xe9/0x140 net/ipv6/output_core.c:155
ip6_send_skb net/ipv6/ip6_output.c:1952 [inline]
ip6_push_pending_frames+0x1f9/0x560 net/ipv6/ip6_output.c:1972
rawv6_push_pending_frames+0xbe8/0xdf0 net/ipv6/raw.c:582
rawv6_sendmsg+0x2b66/0x2e70 net/ipv6/raw.c:920
inet_sendmsg+0x105/0x190 net/ipv4/af_inet.c:847
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
__sys_sendmsg net/socket.c:2667 [inline]
__do_sys_sendmsg net/socket.c:2676 [inline]
__se_sys_sendmsg net/socket.c:2674 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
CPU: 0 PID: 7345 Comm: syz-executor.3 Not tainted 6.7.0-rc8-syzkaller-00024-gac865f00af29 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix rxrpc_cleanup_ring() to use rxrpc_purge_queue() rather than
skb_queue_purge() so that the count of outstanding skbuffs is correctly
updated when a failed call is cleaned up.
Without this rmmod may hang waiting for rxrpc_n_rx_skbs to become zero.
Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In fib_nl2rule(), 'err' variable has been set to -EINVAL during
declaration, and no need to set the 'err' variable to -EINVAL again.
So, remove it.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of using two bools derived from a flags passed as arguments to
the parent function of tc_action_load_ops, just pass the flags itself
to tc_action_load_ops to simplify its parameters.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RXFH input_xfrm currently has three supported values: 0 (clear all),
symmetric_xor and NO_CHANGE.
Reject any other value sent from user-space.
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20240104212653.394424-1-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-01-05
We've added 40 non-merge commits during the last 2 day(s) which contain
a total of 73 files changed, 1526 insertions(+), 951 deletions(-).
The main changes are:
1) Fix a memory leak when streaming AF_UNIX sockets were inserted
into multiple sockmap slots/maps, from John Fastabend.
2) Fix gotol in s390 BPF JIT with large offsets, from Ilya Leoshkevich.
3) Fix reattachment branch in bpf_tracing_prog_attach() and reject
the request if there is no valid attach_btf, from Jiri Olsa.
4) Remove deprecated bpfilter kernel leftovers given the project
is developed in user space (https://github.com/facebook/bpfilter),
from Quentin Deslandes.
5) Relax tracing BPF program recursive attach rules given right now
it is not possible to create tracing program call cycles,
from Dmitrii Dolgov.
6) Fix excessive memory consumption for the bpf_global_percpu_ma
for systems with a large number of CPUs, from Yonghong Song.
7) Small x86 BPF JIT cleanup to reuse emit_nops instead of open-coding
memcpy of x86_nops, from Leon Hwang.
8) Follow-up for libbpf to support __arg_ctx global function argument tag
semantics to complement the merged kernel side, from Andrii Nakryiko.
9) Introduce "volatile compare" macros for BPF selftests in order
to make the latter more robust against compiler optimization,
from Alexei Starovoitov.
10) Small simplification in verifier's size checking of helper accesses
along with additional selftests, from Andrei Matei.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (40 commits)
selftests/bpf: Test re-attachment fix for bpf_tracing_prog_attach
bpf: Fix re-attachment branch in bpf_tracing_prog_attach
selftests/bpf: Add test for recursive attachment of tracing progs
bpf: Relax tracing prog recursive attach rules
bpf, x86: Use emit_nops to replace memcpy x86_nops
selftests/bpf: Test gotol with large offsets
selftests/bpf: Double the size of test_loader log
s390/bpf: Fix gotol with large offsets
bpfilter: remove bpfilter
bpf: Remove unnecessary cpu == 0 check in memalloc
selftests/bpf: add __arg_ctx BTF rewrite test
selftests/bpf: add arg:ctx cases to test_global_funcs tests
libbpf: implement __arg_ctx fallback logic
libbpf: move BTF loading step after relocation step
libbpf: move exception callbacks assignment logic into relocation step
libbpf: use stable map placeholder FDs
libbpf: don't rely on map->fd as an indicator of map being created
libbpf: use explicit map reuse flag to skip map creation steps
libbpf: make uniform use of btf__fd() accessor inside libbpf
selftests/bpf: Add a selftest with > 512-byte percpu allocation size
...
====================
Link: https://lore.kernel.org/r/20240105170105.21070-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The existing code always pulls the IPv6 header and sets the transport
offset initially. Then optionally again pulls any extension headers in
ipv6_gso_pull_exthdrs and sets the transport offset again on return from
that call. skb->data is set at the start of the first extension header
before calling ipv6_gso_pull_exthdrs, and must disable the frag0
optimization because that function uses pskb_may_pull/pskb_pull instead of
skb_gro_ helpers. It sets the GRO offset to the TCP header with
skb_gro_pull and sets the transport header. Then returns skb->data to its
position before this block.
This commit introduces a new helper function - ipv6_gro_pull_exthdrs -
which is used in ipv6_gro_receive to pull ipv6 ext headers instead of
ipv6_gso_pull_exthdrs. Thus, there is no modification of skb->data, all
operations use skb_gro_* helpers, and the frag0 fast path can be taken for
IPv6 packets with ext headers.
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/504130f6-b56c-4dcc-882c-97942c59f5b7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit adds net_offload to IPv6 Hop-by-Hop extension headers (as it
is done for routing and dstopts) since it is supported in GSO and GRO.
This allows to remove specific HBH conditionals in GSO and GRO when
pulling and parsing an incoming packet.
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d4f8825a-1d55-4b12-9d67-a254dbbfa6ae@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
act_ct adds skb->users before defragmentation. If frags arrive in order,
the last frag's reference is reset in:
inet_frag_reasm_prepare
skb_morph
which is not straightforward.
However when frags arrive out of order, nobody unref the last frag, and
all frags are leaked. The situation is even worse, as initiating packet
capture can lead to a crash[0] when skb has been cloned and shared at the
same time.
Fix the issue by removing skb_get() before defragmentation. act_ct
returns TC_ACT_CONSUMED when defrag failed or in progress.
[0]:
[ 843.804823] ------------[ cut here ]------------
[ 843.809659] kernel BUG at net/core/skbuff.c:2091!
[ 843.814516] invalid opcode: 0000 [#1] PREEMPT SMP
[ 843.819296] CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G S 6.7.0-rc3 #2
[ 843.824107] Hardware name: XFUSION 1288H V6/BC13MBSBD, BIOS 1.29 11/25/2022
[ 843.828953] RIP: 0010:pskb_expand_head+0x2ac/0x300
[ 843.833805] Code: 8b 70 28 48 85 f6 74 82 48 83 c6 08 bf 01 00 00 00 e8 38 bd ff ff 8b 83 c0 00 00 00 48 03 83 c8 00 00 00 e9 62 ff ff ff 0f 0b <0f> 0b e8 8d d0 ff ff e9 b3 fd ff ff 81 7c 24 14 40 01 00 00 4c 89
[ 843.843698] RSP: 0018:ffffc9000cce07c0 EFLAGS: 00010202
[ 843.848524] RAX: 0000000000000002 RBX: ffff88811a211d00 RCX: 0000000000000820
[ 843.853299] RDX: 0000000000000640 RSI: 0000000000000000 RDI: ffff88811a211d00
[ 843.857974] RBP: ffff888127d39518 R08: 00000000bee97314 R09: 0000000000000000
[ 843.862584] R10: 0000000000000000 R11: ffff8881109f0000 R12: 0000000000000880
[ 843.867147] R13: ffff888127d39580 R14: 0000000000000640 R15: ffff888170f7b900
[ 843.871680] FS: 0000000000000000(0000) GS:ffff889ffffc0000(0000) knlGS:0000000000000000
[ 843.876242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 843.880778] CR2: 00007fa42affcfb8 CR3: 000000011433a002 CR4: 0000000000770ef0
[ 843.885336] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 843.889809] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 843.894229] PKRU: 55555554
[ 843.898539] Call Trace:
[ 843.902772] <IRQ>
[ 843.906922] ? __die_body+0x1e/0x60
[ 843.911032] ? die+0x3c/0x60
[ 843.915037] ? do_trap+0xe2/0x110
[ 843.918911] ? pskb_expand_head+0x2ac/0x300
[ 843.922687] ? do_error_trap+0x65/0x80
[ 843.926342] ? pskb_expand_head+0x2ac/0x300
[ 843.929905] ? exc_invalid_op+0x50/0x60
[ 843.933398] ? pskb_expand_head+0x2ac/0x300
[ 843.936835] ? asm_exc_invalid_op+0x1a/0x20
[ 843.940226] ? pskb_expand_head+0x2ac/0x300
[ 843.943580] inet_frag_reasm_prepare+0xd1/0x240
[ 843.946904] ip_defrag+0x5d4/0x870
[ 843.950132] nf_ct_handle_fragments+0xec/0x130 [nf_conntrack]
[ 843.953334] tcf_ct_act+0x252/0xd90 [act_ct]
[ 843.956473] ? tcf_mirred_act+0x516/0x5a0 [act_mirred]
[ 843.959657] tcf_action_exec+0xa1/0x160
[ 843.962823] fl_classify+0x1db/0x1f0 [cls_flower]
[ 843.966010] ? skb_clone+0x53/0xc0
[ 843.969173] tcf_classify+0x24d/0x420
[ 843.972333] tc_run+0x8f/0xf0
[ 843.975465] __netif_receive_skb_core+0x67a/0x1080
[ 843.978634] ? dev_gro_receive+0x249/0x730
[ 843.981759] __netif_receive_skb_list_core+0x12d/0x260
[ 843.984869] netif_receive_skb_list_internal+0x1cb/0x2f0
[ 843.987957] ? mlx5e_handle_rx_cqe_mpwrq_rep+0xfa/0x1a0 [mlx5_core]
[ 843.991170] napi_complete_done+0x72/0x1a0
[ 843.994305] mlx5e_napi_poll+0x28c/0x6d0 [mlx5_core]
[ 843.997501] __napi_poll+0x25/0x1b0
[ 844.000627] net_rx_action+0x256/0x330
[ 844.003705] __do_softirq+0xb3/0x29b
[ 844.006718] irq_exit_rcu+0x9e/0xc0
[ 844.009672] common_interrupt+0x86/0xa0
[ 844.012537] </IRQ>
[ 844.015285] <TASK>
[ 844.017937] asm_common_interrupt+0x26/0x40
[ 844.020591] RIP: 0010:acpi_safe_halt+0x1b/0x20
[ 844.023247] Code: ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 65 48 8b 04 25 00 18 03 00 48 8b 00 a8 08 75 0c 66 90 0f 00 2d 81 d0 44 00 fb f4 <fa> c3 0f 1f 00 89 fa ec 48 8b 05 ee 88 ed 00 a9 00 00 00 80 75 11
[ 844.028900] RSP: 0018:ffffc90000533e70 EFLAGS: 00000246
[ 844.031725] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0000000000000000
[ 844.034553] RDX: ffff889ffffc0000 RSI: ffffffff828b7f20 RDI: ffff88a090f45c64
[ 844.037368] RBP: ffff88a0901a2800 R08: ffff88a090f45c00 R09: 00000000000317c0
[ 844.040155] R10: 00ec812281150475 R11: ffff889fffff0e04 R12: ffffffff828b7fa0
[ 844.042962] R13: ffffffff828b7f20 R14: 0000000000000001 R15: 0000000000000000
[ 844.045819] acpi_idle_enter+0x7b/0xc0
[ 844.048621] cpuidle_enter_state+0x7f/0x430
[ 844.051451] cpuidle_enter+0x2d/0x40
[ 844.054279] do_idle+0x1d4/0x240
[ 844.057096] cpu_startup_entry+0x2a/0x30
[ 844.059934] start_secondary+0x104/0x130
[ 844.062787] secondary_startup_64_no_verify+0x16b/0x16b
[ 844.065674] </TASK>
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Tao Liu <taoliu828@163.com>
Link: https://lore.kernel.org/r/20231228081457.936732-1-taoliu828@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the CAIF sub-modules.
Link: https://lore.kernel.org/r/20240104144855.1320993-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add description to net/packet/af_packet.c
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240104144119.1319055-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the DSA tag modules.
The descriptions are copy/pasted Kconfig names, with s/^Tag/DSA tag/.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20240104143759.1318137-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the ATM modules and drivers.
Link: https://lore.kernel.org/r/20240104143737.1317945-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Inserting the device to block xarray in qdisc_create() is not suitable
place to do this. As it requires use of tcf_block() callback, it causes
multiple issues. It is called for all qdisc types, which is incorrect.
So, instead, move it to more suitable place, which is tcf_block_get_ext()
and make sure it is only done for qdiscs that use block infrastructure
and also only for blocks which are shared.
Symmetrically, alter the cleanup path, move the xarray entry removal
into tcf_block_put_ext().
Fixes: 913b47d3424e ("net/sched: Introduce tc block netdev tracking infra")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Closes: https://lore.kernel.org/all/ZY1hBb8GFwycfgvd@shredder/
Reported-by: Kui-Feng Lee <sinquersw@gmail.com>
Closes: https://lore.kernel.org/all/ce8d3e55-b8bc-409c-ace9-5cf1c4f7c88e@gmail.com/
Reported-and-tested-by: syzbot+84339b9e7330daae4d66@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/0000000000007c85f5060dcc3a28@google.com/
Reported-and-tested-by: syzbot+806b0572c8d06b66b234@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000082f2f2060dcc3a92@google.com/
Reported-and-tested-by: syzbot+0039110f932d438130f9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/0000000000007fbc8c060dcc3a5c@google.com/
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()")
0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL")
https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Just a couple of more things over the holidays:
- first kunit tests for both cfg80211 and mac80211
- a few multi-link fixes
- DSCP mapping update
- RCU fix
* tag 'wireless-next-2024-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next:
wifi: mac80211: remove redundant ML element check
wifi: cfg80211: parse all ML elements in an ML probe response
wifi: cfg80211: correct comment about MLD ID
wifi: cfg80211: Update the default DSCP-to-UP mapping
wifi: cfg80211: tests: add some scanning related tests
wifi: mac80211: kunit: extend MFP tests
wifi: mac80211: kunit: generalize public action test
wifi: mac80211: add kunit tests for public action handling
kunit: add a convenience allocation wrapper for SKBs
kunit: add parameter generation macro using description from array
wifi: mac80211: fix spelling typo in comment
wifi: cfg80211: fix RCU dereference in __cfg80211_bss_update
====================
Link: https://lore.kernel.org/r/20240103144423.52269-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and netfilter.
We haven't accumulated much over the break. If it wasn't for the
uninterrupted stream of fixes for Intel drivers this PR would be very
slim. There was a handful of user reports, however, either they stood
out because of the lower traffic or users have had more time to test
over the break. The ones which are v6.7-relevant should be wrapped up.
Current release - regressions:
- Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum
required", it caused issues on networks where routers send prefixes
with preferred_lft=0
- wifi:
- iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock
- mac80211: fix re-adding debugfs entries during reconfiguration
Current release - new code bugs:
- tcp: print AO/MD5 messages only if there are any keys
Previous releases - regressions:
- virtio_net: fix missing dma unmap for resize, prevent OOM
Previous releases - always broken:
- mptcp: prevent tcp diag from closing listener subflows
- nf_tables:
- set transport header offset for egress hook, fix IPv4 mangling
- skip set commit for deleted/destroyed sets, avoid double deactivation
- nat: make sure action is set for all ct states, fix openvswitch
matching on ICMP packets in related state
- eth: mlxbf_gige: fix receive hang under heavy traffic
- eth: r8169: fix PCI error on system resume for RTL8168FP
- net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling"
* tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
net/tcp: Only produce AO/MD5 logs if there are any keys
net: Implement missing SO_TIMESTAMPING_NEW cmsg support
bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
net: ravb: Wait for operating mode to be applied
asix: Add check for usbnet_get_endpoints
octeontx2-af: Re-enable MAC TX in otx2_stop processing
octeontx2-af: Always configure NIX TX link credits based on max frame size
net/smc: fix invalid link access in dumping SMC-R connections
net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
virtio_net: fix missing dma unmap for resize
igc: Fix hicredit calculation
ice: fix Get link status data length
i40e: Restore VF MSI-X state during PCI reset
i40e: fix use-after-free in i40e_aqc_add_filters()
net: Save and restore msg_namelen in sock_sendmsg
netfilter: nft_immediate: drop chain reference counter on error
netfilter: nf_nat: fix action not being set for all ct states
net: bcmgenet: Fix FCS generation for fragmented skbuffs
mptcp: prevent tcp diag from closing listener subflows
MAINTAINERS: add Geliang as reviewer for MPTCP
...
|
|
This reverts commit 32bb4515e34469975abc936deb0a116c4a445817.
This reverts commit d078d480639a4f3b5fc2d56247afa38e0956483a.
This reverts commit fcc4b105caa4b844bf043375bf799c20a9c99db1.
This reverts commit 345237dbc1bdbb274c9fb9ec38976261ff4a40b8.
This reverts commit 7db69ec9cfb8b4ab50420262631fb2d1908b25bf.
This reverts commit 95132a018f00f5dad38bdcfd4180d1af955d46f6.
This reverts commit 63d5eaf35ac36cad00cfb3809d794ef0078c822b.
This reverts commit c29451aefcb42359905d18678de38e52eccb3bb5.
This reverts commit 2ab0edb505faa9ac90dee1732571390f074e8113.
This reverts commit dedd702a35793ab462fce4c737eeba0badf9718e.
This reverts commit 034fcc210349b873ece7356905be5c6ca11eef2a.
This reverts commit 9c5625f559ad6fe9f6f733c11475bf470e637d34.
This reverts commit 02018c544ef113e980a2349eba89003d6f399d22.
Looks like we need more time for reviews, and incremental
changes will be hard to make sense of. So revert.
Link: https://lore.kernel.org/all/ZZP6FV5sXEf+xd58@shell.armlinux.org.uk/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next
Miquel Raynal says:
====================
This pull request mainly brings support for dynamic associations in
the WPAN world. Thanks to the recent improvements it was possible to
discover nearby devices, it is now also possible to associate with them
to form a sub-network using a specific PAN ID. The support includes
several functions, such as:
* Requesting an association to a coordinator, waiting for the response
* Sending a disassociation notification to a coordinator
* Receiving an association request when we are coordinator, answering
the request (for now all devices are accepted up to a limit, to be
refined)
* Sending a disassociation notification to a child
* Users may request the list of associated devices (the parent and the
children).
Here are a few example of userspace calls that can be made:
# iwpan dev <dev> associate pan_id 2 coord $COORD
# iwpan dev <dev> list_associations
# iwpan dev <dev> disassociate ext_addr $COORD
There are as well two patches from Uwe turning remove callbacks into
void functions.
* tag 'ieee802154-for-net-next-2023-12-20' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next:
mac802154: Avoid new associations while disassociating
ieee802154: Avoid confusing changes after associating
mac802154: Only allow PAN controllers to process association requests
mac802154: Use the PAN coordinator parameter when stamping packets
mac80254: Provide real PAN coordinator info in beacons
ieee802154: Give the user the association list
mac802154: Handle disassociation notifications from peers
mac802154: Follow the number of associated devices
ieee802154: Add support for limiting the number of associated devices
mac802154: Handle association requests from peers
mac802154: Handle disassociations
ieee802154: Add support for user disassociation requests
mac802154: Handle associating
ieee802154: Add support for user association requests
ieee802154: Internal PAN management
ieee802154: Let PAN IDs be reset
ieee802154: hwsim: Convert to platform remove callback returning void
ieee802154: fakelb: Convert to platform remove callback returning void
====================
Link: https://lore.kernel.org/r/20231220095556.4d9cef91@xps-13
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For backchannel requests that lookup the appropriate nfs_client, use the
state-management rpc_clnt's rpc_timeout parameters for the backchannel's
response. When the nfs_client cannot be found, fall back to using the
xprt's default timeout parameters.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on
the sending list"), any 4.1 backchannel tasks placed on the sending queue
would immediately return with -ETIMEDOUT since their req timers are zero.
Initialize the backchannel's rpc_rqst timeout parameters from the xprt's
default timeout settings.
Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|