Age | Commit message (Collapse) | Author |
|
Mechanical transformation, no logical changes intended.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft_counter_reset() resets the counter by subtracting the previously
retrieved value from the counter. This is a write operation on the
counter and as such it requires to be performed with a write sequence of
nft_counter_seq to serialize against its possible reader.
Update the packets/ bytes within write-sequence of nft_counter_seq.
Fixes: d84701ecbcd6a ("netfilter: nft_counter: rework atomic dump and reset")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The sequence counter nft_counter_seq is a per-CPU counter. There is no
lock associated with it. nft_counter_do_eval() is using the same counter
and disables BH which suggest that it can be invoked from a softirq.
This in turn means that nft_counter_offload_stats(), which disables only
preemption, can be interrupted by nft_counter_do_eval() leading to two
writer for one seqcount_t.
This can lead to loosing stats or reading statistics while they are
updated.
Disable BH during stats update in nft_counter_offload_stats() to ensure
one writer at a time.
Fixes: b72920f6e4a9d ("netfilter: nftables: counter hardware offload support")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Wen Gu says:
====================
net/smc: introduce ringbufs usage statistics
Currently, we have histograms that show the sizes of ringbufs that ever
used by SMC connections. However, they are always incremental and since
SMC allows the reuse of ringbufs, we cannot know the actual amount of
ringbufs being allocated or actively used.
So this patch set introduces statistics for the amount of ringbufs that
actually allocated by link group and actively used by connections of a
certain net namespace, so that we can react based on these memory usage
information, e.g. active fallback to TCP.
With appropriate adaptations of smc-tools, we can obtain these ringbufs
usage information:
$ smcr -d linkgroup
LG-ID : 00000500
LG-Role : SERV
LG-Type : ASYML
VLAN : 0
PNET-ID :
Version : 1
Conns : 0
Sndbuf : 12910592 B <-
RMB : 12910592 B <-
or
$ smcr -d stats
[...]
RX Stats
Data transmitted (Bytes) 869225943 (869.2M)
Total requests 18494479
Buffer usage (Bytes) 12910592 (12.31M) <-
[...]
TX Stats
Data transmitted (Bytes) 12760884405 (12.76G)
Total requests 36988338
Buffer usage (Bytes) 12910592 (12.31M) <-
[...]
[...]
Change log:
v3->v2
- use new helper nla_put_uint() instead of nla_put_u64_64bit().
v2->v1
https://lore.kernel.org/r/20240807075939.57882-1-guwen@linux.alibaba.com/
- remove inline keyword in .c files.
- use local variable in macros to avoid potential side effects.
v1
https://lore.kernel.org/r/20240805090551.80786-1-guwen@linux.alibaba.com/
====================
Link: https://patch.msgid.link/20240814130827.73321-1-guwen@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The buffer size histograms in smc_stats, namely rx/tx_rmbsize, record
the sizes of ringbufs for all connections that have ever appeared in
the net namespace. They are incremental and we cannot know the actual
ringbufs usage from these. So here introduces statistics for current
ringbufs usage of existing smc connections in the net namespace into
smc_stats, it will be incremented when new connection uses a ringbuf
and decremented when the ringbuf is unused.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently we have the statistics on sndbuf/RMB sizes of all connections
that have ever been on the link group, namely smc_stats_memsize. However
these statistics are incremental and since the ringbufs of link group
are allowed to be reused, we cannot know the actual allocated buffers
through these. So here introduces the statistic on actual allocated
ringbufs of the link group, it will be incremented when a new ringbuf is
added into buf_list and decremented when it is deleted from buf_list.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This avoids warning:
[ 0.118053] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
Caused by get_c0_compare_int on secondary CPU.
We also skipped saving IRQ number to struct clock_event_device *cd as
it's never used by clockevent core, as per comments it's only meant
for "non CPU local devices".
Reported-by: Serge Semin <fancer.lancer@gmail.com>
Closes: https://lore.kernel.org/linux-mips/6szkkqxpsw26zajwysdrwplpjvhl5abpnmxgu2xuj3dkzjnvsf@4daqrz4mf44k/
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Add an skb helper function to copy a range of bytes from within
an existing skb_seq_state.
Signed-off-by: Christian Hopps <chopps@labn.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We don't have sufficient information to debug:
https://github.com/koverstreet/bcachefs/issues/726
- print out durability of extent ptrs, when non default
- print the number of replicas we need in data_update_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
syzkaller reported UAF in kcm_release(). [0]
The scenario is
1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
2. Thread A resumes building skb from kcm->seq_skb but is blocked
by sk_stream_wait_memory()
3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
and puts the skb to the write queue
4. Thread A faces an error and finally frees skb that is already in the
write queue
5. kcm_release() does double-free the skb in the write queue
When a thread is building a MSG_MORE skb, another thread must not touch it.
Let's add a per-sk mutex and serialise kcm_sendmsg().
[0]:
BUG: KASAN: slab-use-after-free in __skb_unlink include/linux/skbuff.h:2366 [inline]
BUG: KASAN: slab-use-after-free in __skb_dequeue include/linux/skbuff.h:2385 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge include/linux/skbuff.h:3181 [inline]
BUG: KASAN: slab-use-after-free in kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
Read of size 8 at addr ffff0000ced0fc80 by task syz-executor329/6167
CPU: 1 PID: 6167 Comm: syz-executor329 Tainted: G B 6.8.0-rc5-syzkaller-g9abbc24128bc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:291
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:298
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x178/0x518 mm/kasan/report.c:488
kasan_report+0xd8/0x138 mm/kasan/report.c:601
__asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
__skb_unlink include/linux/skbuff.h:2366 [inline]
__skb_dequeue include/linux/skbuff.h:2385 [inline]
__skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
__skb_queue_purge include/linux/skbuff.h:3181 [inline]
kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
__sock_release net/socket.c:659 [inline]
sock_close+0xa4/0x1e8 net/socket.c:1421
__fput+0x30c/0x738 fs/file_table.c:376
____fput+0x20/0x30 fs/file_table.c:404
task_work_run+0x230/0x2e0 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x618/0x1f64 kernel/exit.c:871
do_group_exit+0x194/0x22c kernel/exit.c:1020
get_signal+0x1500/0x15ec kernel/signal.c:2893
do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Allocated by task 6166:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x40/0x78 mm/kasan/common.c:68
kasan_save_alloc_info+0x70/0x84 mm/kasan/generic.c:626
unpoison_slab_object mm/kasan/common.c:314 [inline]
__kasan_slab_alloc+0x74/0x8c mm/kasan/common.c:340
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc_node+0x204/0x4c0 mm/slub.c:3903
__alloc_skb+0x19c/0x3d8 net/core/skbuff.c:641
alloc_skb include/linux/skbuff.h:1296 [inline]
kcm_sendmsg+0x1d3c/0x2124 net/kcm/kcmsock.c:783
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
sock_sendmsg+0x220/0x2c0 net/socket.c:768
splice_to_socket+0x7cc/0xd58 fs/splice.c:889
do_splice_from fs/splice.c:941 [inline]
direct_splice_actor+0xec/0x1d8 fs/splice.c:1164
splice_direct_to_actor+0x438/0xa0c fs/splice.c:1108
do_splice_direct_actor fs/splice.c:1207 [inline]
do_splice_direct+0x1e4/0x304 fs/splice.c:1233
do_sendfile+0x460/0xb3c fs/read_write.c:1295
__do_sys_sendfile64 fs/read_write.c:1362 [inline]
__se_sys_sendfile64 fs/read_write.c:1348 [inline]
__arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1348
__invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Freed by task 6167:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x40/0x78 mm/kasan/common.c:68
kasan_save_free_info+0x5c/0x74 mm/kasan/generic.c:640
poison_slab_object+0x124/0x18c mm/kasan/common.c:241
__kasan_slab_free+0x3c/0x78 mm/kasan/common.c:257
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kmem_cache_free+0x15c/0x3d4 mm/slub.c:4363
kfree_skbmem+0x10c/0x19c
__kfree_skb net/core/skbuff.c:1109 [inline]
kfree_skb_reason+0x240/0x6f4 net/core/skbuff.c:1144
kfree_skb include/linux/skbuff.h:1244 [inline]
kcm_release+0x104/0x4c8 net/kcm/kcmsock.c:1685
__sock_release net/socket.c:659 [inline]
sock_close+0xa4/0x1e8 net/socket.c:1421
__fput+0x30c/0x738 fs/file_table.c:376
____fput+0x20/0x30 fs/file_table.c:404
task_work_run+0x230/0x2e0 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x618/0x1f64 kernel/exit.c:871
do_group_exit+0x194/0x22c kernel/exit.c:1020
get_signal+0x1500/0x15ec kernel/signal.c:2893
do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The buggy address belongs to the object at ffff0000ced0fc80
which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 0 bytes inside of
freed 240-byte region [ffff0000ced0fc80, ffff0000ced0fd70)
The buggy address belongs to the physical page:
page:00000000d35f4ae4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ed0f
flags: 0x5ffc00000000800(slab|node=0|zone=2|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 05ffc00000000800 ffff0000c1cbf640 fffffdffc3423100 dead000000000004
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff0000ced0fb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff0000ced0fc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff0000ced0fc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff0000ced0fd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
ffff0000ced0fd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60
Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240815220437.69511-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The check for '!to' is redundant here, since skb_can_coalesce() already
contains this check.
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1723730983-22912-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit a1ab24e5fc4a ("mptcp: consolidate sockopt synchronization")
removed the implementation but leave declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816100404.879598-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These are never implenmented since commit b691b1116e82 ("net/mlx5: Implement
devlink port function cmds to control ipsec_packet").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101550.881844-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is no caller and implementations in tree.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101638.882072-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit f13697cc7a19 ("gve: Switch to config-aware queue allocation")
convert this function to gve_rx_alloc_rings_gqi().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101906.882743-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the MCTP route input test, we're routing one skb, then (when delivery
is expected) checking the resulting routed skb.
However, we're currently checking the original skb length, rather than
the routed skb. Check the routed skb instead; the original will have
been freed at this point.
Fixes: 8892c0490779 ("mctp: Add route input to socket tests")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/kernel-janitors/4ad204f0-94cf-46c5-bdab-49592addf315@kili.mountain/
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816-mctp-kunit-skb-fix-v1-1-3c367ac89c27@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the netlink policy to validate IPv6 address length.
Destination address currently has policy for max len set,
and source has no policy validation. In both cases
the code does the real check. With correct policy
check the code can be removed.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240816212245.467745-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We need to read a u32 value (4 bytes), not size of a pointer to that
value.
Also, himax_read_mcu() wrapper is an overkill, remove it and use
himax_bus_read() directly.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408200301.Ujpj7Vov-lkp@intel.com/
Fixes: 0944829d491e ("Input: himax_hx83112b - implement MCU register reading")
Tested-by: Felix Kaechele <felix@kaechele.ca>
Link: https://lore.kernel.org/r/ZsPdmtfC54R7JVxR@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Michal Luczaj says:
====================
Series takes care of few bugs and missing features with the aim to improve
the test coverage of sockmap/sockhash.
Last patch is a create_pair() rewrite making use of
__attribute__((cleanup)) to handle socket fd lifetime.
---
Changes in v2:
- Rebase on bpf-next (Jakub)
- Use cleanup helpers from kernel's cleanup.h (Jakub)
- Fix subject of patch 3, rephrase patch 4, use correct prefix
- Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co
Changes in v1:
- No declarations in function body (Jakub)
- Don't touch output arguments until function succeeds (Jakub)
- Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Rewrite function to have (unneeded) socket descriptors automatically
close()d when leaving the scope. Make sure the "ownership" of fds is
correctly passed via take_fd(); i.e. descriptor returned to caller will
remain valid.
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-6-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Constants got switched reducing the test's coverage. Replace SOCK_DGRAM
with SOCK_STREAM in one of unix_inet_skb_redir_to_connected() tests.
Fixes: 51354f700d40 ("bpf, sockmap: Add af_unix test with both sockets in map")
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-5-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Do actually test the sotype as specified by the caller.
This picks up after commit 75e0e27db6cf ("selftest/bpf: Change udp to inet
in some function names").
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-4-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Replace implementation with a call to a generic function.
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-3-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Following create_pair() changes, remove unused function argument in
create_socket_pairs() and adapt its callers, i.e. drop the open-coded
loopback socket creation.
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-2-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Extend the function to allow creating socket pairs of SOCK_STREAM,
SOCK_DGRAM and SOCK_SEQPACKET.
Adapt direct callers and leave further cleanups for the following patch.
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20240731-selftest-sockmap-fixes-v2-1-08a0c73abed2@rbox.co
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- memory corruption fixes for hid-cougar (Camila Alvarez) and
hid-amd_sfh (Olivier Sobrie)
- fix for regression in Wacom driver of twist gesture handling (Jason
Gerecke)
- two new device IDs for hid-multitouch (Dmitry Savin) and hid-asus
(Luke D. Jones)
* tag 'hid-for-linus-2024081901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: Defer calculation of resolution until resolution_code is known
HID: multitouch: Add support for GT7868Q
HID: amd_sfh: free driver_data after destroying hid device
hid-asus: add ROG Ally X prod ID to quirk list
HID: cougar: fix slab-out-of-bounds Read in cougar_report_fixup
|
|
Exec_queue cleanup requires HW access, so we need to use devm instead of
drmm for it.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 5a891a0e69f134f53cc91b409f38e5ea1cafaf0a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The BO cleanup touches the GGTT and therefore requires the HW to be
available, so we need to use devm instead of drmm.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1160
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809231237.1503796-2-daniele.ceraolospurio@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 8d3a2d3d766a823c7510cdc17e6ff7c042c63b61)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Wa_14021821874 applies to xe2_hpg
V2(Himal):
- Use space after define
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812134117.813670-1-tejas.upadhyay@intel.com
(cherry picked from commit 21ff3a16e92e2fa4f906a61d148aca1423c58298)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
This WA is applied while initializing the media GT, but it a primary
GT WA (because it modifies a register on the primary GT), so the XE_WA
macro is returning false even when the WA should be applied.
Fix this by using the primary GT in the macro.
Note that this WA only applies to PXP and we don't yet support that in
Xe, so there are no negative effects to this bug, which is why we didn't
see any errors in testing.
v2: use the primary GT in the macro instead of marking the WA as
platform-wide (Lucas, Matt).
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240807235333.1370915-1-daniele.ceraolospurio@intel.com
(cherry picked from commit e422c0bfd9e47e399e86bcc483f49d8b54064fc2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Wa_15015404425 asks us to perform four "dummy" writes to a
non-existent register offset before every real register read.
Although the specific offset of the writes doesn't directly
matter, the workaround suggests offset 0x130030 as a good target
so that these writes will be easy to recognize and filter out in
debugging traces.
V5(MattR):
- Avoid negating an equality comparison
V4(MattR):
- Use writel and remove xe_reg usage
V3(MattR):
- Define dummy reg local to function
- Avoid tracing dummy writes
- Update commit message
V2:
- Add WA to 8/16/32bit reads also - MattR
- Corrected dummy reg address - MattR
- Use for loop to avoid mental pause - JaniN
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240709155606.2998941-1-tejas.upadhyay@intel.com
(cherry picked from commit 86c5b70a9c0c3f05f7002ef8b789460c96b54e27)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Issuing the flush on top of an ongoing flush is not desirable.
Lets use lock to make it sequential.
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 71733b8d7f50b61403f940c6c9745fb3a9b98dcb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
workaround 14021402888 also applies to Xe2_LPG.
Replicate the existing entry to one specific for Xe2_LPG.
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703090754.1323647-1-krishnaiah.bommu@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 56ab6986992ba143aee0bda33e15a764343e271d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Wa_16021639441 applies to Xe2_LPM.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240701184637.531794-1-ngai-mint.kwan@linux.intel.com
(cherry picked from commit 74e3076800067c6dc0dcff5b75344cec064c20eb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
This involves enabling l2 caching of host side memory access to VRAM
through the CPU BAR. The main fallout here is with display since VRAM
writes from CPU can now be cached in GPU l2, and display is never
coherent with caches, so needs various manual flushing. In the case of
fbc we disable it due to complications in getting this to work
correctly (in a later patch).
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
(cherry picked from commit 01570b446939c3538b1aa3d059837f49fa14a3ae)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We recently moved the teardown of the vgic part of a vcpu inside
a critical section guarded by the config_lock. This teardown phase
involves calling into kvm_io_bus_unregister_dev(), which takes the
kvm->srcu lock.
However, this violates the established order where kvm->srcu is
taken on a memory fault (such as an MMIO access), possibly
followed by taking the config_lock if the GIC emulation requires
mutual exclusion from the other vcpus.
It therefore results in a bad lockdep splat, as reported by Zenghui.
Fix this by moving the call to kvm_io_bus_unregister_dev() outside
of the config_lock critical section. At this stage, there shouln't
be any need to hold the config_lock.
As an additional bonus, document the ordering between kvm->slots_lock,
kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't
know about those anymore.
Fixes: 9eb18136af9f ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20240819125045.3474845-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
If there were LPIs being mapped behind our back (i.e., between .start() and
.stop()), we would put them at iter_unmark_lpis() without checking if they
were actually *marked*, which is obviously not good.
Switch to use the xa_for_each_marked() iterator to fix it.
Cc: stable@vger.kernel.org
Fixes: 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240817101541.1664-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
This patch is to move nf_ct_netns_get() out of nf_conncount_init()
and let the consumers of nf_conncount decide if they want to turn
on netfilter conntrack.
It makes nf_conncount more flexible to be used in other places and
avoids netfilter conntrack turned on when using it in openvswitch
conntrack.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
pipapo set backend maintains two copies of the datastructure, removing
the elements from the copy that is going to be discarded slows down
the abort path significantly, from several minutes to few seconds after
this patch.
This patch was previously reverted by
f86fb94011ae ("netfilter: nf_tables: revert do not remove elements if set backend implements .abort")
but it is now possible since recent work by Florian Westphal to perform
on-demand clone from insert/remove path:
532aec7e878b ("netfilter: nft_set_pipapo: remove dirty flag")
3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
a238106703ab ("netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone")
c5444786d0ea ("netfilter: nft_set_pipapo: merge deactivate helper into caller")
6c108d9bee44 ("netfilter: nft_set_pipapo: prepare walk function for on-demand clone")
8b8a2417558c ("netfilter: nft_set_pipapo: prepare destroy function for on-demand clone")
80efd2997fb9 ("netfilter: nft_set_pipapo: make pipapo_clone helper return NULL")
a590f4760922 ("netfilter: nft_set_pipapo: move prove_locking helper around")
after this series, the clone is fully released once aborted, no need to
take it back to previous state. Thus, no stale reference to elements can
occur.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft_set_lookup_byid() is very slow when transaction becomes large, due to
walk of the transaction list.
Add a dedicated list that contains only the new sets.
Before: nft -f ruleset 0.07s user 0.00s system 0% cpu 1:04.84 total
After: nft -f ruleset 0.07s user 0.00s system 0% cpu 30.115 total
.. where ruleset contains ~10 sets with ~100k elements.
The above number is for a combined flush+reload of the ruleset.
With previous flush, even the first NEWELEM has to walk through a few
hundred thousands of DELSET(ELEM) transactions before the first NEWSET
object. To cope with random-order-newset-newsetelem we'd need to replace
commit_set_list with a hashtable.
Expectation is that a NEWELEM operation refers to the most recently added
set, so last entry of the dedicated list should be the set we want.
NB: This is not a bug fix per se (functionality is fine), but with
larger transaction batches list search takes forever, so it would be
nice to speed this up for -stable too, hence adding a "fixes" tag.
Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Use consume_skb in the batch code path to avoid generating spurious
NOT_SPECIFIED skb drop reasons.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Test that nfqueue with and without GSO process SCTP packets correctly.
Joint work with Florian and Pablo.
Signed-off-by: Antonio Ojea <aojea@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
when packet is enqueued with nfqueue and GSO is enabled, checksum
calculation has to take into account the protocol, as SCTP uses a
32 bits CRC checksum.
Enter skb_gso_segment() path in case of SCTP GSO packets because
skb_zerocopy() does not support for GSO_BY_FRAGS.
Joint work with Pablo.
Signed-off-by: Antonio Ojea <aojea@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
- Do not block printk on non-panic CPUs when they are dumping
backtraces
* tag 'printk-for-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/panic: Allow cpu backtraces to be written into ringbuffer during panic
|
|
These new trace points record xarray indices and the time of
endpoint registration and unregistration, to co-ordinate with
device removal events.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Nit: The built-in xa_limit_32b range starts at 0, but
XA_FLAGS_ALLOC1 configures the xarray's allocator to start at 1.
Adopt the more conventional XA_FLAGS_ALLOC because there's no
mechanical reason to skip 0.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
If the device's reference count is too high, the device completion
callback never fires.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Function bdev_get_queue() must not return NULL, so drop the check in
bdev_write_zeroes_sectors().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Link: https://lore.kernel.org/r/20240815163228.216051-3-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As reported in [0], we may get a hang when formatting a XFS FS on a RAID0
drive.
Commit 73a768d5f955 ("block: factor out a blk_write_zeroes_limit helper")
changed __blkdev_issue_write_zeroes() to read the max write zeroes
value in the loop. This is not safe as max write zeroes may change in
value. Specifically for the case of [0], the value goes to 0, and we get
an infinite loop.
Lift the limit reading out of the loop.
[0] https://lore.kernel.org/linux-xfs/4d31268f-310b-4220-88a2-e191c3932a82@oracle.com/T/#t
Fixes: 73a768d5f955 ("block: factor out a blk_write_zeroes_limit helper")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240815163228.216051-2-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Its possible that two threads call tcp_sk_exit_batch() concurrently,
once from the cleanup_net workqueue, once from a task that failed to clone
a new netns. In the latter case, error unwinding calls the exit handlers
in reverse order for the 'failed' netns.
tcp_sk_exit_batch() calls tcp_twsk_purge().
Problem is that since commit b099ce2602d8 ("net: Batch inet_twsk_purge"),
this function picks up twsk in any dying netns, not just the one passed
in via exit_batch list.
This means that the error unwind of setup_net() can "steal" and destroy
timewait sockets belonging to the exiting netns.
This allows the netns exit worker to proceed to call
WARN_ON_ONCE(!refcount_dec_and_test(&net->ipv4.tcp_death_row.tw_refcount));
without the expected 1 -> 0 transition, which then splats.
At same time, error unwind path that is also running inet_twsk_purge()
will splat as well:
WARNING: .. at lib/refcount.c:31 refcount_warn_saturate+0x1ed/0x210
...
refcount_dec include/linux/refcount.h:351 [inline]
inet_twsk_kill+0x758/0x9c0 net/ipv4/inet_timewait_sock.c:70
inet_twsk_deschedule_put net/ipv4/inet_timewait_sock.c:221
inet_twsk_purge+0x725/0x890 net/ipv4/inet_timewait_sock.c:304
tcp_sk_exit_batch+0x1c/0x170 net/ipv4/tcp_ipv4.c:3522
ops_exit_list+0x128/0x180 net/core/net_namespace.c:178
setup_net+0x714/0xb40 net/core/net_namespace.c:375
copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
... because refcount_dec() of tw_refcount unexpectedly dropped to 0.
This doesn't seem like an actual bug (no tw sockets got lost and I don't
see a use-after-free) but as erroneous trigger of debug check.
Add a mutex to force strict ordering: the task that calls tcp_twsk_purge()
blocks other task from doing final _dec_and_test before mutex-owner has
removed all tw sockets of dying netns.
Fixes: e9bd0cca09d1 ("tcp: Don't allocate tcp_death_row outside of struct netns_ipv4.")
Reported-by: syzbot+8ea26396ff85d23a8929@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/0000000000003a5292061f5e4e19@google.com/
Link: https://lore.kernel.org/netdev/20240812140104.GA21559@breakpoint.cc/
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240812222857.29837-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|