Reject the usage of the SA_DIR attribute in xfrm netlink messages when
it's not applicable. This ensures that SA_DIR is only accepted for
certain message types (NEWSA, UPDSA, and ALLOCSPI).
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
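
The check described in this commit can be illustrated with a small userspace sketch. The enum values, struct msg, and both function names are hypothetical stand-ins, not the kernel's definitions; only the rule itself (SA_DIR is accepted for NEWSA, UPDSA and ALLOCSPI only) comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message types; the real XFRM_MSG_* values live in <linux/xfrm.h>. */
enum msg_type { MSG_NEWSA, MSG_DELSA, MSG_GETSA, MSG_UPDSA, MSG_ALLOCSPI, MSG_NEWPOLICY };

/* Hypothetical parsed message: its type and whether an SA_DIR attribute was present. */
struct msg { enum msg_type type; bool has_sa_dir; };

/* Return true if a message of this type may carry an SA direction attribute. */
bool sa_dir_allowed(enum msg_type type)
{
    switch (type) {
    case MSG_NEWSA:
    case MSG_UPDSA:
    case MSG_ALLOCSPI:
        return true;
    default:
        return false;
    }
}

/* Return 0 if the message is acceptable, -1 if SA_DIR is present but not applicable. */
int check_sa_dir(const struct msg *m)
{
    assert(m != NULL);
    if (m->has_sa_dir && !sa_dir_allowed(m->type))
        return -1;
    return 0;
}
```

A message without the attribute passes regardless of its type; only the combination of "attribute present" and "type not in the list" is rejected.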
|
|
This patch introduces the 'dir' attribute, 'in' or 'out', to the
xfrm_state, SA, enhancing usability by delineating the scope of values
based on direction. An input SA accepts only values pertinent to input,
effectively segregating them from output-related values.
Likewise, an output SA accepts only attributes for output. This change aims
to streamline the configuration process and improve the overall
consistency of SA attributes during configuration.
This feature sets the groundwork for future patches, including
the upcoming IP-TFS patch.
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
net/core/page_pool_user.c
0b11b1c5c320 ("netdev: let netlink core handle -EMSGSIZE errors")
429679dcf7d9 ("page_pool: fix netlink dump stop/resume")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2024-03-06
1) Clear the ECN bits flowi4_tos in decode_session4().
This was already fixed but the bug was reintroduced
when decode_session4() switched to use the flow dissector.
From Guillaume Nault.
2) Fix UDP encapsulation in the TX path with packet offload mode.
From Leon Romanovsky.
3) Avoid clang fortify warning in copy_to_user_tmpl().
From Nathan Chancellor.
4) Fix inter address family tunnel in packet offload mode.
From Mike Yu.
* tag 'ipsec-2024-03-06' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: set skb control buffer based on packet offload as well
xfrm: fix xfrm child route lookup for packet offload
xfrm: Avoid clang fortify warning in copy_to_user_tmpl()
xfrm: Pass UDP encapsulation in TX packet offload
xfrm: Clear low order bits of ->flowi4_tos in decode_session4().
====================
Link: https://lore.kernel.org/r/20240306100438.3953516-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After a couple recent changes in LLVM, there is a warning (or error with
CONFIG_WERROR=y or W=e) from the compile time fortify source routines,
specifically the memset() in copy_to_user_tmpl().
In file included from net/xfrm/xfrm_user.c:14:
...
include/linux/fortify-string.h:438:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
438 | __write_overflow_field(p_size_field, size);
| ^
1 error generated.
While ->xfrm_nr has been validated against XFRM_MAX_DEPTH when its value
is first assigned in copy_templates() by calling validate_tmpl() first
(so there should not be any issue in practice), LLVM/clang cannot really
deduce that across the boundaries of these functions. Without that
knowledge, it cannot assume that the loop stops before i is greater than
XFRM_MAX_DEPTH, which would indeed result in a stack buffer overflow in the
memset().
To make the bounds of ->xfrm_nr clear to the compiler and add additional
defense in case copy_to_user_tmpl() is ever used in a path where
->xfrm_nr has not been properly validated against XFRM_MAX_DEPTH first,
add an explicit bound check and early return, which clears up the
warning.
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1985
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
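
The pattern described above, an explicit bound check and early return ahead of a loop that writes into a fixed-size stack array, might look roughly like this in plain C. MAX_DEPTH, struct tmpl, struct tmpl_policy, and copy_templates_out() are hypothetical names for the sketch; the actual kernel function and its error code are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 6   /* stands in for XFRM_MAX_DEPTH in this sketch */

struct tmpl { int id; int mode; };                          /* hypothetical template record */
struct tmpl_policy { int nr; struct tmpl vec[MAX_DEPTH]; }; /* hypothetical policy */

/* Copy the policy's templates into out[]. Returns the number copied,
 * or -1 when nr lies outside [0, MAX_DEPTH]. */
int copy_templates_out(const struct tmpl_policy *p, struct tmpl out[MAX_DEPTH])
{
    struct tmpl tmp[MAX_DEPTH];
    int i;

    assert(p != NULL && out != NULL);

    /* Explicit bound check: makes the loop limit visible to the compiler
     * and protects callers that skipped validation. */
    if (p->nr < 0 || p->nr > MAX_DEPTH)
        return -1;

    for (i = 0; i < p->nr; i++) {
        memset(&tmp[i], 0, sizeof(tmp[i]));   /* zero each slot before filling it */
        tmp[i].id = p->vec[i].id;
        tmp[i].mode = p->vec[i].mode;
    }
    memcpy(out, tmp, sizeof(tmp[0]) * (size_t)p->nr);
    return p->nr;
}
```

With the check in place, the compiler can see that i never reaches MAX_DEPTH inside the loop, which is the knowledge it could not derive across function boundaries.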
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
net/core/dev.c
9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()")
723de3ebef03 ("net: free altname using an RCU callback")
net/unix/garbage.c
11498715f266 ("af_unix: Remove io_uring code for GC.")
25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.")
drivers/net/ethernet/renesas/ravb_main.c
ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path")
c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")
net/mptcp/protocol.c
bdd70eb68913 ("mptcp: drop the push_pending field")
28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the XFRM interface drivers.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240208164244.3818498-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to allow drivers to fill all statistics, change the name
of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats.
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The policy memory was released, but the HW driver data was not. Add a
call to xfrm_dev_policy_delete() so that drivers have a chance
to release their resources.
Fixes: 919e43fad516 ("xfrm: add an interface to offload policy")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The previous commit 4e484b3e969b ("xfrm: rate limit SA mapping change
message to user space") added one additional attribute named
XFRMA_MTIMER_THRESH and described its type at compat_policy
(net/xfrm/xfrm_compat.c).
However, the author forgot to also describe the nla_policy at
xfrma_policy (net/xfrm/xfrm_user.c). Hence, this supposedly NLA_U32 (4
bytes) value can be faked as empty (0 bytes) by a malicious user, which
leads to a 4-byte out-of-bounds read and heap information leak when
parsing nlattrs.
To exploit this, one malicious user can spray the SLUB objects and then
leverage this 4 bytes OOB read to leak the heap data into
x->mapping_maxage (see xfrm_update_ae_params(...)), and leak it to
userspace via copy_to_user_state_extra(...).
The above bug is assigned CVE-2023-3773. To fix it, this commit just
completes the nla_policy description for XFRMA_MTIMER_THRESH, which
enforces the length check and avoids such OOB read.
Fixes: 4e484b3e969b ("xfrm: rate limit SA mapping change message to user space")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
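
To illustrate why a missing policy entry matters, here is a minimal userspace sketch of a per-attribute length table. The attribute ids, struct attr, min_len[], and attr_get_u32() are all hypothetical; this is not the kernel's nla_policy machinery. The sketch also chooses to fail closed on attributes that have no entry, which is a design choice of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ATTR_UNSPEC, ATTR_MTIMER_THRESH, ATTR_MAX };   /* hypothetical attribute ids */

/* Hypothetical parsed attribute: type, payload length, payload pointer. */
struct attr { uint16_t type; uint16_t len; const void *data; };

/* Minimum payload length per attribute; 0 means "no policy entry". */
static const uint16_t min_len[ATTR_MAX] = {
    [ATTR_MTIMER_THRESH] = sizeof(uint32_t),
};

/* Read a u32 attribute. Returns 0 on success, -1 if the type has no
 * policy entry or the payload is shorter than the policy requires. */
int attr_get_u32(const struct attr *a, uint32_t *out)
{
    assert(a != NULL && out != NULL);
    if (a->type >= ATTR_MAX || min_len[a->type] == 0)
        return -1;
    if (a->len < min_len[a->type])
        return -1;                       /* an empty payload is rejected here */
    memcpy(out, a->data, sizeof(*out));  /* safe: at least 4 bytes are present */
    return 0;
}
```

Without the length comparison, the memcpy() would read 4 bytes from a payload that the sender declared as 0 bytes long, which is the shape of the out-of-bounds read described above.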
|
|
Normally, x->replay_esn and x->preplay_esn should be allocated at
xfrm_alloc_replay_state_esn(...) in xfrm_state_construct(...), hence the
xfrm_update_ae_params(...) is okay to update them. However, the current
implementation of xfrm_new_ae(...) allows a malicious user to directly
dereference a NULL pointer and crash the kernel like below.
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 8253067 P4D 8253067 PUD 8e0e067 PMD 0
Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 98 Comm: poc.npd Not tainted 6.4.0-rc7-00072-gdad9774deaf1 #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.o4
RIP: 0010:memcpy_orig+0xad/0x140
Code: e8 4c 89 5f e0 48 8d 7f e0 73 d2 83 c2 20 48 29 d6 48 29 d7 83 fa 10 72 34 4c 8b 06 4c 8b 4e 08 c
RSP: 0018:ffff888008f57658 EFLAGS: 00000202
RAX: 0000000000000000 RBX: ffff888008bd0000 RCX: ffffffff8238e571
RDX: 0000000000000018 RSI: ffff888007f64844 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888008f57818
R13: ffff888007f64aa4 R14: 0000000000000000 R15: 0000000000000000
FS: 00000000014013c0(0000) GS:ffff88806d600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000054d8000 CR4: 00000000000006f0
Call Trace:
<TASK>
? __die+0x1f/0x70
? page_fault_oops+0x1e8/0x500
? __pfx_is_prefetch.constprop.0+0x10/0x10
? __pfx_page_fault_oops+0x10/0x10
? _raw_spin_unlock_irqrestore+0x11/0x40
? fixup_exception+0x36/0x460
? _raw_spin_unlock_irqrestore+0x11/0x40
? exc_page_fault+0x5e/0xc0
? asm_exc_page_fault+0x26/0x30
? xfrm_update_ae_params+0xd1/0x260
? memcpy_orig+0xad/0x140
? __pfx__raw_spin_lock_bh+0x10/0x10
xfrm_update_ae_params+0xe7/0x260
xfrm_new_ae+0x298/0x4e0
? __pfx_xfrm_new_ae+0x10/0x10
? __pfx_xfrm_new_ae+0x10/0x10
xfrm_user_rcv_msg+0x25a/0x410
? __pfx_xfrm_user_rcv_msg+0x10/0x10
? __alloc_skb+0xcf/0x210
? stack_trace_save+0x90/0xd0
? filter_irq_stacks+0x1c/0x70
? __stack_depot_save+0x39/0x4e0
? __kasan_slab_free+0x10a/0x190
? kmem_cache_free+0x9c/0x340
? netlink_recvmsg+0x23c/0x660
? sock_recvmsg+0xeb/0xf0
? __sys_recvfrom+0x13c/0x1f0
? __x64_sys_recvfrom+0x71/0x90
? do_syscall_64+0x3f/0x90
? entry_SYSCALL_64_after_hwframe+0x72/0xdc
? copyout+0x3e/0x50
netlink_rcv_skb+0xd6/0x210
? __pfx_xfrm_user_rcv_msg+0x10/0x10
? __pfx_netlink_rcv_skb+0x10/0x10
? __pfx_sock_has_perm+0x10/0x10
? mutex_lock+0x8d/0xe0
? __pfx_mutex_lock+0x10/0x10
xfrm_netlink_rcv+0x44/0x50
netlink_unicast+0x36f/0x4c0
? __pfx_netlink_unicast+0x10/0x10
? netlink_recvmsg+0x500/0x660
netlink_sendmsg+0x3b7/0x700
This NULL pointer dereference is assigned CVE-2023-3772. This commit
adds an additional NULL check in xfrm_update_ae_params() to fix it.
Fixes: d8647b79c3b7 ("xfrm: Add user interface for esn and big anti-replay windows")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
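
A simplified sketch of the guard: the destination buffer is checked before it is written. struct replay_esn, struct esn_state, and update_esn() are hypothetical and much smaller than the kernel structures. Whether to skip the update silently or to return an error is a choice made by this sketch, not a statement about the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct replay_esn { unsigned int seq; unsigned int window; };   /* hypothetical, simplified */
struct esn_state  { struct replay_esn *replay_esn; };           /* NULL when ESN is not in use */

/* Apply a user-supplied ESN update. Returns 0 on success or when there is
 * nothing to apply, -1 if the state has no ESN buffer to receive it. */
int update_esn(struct esn_state *x, const struct replay_esn *from_user)
{
    assert(x != NULL);
    if (from_user == NULL)
        return 0;      /* attribute absent: nothing to update */
    if (x->replay_esn == NULL)
        return -1;     /* state created without ESN: refuse instead of crashing */
    memcpy(x->replay_esn, from_user, sizeof(*from_user));
    return 0;
}
```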
|
|
Looking at all the code that consumes attrs[XFRMA_SEC_CTX], such as
* verify_sec_ctx_len(), convert to xfrm_user_sec_ctx*
* xfrm_state_construct(), call security_xfrm_state_alloc whose prototype
is int security_xfrm_state_alloc(.., struct xfrm_user_sec_ctx *sec_ctx);
* copy_from_user_sec_ctx(), convert to xfrm_user_sec_ctx *
...
It seems that the expected parsing result for XFRMA_SEC_CTX should be
structure xfrm_user_sec_ctx, and the current xfrm_sec_ctx is confusing
and misleading (luckily, both happen to have the same size, 8 bytes).
This commit amends the policy structure to xfrm_user_sec_ctx to avoid
ambiguity.
Fixes: cf5cb79f6946 ("[XFRM] netlink: Establish an attribute policy")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We found below OOB crash:
[ 44.211730] ==================================================================
[ 44.212045] BUG: KASAN: slab-out-of-bounds in memcmp+0x8b/0xb0
[ 44.212045] Read of size 8 at addr ffff88800870f320 by task poc.xfrm/97
[ 44.212045]
[ 44.212045] CPU: 0 PID: 97 Comm: poc.xfrm Not tainted 6.4.0-rc7-00072-gdad9774deaf1-dirty #4
[ 44.212045] Call Trace:
[ 44.212045] <TASK>
[ 44.212045] dump_stack_lvl+0x37/0x50
[ 44.212045] print_report+0xcc/0x620
[ 44.212045] ? __virt_addr_valid+0xf3/0x170
[ 44.212045] ? memcmp+0x8b/0xb0
[ 44.212045] kasan_report+0xb2/0xe0
[ 44.212045] ? memcmp+0x8b/0xb0
[ 44.212045] kasan_check_range+0x39/0x1c0
[ 44.212045] memcmp+0x8b/0xb0
[ 44.212045] xfrm_state_walk+0x21c/0x420
[ 44.212045] ? __pfx_dump_one_state+0x10/0x10
[ 44.212045] xfrm_dump_sa+0x1e2/0x290
[ 44.212045] ? __pfx_xfrm_dump_sa+0x10/0x10
[ 44.212045] ? __kernel_text_address+0xd/0x40
[ 44.212045] ? kasan_unpoison+0x27/0x60
[ 44.212045] ? mutex_lock+0x60/0xe0
[ 44.212045] ? __pfx_mutex_lock+0x10/0x10
[ 44.212045] ? kasan_save_stack+0x22/0x50
[ 44.212045] netlink_dump+0x322/0x6c0
[ 44.212045] ? __pfx_netlink_dump+0x10/0x10
[ 44.212045] ? mutex_unlock+0x7f/0xd0
[ 44.212045] ? __pfx_mutex_unlock+0x10/0x10
[ 44.212045] __netlink_dump_start+0x353/0x430
[ 44.212045] xfrm_user_rcv_msg+0x3a4/0x410
[ 44.212045] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 44.212045] ? __pfx_xfrm_user_rcv_msg+0x10/0x10
[ 44.212045] ? __pfx_xfrm_dump_sa+0x10/0x10
[ 44.212045] ? __pfx_xfrm_dump_sa_done+0x10/0x10
[ 44.212045] ? __stack_depot_save+0x382/0x4e0
[ 44.212045] ? filter_irq_stacks+0x1c/0x70
[ 44.212045] ? kasan_save_stack+0x32/0x50
[ 44.212045] ? kasan_save_stack+0x22/0x50
[ 44.212045] ? kasan_set_track+0x25/0x30
[ 44.212045] ? __kasan_slab_alloc+0x59/0x70
[ 44.212045] ? kmem_cache_alloc_node+0xf7/0x260
[ 44.212045] ? kmalloc_reserve+0xab/0x120
[ 44.212045] ? __alloc_skb+0xcf/0x210
[ 44.212045] ? netlink_sendmsg+0x509/0x700
[ 44.212045] ? sock_sendmsg+0xde/0xe0
[ 44.212045] ? __sys_sendto+0x18d/0x230
[ 44.212045] ? __x64_sys_sendto+0x71/0x90
[ 44.212045] ? do_syscall_64+0x3f/0x90
[ 44.212045] ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 44.212045] ? netlink_sendmsg+0x509/0x700
[ 44.212045] ? sock_sendmsg+0xde/0xe0
[ 44.212045] ? __sys_sendto+0x18d/0x230
[ 44.212045] ? __x64_sys_sendto+0x71/0x90
[ 44.212045] ? do_syscall_64+0x3f/0x90
[ 44.212045] ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 44.212045] ? kasan_save_stack+0x22/0x50
[ 44.212045] ? kasan_set_track+0x25/0x30
[ 44.212045] ? kasan_save_free_info+0x2e/0x50
[ 44.212045] ? __kasan_slab_free+0x10a/0x190
[ 44.212045] ? kmem_cache_free+0x9c/0x340
[ 44.212045] ? netlink_recvmsg+0x23c/0x660
[ 44.212045] ? sock_recvmsg+0xeb/0xf0
[ 44.212045] ? __sys_recvfrom+0x13c/0x1f0
[ 44.212045] ? __x64_sys_recvfrom+0x71/0x90
[ 44.212045] ? do_syscall_64+0x3f/0x90
[ 44.212045] ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 44.212045] ? copyout+0x3e/0x50
[ 44.212045] netlink_rcv_skb+0xd6/0x210
[ 44.212045] ? __pfx_xfrm_user_rcv_msg+0x10/0x10
[ 44.212045] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 44.212045] ? __pfx_sock_has_perm+0x10/0x10
[ 44.212045] ? mutex_lock+0x8d/0xe0
[ 44.212045] ? __pfx_mutex_lock+0x10/0x10
[ 44.212045] xfrm_netlink_rcv+0x44/0x50
[ 44.212045] netlink_unicast+0x36f/0x4c0
[ 44.212045] ? __pfx_netlink_unicast+0x10/0x10
[ 44.212045] ? netlink_recvmsg+0x500/0x660
[ 44.212045] netlink_sendmsg+0x3b7/0x700
[ 44.212045] ? __pfx_netlink_sendmsg+0x10/0x10
[ 44.212045] ? __pfx_netlink_sendmsg+0x10/0x10
[ 44.212045] sock_sendmsg+0xde/0xe0
[ 44.212045] __sys_sendto+0x18d/0x230
[ 44.212045] ? __pfx___sys_sendto+0x10/0x10
[ 44.212045] ? rcu_core+0x44a/0xe10
[ 44.212045] ? __rseq_handle_notify_resume+0x45b/0x740
[ 44.212045] ? _raw_spin_lock_irq+0x81/0xe0
[ 44.212045] ? __pfx___rseq_handle_notify_resume+0x10/0x10
[ 44.212045] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[ 44.212045] ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[ 44.212045] ? __pfx_task_work_run+0x10/0x10
[ 44.212045] __x64_sys_sendto+0x71/0x90
[ 44.212045] do_syscall_64+0x3f/0x90
[ 44.212045] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 44.212045] RIP: 0033:0x44b7da
[ 44.212045] RSP: 002b:00007ffdc8838548 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 44.212045] RAX: ffffffffffffffda RBX: 00007ffdc8839978 RCX: 000000000044b7da
[ 44.212045] RDX: 0000000000000038 RSI: 00007ffdc8838770 RDI: 0000000000000003
[ 44.212045] RBP: 00007ffdc88385b0 R08: 00007ffdc883858c R09: 000000000000000c
[ 44.212045] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[ 44.212045] R13: 00007ffdc8839968 R14: 00000000004c37d0 R15: 0000000000000001
[ 44.212045] </TASK>
[ 44.212045]
[ 44.212045] Allocated by task 97:
[ 44.212045] kasan_save_stack+0x22/0x50
[ 44.212045] kasan_set_track+0x25/0x30
[ 44.212045] __kasan_kmalloc+0x7f/0x90
[ 44.212045] __kmalloc_node_track_caller+0x5b/0x140
[ 44.212045] kmemdup+0x21/0x50
[ 44.212045] xfrm_dump_sa+0x17d/0x290
[ 44.212045] netlink_dump+0x322/0x6c0
[ 44.212045] __netlink_dump_start+0x353/0x430
[ 44.212045] xfrm_user_rcv_msg+0x3a4/0x410
[ 44.212045] netlink_rcv_skb+0xd6/0x210
[ 44.212045] xfrm_netlink_rcv+0x44/0x50
[ 44.212045] netlink_unicast+0x36f/0x4c0
[ 44.212045] netlink_sendmsg+0x3b7/0x700
[ 44.212045] sock_sendmsg+0xde/0xe0
[ 44.212045] __sys_sendto+0x18d/0x230
[ 44.212045] __x64_sys_sendto+0x71/0x90
[ 44.212045] do_syscall_64+0x3f/0x90
[ 44.212045] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 44.212045]
[ 44.212045] The buggy address belongs to the object at ffff88800870f300
[ 44.212045] which belongs to the cache kmalloc-64 of size 64
[ 44.212045] The buggy address is located 32 bytes inside of
[ 44.212045] allocated 36-byte region [ffff88800870f300, ffff88800870f324)
[ 44.212045]
[ 44.212045] The buggy address belongs to the physical page:
[ 44.212045] page:00000000e4de16ee refcount:1 mapcount:0 mapping:000000000 ...
[ 44.212045] flags: 0x100000000000200(slab|node=0|zone=1)
[ 44.212045] page_type: 0xffffffff()
[ 44.212045] raw: 0100000000000200 ffff888004c41640 dead000000000122 0000000000000000
[ 44.212045] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[ 44.212045] page dumped because: kasan: bad access detected
[ 44.212045]
[ 44.212045] Memory state around the buggy address:
[ 44.212045] ffff88800870f200: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 44.212045] ffff88800870f280: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
[ 44.212045] >ffff88800870f300: 00 00 00 00 04 fc fc fc fc fc fc fc fc fc fc fc
[ 44.212045] ^
[ 44.212045] ffff88800870f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 44.212045] ffff88800870f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 44.212045] ==================================================================
By investigating the code, we find the root cause of this OOB is the lack
of checks in xfrm_dump_sa(). The buggy code allows a malicious user to pass
arbitrary values of filter->splen/dplen. Hence, with crafted xfrm states,
the attacker can achieve an 8-byte heap OOB read, which causes an info leak.
if (attrs[XFRMA_ADDRESS_FILTER]) {
filter = kmemdup(nla_data(attrs[XFRMA_ADDRESS_FILTER]),
sizeof(*filter), GFP_KERNEL);
if (filter == NULL)
return -ENOMEM;
// NO MORE CHECKS HERE !!!
}
This patch fixes the OOB by adding necessary boundary checks, just like
the code in pfkey_dump() function.
Fixes: d3623099d350 ("ipsec: add support of limited SA dump")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
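
The kind of boundary check the message refers to can be sketched as follows. struct addr_filter is a hypothetical, simplified layout (two 16-byte addresses plus prefix lengths in bits), and filter_validate() is an illustrative name; the real structure and error handling are in the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified address filter: 16-byte addresses plus
 * source and destination prefix lengths expressed in bits. */
struct addr_filter {
    uint8_t saddr[16];
    uint8_t daddr[16];
    uint8_t splen;
    uint8_t dplen;
};

/* Return 0 if both prefix lengths fit inside the address storage, -1 otherwise. */
int filter_validate(const struct addr_filter *f)
{
    const unsigned int max_bits = sizeof(f->saddr) * 8;   /* 128 bits */

    assert(f != NULL);
    if (f->splen > max_bits || f->dplen > max_bits)
        return -1;
    return 0;
}
```

A caller would run this right after duplicating the user-supplied filter and free the copy when validation fails, so that a later prefix comparison never reads past the end of the address fields.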
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2023-05-16
1) Don't check the policy default if we have an allow
policy. Fix from Sabrina Dubroca.
2) Fix netdevice refcount usage on offload.
From Leon Romanovsky.
3) Use netdev_put instead of dev_put to correctly release
the netdev on failure in xfrm_dev_policy_add.
From Leon Romanovsky.
4) Revert "Fix XFRM-I support for nested ESP tunnels"
This broke Netfilter policy matching.
From Martin Willi.
5) Reject optional tunnel/BEET mode templates in outbound policies
on netlink and pfkey sockets. From Tobias Brunner.
6) Check if_id in inbound policy/secpath match to make
it symmetric to the outbound codepath.
From Benedict Wong.
* tag 'ipsec-2023-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: Check if_id in inbound policy/secpath match
af_key: Reject optional tunnel/BEET mode templates in outbound policies
xfrm: Reject optional tunnel/BEET mode templates in outbound policies
Revert "Fix XFRM-I support for nested ESP tunnels"
xfrm: Fix leak of dev tracker
xfrm: release all offloaded policy memory
xfrm: don't check the default policy if the policy allows the packet
====================
Link: https://lore.kernel.org/r/20230516052405.2677554-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
xfrm_state_find() uses `encap_family` of the current template with
the passed local and remote addresses to find a matching state.
If an optional tunnel or BEET mode template is skipped in a mixed-family
scenario, there could be a mismatch causing an out-of-bounds read as
the addresses were not replaced to match the family of the next template.
While there are theoretical use cases for optional templates in outbound
policies, the only practical one is to skip IPComp states in inbound
policies if uncompressed packets are received that are handled by an
implicitly created IPIP state instead.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Failure to add an offloaded policy causes the following
error when the user later tries to reload the driver.
unregister_netdevice: waiting for eth3 to become free. Usage count = 2
This was caused by xfrm_dev_policy_add() which increments reference
to net_device. That reference was supposed to be decremented
in xfrm_dev_policy_free(). However the latter wasn't called.
unregister_netdevice: waiting for eth3 to become free. Usage count = 2
leaked reference.
xfrm_dev_policy_add+0xff/0x3d0
xfrm_policy_construct+0x352/0x420
xfrm_add_policy+0x179/0x320
xfrm_user_rcv_msg+0x1d2/0x3d0
netlink_rcv_skb+0xe0/0x210
xfrm_netlink_rcv+0x45/0x50
netlink_unicast+0x346/0x490
netlink_sendmsg+0x3b0/0x6c0
sock_sendmsg+0x73/0xc0
sock_write_iter+0x13b/0x1f0
vfs_write+0x528/0x5d0
ksys_write+0x120/0x150
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: 919e43fad516 ("xfrm: add an interface to offload policy")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
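
The leak pattern, a reference taken early and never dropped on a later error path, can be modelled in a few lines. struct netdev, struct offload, and the three functions are hypothetical stand-ins with a plain integer counter; they only mirror the roles of the add and free helpers named above.

```c
#include <assert.h>
#include <stddef.h>

struct netdev  { int refcnt; };           /* hypothetical stand-in for net_device */
struct offload { struct netdev *dev; };   /* hypothetical per-policy offload data */

/* Take a device reference on behalf of the policy. */
void offload_add(struct offload *o, struct netdev *dev)
{
    assert(o != NULL && dev != NULL);
    dev->refcnt++;
    o->dev = dev;
}

/* Drop the reference taken by offload_add(); safe to call when none is held. */
void offload_free(struct offload *o)
{
    assert(o != NULL);
    if (o->dev != NULL) {
        o->dev->refcnt--;
        o->dev = NULL;
    }
}

/* Build a policy; later_step_fails simulates an error after the device was attached. */
int policy_construct(struct offload *o, struct netdev *dev, int later_step_fails)
{
    offload_add(o, dev);
    if (later_step_fails) {
        offload_free(o);   /* the error path must undo the reference */
        return -1;
    }
    return 0;
}
```

If the offload_free() call in the error path is omitted, the counter stays elevated after a failed construction, which corresponds to the "Usage count = 2" symptom in the log above.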
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
Extend packet offload to fully support libreswan
The following patches are an outcome of Raed's work to add packet
offload support to libreswan [1].
The series includes:
* Priority support to IPsec policies
* Statistics per-SA (visible through "ip -s xfrm state ..." command)
* Support to IKE policy holes
* Fine tuning to acquire logic.
[1] https://github.com/libreswan/libreswan/pull/986
Link: https://lore.kernel.org/all/cover.1678714336.git.leon@kernel.org
* tag 'ipsec-libreswan-mlx5' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5e: Update IPsec per SA packets/bytes count
net/mlx5e: Use one rule to count all IPsec Tx offloaded traffic
net/mlx5e: Support IPsec acquire default SA
net/mlx5e: Allow policies with reqid 0, to support IKE policy holes
xfrm: copy_to_user_state fetch offloaded SA packets/bytes statistics
xfrm: add new device offload acquire flag
net/mlx5e: Use chains for IPsec policy priority offload
net/mlx5: fs_core: Allow ignore_flow_level on TX dest
net/mlx5: fs_chains: Refactor to detach chains from tc usage
====================
Link: https://lore.kernel.org/r/20230320094722.1009304-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both in RX and TX, the traffic that performs IPsec packet offload
transformation is accounted by HW only. Consequently, the HW should
be queried for packets/bytes statistics when user asks for such
transformation data.
Signed-off-by: Raed Salem <raeds@nvidia.com>
Link: https://lore.kernel.org/r/d90ec74186452b1509ee94875d942cb777b7181e.1678714336.git.leon@kernel.org
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When copying data to user-space we should ensure that only valid
data is copied over. Padding in structures may be filled with
random (possibly sensitive) data and should never be given directly
to user-space.
This patch fixes the copying of xfrm algorithms and the encap
template in xfrm_user so that padding is zeroed.
Reported-by: syzbot+fa5414772d5c445dac3c@syzkaller.appspotmail.com
Reported-by: Hyunwoo Kim <v4bel@theori.io>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
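
A minimal sketch of the zero-before-fill idiom for a structure that is handed to another domain. struct encap_info and encap_export() are hypothetical; on common ABIs the struct has three padding bytes after 'proto', which is the kind of gap that could otherwise carry stale data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; on common ABIs 3 padding bytes follow 'proto'. */
struct encap_info {
    uint8_t  proto;
    uint32_t port;
};

/* Fill *out for export. The whole object is cleared first so that no
 * stale bytes from its previous contents remain in the padding. */
void encap_export(struct encap_info *out, uint8_t proto, uint32_t port)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));   /* clears padding as well as fields */
    out->proto = proto;
    out->port = port;
}
```

Assigning the fields one by one without the memset() would leave the padding bytes holding whatever was previously in that memory.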
|
|
Extend netlink interface to add and delete XFRM policy from the device.
This functionality is a first step to implement packet IPsec offload solution.
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In the next patches, the xfrm core code will be extended to support
a new type of offload: packet offload. In that mode, both the policy and
the state must be specially configured so that the whole data path is
offloaded.
Full offload takes care of encryption, decryption, encapsulation and
other operations with headers.
As this mode is new for XFRM policy flow, we can "start fresh" with flag
bits and release first and second bit for future use.
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
xfrm_user_rcv_msg() already handles extack, we just need to pass it down.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The functions xfrm_register_km() and xfrm_unregister_km() always return 0,
so change their return type to void.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
XFRM state doesn't need anything from flags except to understand
direction, so store it separately. For future patches, such change
will allow us to reuse xfrm_dev_offload for policy offload too, which
has three possible directions instead of two.
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The struct xfrm_state_offload has all fields needed to hold information
for offloaded policies too. To avoid creating a new struct with the
same fields, let's rename the existing one and reuse it later.
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2022-03-19
1) Delete duplicated functions that call the same xfrm_api_check.
From Leon Romanovsky.
2) Align userland API of the default policy structure to the
internal structures. From Nicolas Dichtel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a follow up of commit f8d858e607b2 ("xfrm: make user policy API
complete"). The goal is to align userland API to the internal structures.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This reverts commit 68ac0f3810e76a853b5f7b90601a05c3048b8b54 because ID
0 was meant to be used for configuring the policy/state without
matching for a specific interface (e.g., Cilium is affected, see
https://github.com/cilium/cilium/pull/18789 and
https://github.com/cilium/cilium/pull/19019).
Signed-off-by: Kai Lueke <kailueke@linux.microsoft.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch enables distinguishing SAs and SPs based on if_id during
the xfrm_migrate flow. This ensures support for xfrm interfaces
throughout the SA/SP lifecycle.
When there are multiple existing SPs with the same direction,
the same xfrm_selector and different endpoint addresses,
xfrm_migrate might fail with ENODATA.
Specifically, the code path for performing xfrm_migrate is:
Stage 1: find policy to migrate with
xfrm_migrate_policy_find(sel, dir, type, net)
Stage 2: find and update state(s) with
xfrm_migrate_state_find(mp, net)
Stage 3: update endpoint address(es) of template(s) with
xfrm_policy_migrate(pol, m, num_migrate)
Currently "Stage 1" always returns the first xfrm_policy that
matches, and "Stage 3" looks for the xfrm_tmpl that matches the
old endpoint address. Thus if there are multiple xfrm_policy
with same selector, direction, type and net, "Stage 1" might
return a wrong xfrm_policy and "Stage 3" will fail with ENODATA
because it cannot find a xfrm_tmpl with the matching endpoint
address.
The fix is to allow userspace to pass an if_id and add if_id
to the matching rule in Stage 1 and Stage 2 since if_id is a
unique ID for xfrm_policy and xfrm_state. For compatibility,
if_id will only be checked if the attribute is set.
Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/1668886
Signed-off-by: Yan Yan <evitayan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
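
The Stage 1 matching rule with an optional if_id can be sketched as a linear lookup. struct mig_policy, the single-integer selector, and policy_find() are hypothetical simplifications; the has_if_id flag models "the attribute is set" from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy: one integer stands in for the selector. */
struct mig_policy { uint32_t selector; int dir; uint32_t if_id; };

/* Return the first policy matching selector and direction. The if_id is
 * compared only when has_if_id is true, so callers that do not pass it
 * keep the old first-match behaviour. Returns NULL if nothing matches. */
const struct mig_policy *policy_find(const struct mig_policy *tbl, size_t n,
                                     uint32_t selector, int dir,
                                     bool has_if_id, uint32_t if_id)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tbl[i].selector != selector || tbl[i].dir != dir)
            continue;
        if (has_if_id && tbl[i].if_id != if_id)
            continue;
        return &tbl[i];
    }
    return NULL;
}
```

With two policies that share selector and direction but differ in if_id, a lookup without if_id returns the first entry, while a lookup with if_id can select the second, which is what avoids the later ENODATA failure.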
|
|
Merge in fixes directly in prep for the 5.17 merge window.
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2022-01-06
1) Fix some clang_analyzer warnings about never read variables.
From luo penghao.
2) Check for pols[0] only once in xfrm_expand_policies().
From Jean Sacren.
3) The SA curlft.use_time was updated only on SA creation time.
Update whenever the SA is used. From Antony Antony.
4) Add support for SM3 secure hash.
From Xu Jia.
5) Add support for SM4 symmetric cipher algorithm.
From Xu Jia.
6) Add a rate limit for SA mapping change messages.
From Antony Antony.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|