summaryrefslogtreecommitdiff
path: root/net/tls
AgeCommit message (Collapse)Author
2022-04-13tls: rx: move counting TlsDecryptErrors for syncJakub Kicinski
Move counting TlsDecryptErrors to tls_do_decryption() where differences between sync and async crypto are reconciled. No functional changes, this code just always gave me a pause. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13tls: rx: reuse leave_on_list label for psockJakub Kicinski
The code is identical, we can save a few LoC. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13tls: rx: consistently use unlocked accessors for rx_listJakub Kicinski
rx_list is protected by the socket lock, no need to take the built-in spin lock on accesses. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-12net: remove noblock parameter from recvmsg() entitiesOliver Hartkopp
The internal recvmsg() functions have two parameters 'flags' and 'noblock' that were merged inside skb_recv_datagram(). As a follow up patch to commit f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()") this patch removes the separate 'noblock' parameter for recvmsg(). Analogue to the referenced patch for skb_recv_datagram() the 'flags' and 'noblock' parameters are unnecessarily split up with e.g. err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); or in err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); instead of simply using only flags all the time and check for MSG_DONTWAIT where needed (to preserve for the formerly separated no(n)block condition). Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-10tls: rx: jump out for cases which need to leave skb on listJakub Kicinski
The current invese logic is harder to follow (and adds extra tests to the fast path). We have to enumerate all cases which need to keep the skb before consuming it. It's simpler to jump out of the full record flow as we detect those cases. This makes it clear that partial consumption and peek can only reach end of the function thru the !zc case so move the code up there. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: clear ctx->recv_pkt earlierJakub Kicinski
Whatever we do in the loop the skb should not remain on as ctx->recv_pkt afterwards. We can clear that pointer and restart strparser earlier. This adds overhead of extra linking and unlinking to rx_list but that's not large (upcoming change will switch to unlocked skb list operations). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: inline consuming the skb at the end of the loopJakub Kicinski
tls_sw_advance_skb() always consumes the skb at the end of the loop. To fall here the following must be true: !async && !is_peek && !retain_skb retain_skb => !zc && rxm->full_len > len # but non-full record implies !zc, so above can be simplified as retain_skb => rxm->full_len > len !async && !is_peek && !(rxm->full_len > len) !async && !is_peek && rxm->full_len <= len tls_sw_advance_skb() returns false if len < rxm->full_len which can't be true given conditions above. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: pull most of zc check out of the loopJakub Kicinski
Most of the conditions deciding if zero-copy can be used do not change throughout the iterations, so pre-calculate them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: don't track the async countJakub Kicinski
We track both if the last record was handled by async crypto and how many records were async. This is not necessary. We implicitly assume once crypto goes async it will stay that way, otherwise we'd reorder records. So just track if we're in async mode, the exact number of records is not necessary. This change also forces us into "async" mode more consistently in case crypto ever decided to interleave async and sync. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: don't handle async in tls_sw_advance_skb()Jakub Kicinski
tls_sw_advance_skb() caters to the async case when skb argument is NULL. In that case it simply unpauses the strparser. These are surprising semantics to a person reading the code, and result in higher LoC, so inline the __strp_unpause and only call tls_sw_advance_skb() when we actually move past an skb. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: factor out writing ContentType to cmsgJakub Kicinski
cmsg can be filled in during rx_list processing or normal receive. Consolidate the code. We don't need to keep the boolean to track if the cmsg was created. 0 is an invalid content type. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: simplify async waitJakub Kicinski
Since we are protected from async completions by decrypt_compl_lock we can drop the async_notify and reinit the completion before we start waiting. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: wrap decryption arguments in a structureJakub Kicinski
We pass zc as a pointer to bool a few functions down as an in/out argument. This is error prone since C will happily evalue a pointer as a boolean (IOW forgetting *zc and writing zc leads to loss of developer time..). Wrap the arguments into a structure. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: don't report text length from the bowels of decryptJakub Kicinski
We plumb pointer to chunk all the way to the decryption method. It's set to the length of the text when decrypt_skb_update() returns. I think the code is written this way because original TLS implementation passed &chunk to zerocopy_from_iter() and this was carried forward as the code gotten more complex, without any refactoring. The fix for peek() introduced a new variable - to_decrypt which for all practical purposes is what chunk is going to get set to. Spare ourselves the pointer passing, use to_decrypt. Use this opportunity to clean things up a little further. Note that chunk / to_decrypt was mostly needed for the async path, since the sync path would access rxm->full_len (decryption transforms full_len from record size to text size). Use the right source of truth more explicitly. We have three cases: - async - it's TLS 1.2 only, so chunk == to_decrypt, but we need the min() because to_decrypt is a whole record and we don't want to underflow len. Note that we can't handle partial record by falling back to sync as it would introduce reordering against records in flight. - zc - again, TLS 1.2 only for now, so chunk == to_decrypt, we don't do zc if len < to_decrypt, no need to check again. - normal - it already handles chunk > len, we can factor out the assignment to rxm->full_len and share it with zc. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-10tls: rx: drop unnecessary arguments from tls_setup_from_iter()Jakub Kicinski
sk is unused, remove it to make it clear the function doesn't poke at the socket. size_used is always 0 on input and @length on success. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: hw: rx: use return value of tls_device_decrypted() to carry statusJakub Kicinski
Instead of tls_device poking into internals of the message return 1 from tls_device_decrypted() if the device handled the decryption. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: refactor decrypt_skb_update()Jakub Kicinski
Use early return and a jump label to remove two indentation levels. No functional changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: don't issue wake ups when data is decryptedJakub Kicinski
We inform the applications that data is available when the record is received. Decryption happens inline inside recvmsg or splice call. Generating another wakeup inside the decryption handler seems pointless as someone must be actively reading the socket if we are executing this code. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: replace 'back' with 'offset'Jakub Kicinski
The padding length TLS 1.3 logic is searching for content_type from the end of text. IMHO the code is easier to parse if we calculate offset and decrement it rather than try to maintain positive offset from the end of the record called "back". Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: use a define for tag lengthJakub Kicinski
TLS 1.3 has to strip padding, and it starts out 16 bytes from the end of the record. Make it clear this is because of the auth tag. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: init decrypted status in tls_read_size()Jakub Kicinski
We set the record type in tls_read_size(), can as well init the tlm->decrypted field there. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: don't store the decryption status in socket contextJakub Kicinski
Similar justification to previous change, the information about decryption status belongs in the skb. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: don't store the record type in socket contextJakub Kicinski
Original TLS implementation was handling one record at a time. It stashed the type of the record inside tls context (per socket structure) for convenience. When async crypto support was added [1] the author had to use skb->cb to store the type per-message. The use of skb->cb overlaps with strparser, however, so a hybrid approach was taken where type is stored in context while parsing (since we parse a message at a time) but once parsed its copied to skb->cb. Recently a workaround for sockmaps [2] exposed the previously private struct _strp_msg and started a trend of adding user fields directly in strparser's header. This is cleaner than storing information about an skb in the context. This change is not strictly necessary, but IMHO the ownership of the context field is confusing. Information naturally belongs to the skb. [1] commit 94524d8fc965 ("net/tls: Add support for async decryption of tls records") [2] commit b2c4618162ec ("bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: drop pointless else after gotoJakub Kicinski
Pointless else branch after goto makes the code harder to refactor down the line. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08tls: rx: jump to a more appropriate labelJakub Kicinski
'recv_end:' checks num_async and decrypted, and is then followed by the 'end' label. Since we know that decrypted and num_async are 0 at the start we can jump to 'end'. Move the init of decrypted and num_async to let the compiler catch if I'm wrong. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01net/tls: fix slab-out-of-bounds bug in decrypt_internalZiyang Xuan
The memory size of tls_ctx->rx.iv for AES128-CCM is 12 setting in tls_set_sw_offload(). The return value of crypto_aead_ivsize() for "ccm(aes)" is 16. So memcpy() require 16 bytes from 12 bytes memory space will trigger slab-out-of-bounds bug as following: ================================================================== BUG: KASAN: slab-out-of-bounds in decrypt_internal+0x385/0xc40 [tls] Read of size 16 at addr ffff888114e84e60 by task tls/10911 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x5e/0x5db ? decrypt_internal+0x385/0xc40 [tls] kasan_report+0xab/0x120 ? decrypt_internal+0x385/0xc40 [tls] kasan_check_range+0xf9/0x1e0 memcpy+0x20/0x60 decrypt_internal+0x385/0xc40 [tls] ? tls_get_rec+0x2e0/0x2e0 [tls] ? process_rx_list+0x1a5/0x420 [tls] ? tls_setup_from_iter.constprop.0+0x2e0/0x2e0 [tls] decrypt_skb_update+0x9d/0x400 [tls] tls_sw_recvmsg+0x3c8/0xb50 [tls] Allocated by task 10911: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 tls_set_sw_offload+0x2eb/0xa20 [tls] tls_setsockopt+0x68c/0x700 [tls] __sys_setsockopt+0xfe/0x1b0 Replace the crypto_aead_ivsize() with prot->iv_size + prot->salt_size when memcpy() iv value in TLS_1_3_VERSION scenario. Fixes: f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-21net/tls: optimize judgement processes in tls_set_device_offload()Ziyang Xuan
It is known that priority setting HW offload when set tls TX/RX offload by setsockopt(). Check netdevice whether support NETIF_F_HW_TLS_TX or not at the later stages in the whole tls_set_device_offload() process, some memory allocations have been done before that. We must release those memory and return error if we judge the netdevice not support NETIF_F_HW_TLS_TX. It is redundant. Move NETIF_F_HW_TLS_TX judgement forward, and move start_marker_record and offload_ctx memory allocation back slightly. Thus, we can get simpler exception handling process. Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-21net/tls: remove unnecessary jump instructions in do_tls_setsockopt_conf()Ziyang Xuan
Avoid using "goto" jump instruction unconditionally when we can return directly. Remove unnecessary jump instructions in do_tls_setsockopt_conf(). Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04tls: cap the output scatter list to something reasonableJakub Kicinski
TLS recvmsg() passes user pages as destination for decrypt. The decrypt operation is repeated record by record, each record being 16kB, max. TLS allocates an sg_table and uses iov_iter_get_pages() to populate it with enough pages to fit the decrypted record. Even though we decrypt a single message at a time we size the sg_table based on the entire length of the iovec. This leads to unnecessarily large allocations, risking triggering OOM conditions. Use iov_iter_truncate() / iov_iter_reexpand() to construct a "capped" version of iov_iter_npages(). Alternatively we could parametrize iov_iter_npages() to take the size as arg instead of using i->count, or do something else.. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-17net/tls: Fix another skb memory leak when running kTLS trafficGal Pressman
This patch is a followup to commit ffef737fd037 ("net/tls: Fix skb memory leak when running kTLS traffic") Which was missing another sk_defer_free_flush() call in tls_sw_splice_read(). Fixes: f35f821935d8 ("tcp: defer skb freeing after socket lock is released") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07net/tls: Fix skb memory leak when running kTLS trafficGal Pressman
The cited Fixes commit introduced a memory leak when running kTLS traffic (with/without hardware offloads). I'm running nginx on the server side and wrk on the client side and get the following: unreferenced object 0xffff8881935e9b80 (size 224): comm "softirq", pid 0, jiffies 4294903611 (age 43.204s) hex dump (first 32 bytes): 80 9b d0 36 81 88 ff ff 00 00 00 00 00 00 00 00 ...6............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000efe2a999>] build_skb+0x1f/0x170 [<00000000ef521785>] mlx5e_skb_from_cqe_mpwrq_linear+0x2bc/0x610 [mlx5_core] [<00000000945d0ffe>] mlx5e_handle_rx_cqe_mpwrq+0x264/0x9e0 [mlx5_core] [<00000000cb675b06>] mlx5e_poll_rx_cq+0x3ad/0x17a0 [mlx5_core] [<0000000018aac6a9>] mlx5e_napi_poll+0x28c/0x1b60 [mlx5_core] [<000000001f3369d1>] __napi_poll+0x9f/0x560 [<00000000cfa11f72>] net_rx_action+0x357/0xa60 [<000000008653b8d7>] __do_softirq+0x282/0x94e [<00000000644923c6>] __irq_exit_rcu+0x11f/0x170 [<00000000d4085f8f>] irq_exit_rcu+0xa/0x20 [<00000000d412fef4>] common_interrupt+0x7d/0xa0 [<00000000bfb0cebc>] asm_common_interrupt+0x1e/0x40 [<00000000d80d0890>] default_idle+0x53/0x70 [<00000000f2b9780e>] default_idle_call+0x8c/0xd0 [<00000000c7659e15>] do_idle+0x394/0x450 I'm not familiar with these areas of the code, but I've added this sk_defer_free_flush() to tls_sw_recvmsg() based on a hunch and it resolved the issue. Fixes: f35f821935d8 ("tcp: defer skb freeing after socket lock is released") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220102081253.9123-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-30net/tls: simplify the tls_set_sw_offload functionTianjia Zhang
Assigning crypto_info variables in advance can simplify the logic of accessing value and move related local variables to a smaller scope. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29net/tls: Fix authentication failure in CCM modeTianjia Zhang
When the TLS cipher suite uses CCM mode, including AES CCM and SM4 CCM, the first byte of the B0 block is flags, and the real IV starts from the second byte. The XOR operation of the IV and rec_seq should be skip this byte, that is, add the iv_offset. Fixes: f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Cc: Vakul Garg <vakul.garg@nxp.com> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-25tls: fix replacing proto_opsJakub Kicinski
We replace proto_ops whenever TLS is configured for RX. But our replacement also overrides sendpage_locked, which will crash unless TX is also configured. Similarly we plug both of those in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely different implementation for TX. Last but not least we always plug in something based on inet_stream_ops even though a few of the callbacks differ for IPv6 (getname, release, bind). Use a callback building method similar to what we do for struct proto. Fixes: c46234ebb4d1 ("tls: RX path for ktls") Fixes: d4ffb02dee2f ("net/tls: enable sk_msg redirect to tls socket egress") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tls: splice_read: fix accessing pre-processed recordsJakub Kicinski
recvmsg() will put peek()ed and partially read records onto the rx_list. splice_read() needs to consult that list otherwise it may miss data. Align with recvmsg() and also put partially-read records onto rx_list. tls_sw_advance_skb() is pretty pointless now and will be removed in net-next. Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tls: splice_read: fix record type checkJakub Kicinski
We don't support splicing control records. TLS 1.3 changes moved the record type check into the decrypt if(). The skb may already be decrypted and still be an alert. Note that decrypt_skb_update() is idempotent and updates ctx->decrypted so the if() is pointless. Reorder the check for decryption errors with the content type check while touching them. This part is not really a bug, because if decryption failed in TLS 1.3 content type will be DATA, and for TLS 1.2 it will be correct. Nevertheless its strange to touch output before checking if the function has failed. Fixes: fedf201e1296 ("net: tls: Refactor control message handling on recv") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
include/net/sock.h 7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable") 4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout") drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.") Adjacent code addition in both cases, keep both. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28net/tls: Fix flipped sign in async_wait.err assignmentDaniel Jordan
sk->sk_err contains a positive number, yet async_wait.err wants the opposite. Fix the missed sign flip, which Jakub caught by inspection. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28net/tls: Fix flipped sign in tls_err_abort() callsDaniel Jordan
sk->sk_err appears to expect a positive value, a convention that ktls doesn't always follow and that leads to memory corruption in other code. For instance, [kworker] tls_encrypt_done(..., err=<negative error from crypto request>) tls_err_abort(.., err) sk->sk_err = err; [task] splice_from_pipe_feed ... tls_sw_do_sendpage if (sk->sk_err) { ret = -sk->sk_err; // ret is positive splice_from_pipe_feed (continued) ret = actor(...) // ret is still positive and interpreted as bytes // written, resulting in underflow of buf->len and // sd->len, leading to huge buf->offset and bogus // addresses computed in later calls to actor() Fix all tls_err_abort() callers to pass a negative error code consistently and centralize the error-prone sign flip there, throwing in a warning to catch future misuse and uninlining the function so it really does only warn once. Cc: stable@vger.kernel.org Fixes: c46234ebb4d1e ("tls: RX path for ktls") Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26net: Rename ->stream_memory_read to ->sock_is_readableCong Wang
The proto ops ->stream_memory_read() is currently only used by TCP to check whether psock queue is empty or not. We need to rename it before reusing it for non-TCP protocols, and adjust the exsiting users accordingly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
2021-10-25net/tls: getsockopt supports complete algorithm listTianjia Zhang
AES_CCM_128 and CHACHA20_POLY1305 are already supported by tls, similar to setsockopt, getsockopt also needs to support these two algorithms. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net/tls: support SM4 CCM algorithmTianjia Zhang
The IV of CCM mode has special requirements, this patch supports CCM mode of SM4 algorithm. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16net/tls: support SM4 GCM/CCM algorithmTianjia Zhang
The RFC8998 specification defines the use of the ShangMi algorithm cipher suites in TLS 1.3, and also supports the GCM/CCM mode using the SM4 algorithm. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-21tls: prevent oversized sendfile() hangs by ignoring MSG_MOREJakub Kicinski
We got multiple reports that multi_chunk_sendfile test case from tls selftest fails. This was sort of expected, as the original fix was never applied (see it in the first Link:). The test in question uses sendfile() with count larger than the size of the underlying file. This will make splice set MSG_MORE on all sendpage calls, meaning TLS will never close and flush the last partial record. Eric seem to have addressed a similar problem in commit 35f9c09fe9c7 ("tcp: tcp_sendpages() should call tcp_push() once") by introducing MSG_SENDPAGE_NOTLAST. Unlike MSG_MORE MSG_SENDPAGE_NOTLAST is not set on the last call of a "pipefull" of data (PIPE_DEF_BUFFERS == 16, so every 16 pages or whenever we run out of data). Having a break every 16 pages should be fine, TLS can pack exactly 4 pages into a record, so for aligned reads there should be no difference, unaligned may see one extra record per sendpage(). Sticking to TCP semantics seems preferable to modifying splice, but we can revisit it if real life scenarios show a regression. Reported-by: Vadim Fedorenko <vfedorenko@novek.ru> Reported-by: Seth Forshee <seth.forshee@canonical.com> Link: https://lore.kernel.org/netdev/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07skbuff: add a parameter to __skb_frag_unrefMatteo Croce
This is a prerequisite patch, the next one is enabling recycling of skbs and fragments. Add an extra argument on __skb_frag_unref() to handle recycling, and update the current users of the function with that. Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01net/tls: Fix use-after-free after the TLS device goes down and upMaxim Mikityanskiy
When a netdev with active TLS offload goes down, tls_device_down is called to stop the offload and tear down the TLS context. However, the socket stays alive, and it still points to the TLS context, which is now deallocated. If a netdev goes up, while the connection is still active, and the data flow resumes after a number of TCP retransmissions, it will lead to a use-after-free of the TLS context. This commit addresses this bug by keeping the context alive until its normal destruction, and implements the necessary fallbacks, so that the connection can resume in software (non-offloaded) kTLS mode. On the TX side tls_sw_fallback is used to encrypt all packets. The RX side already has all the necessary fallbacks, because receiving non-decrypted packets is supported. The thing needed on the RX side is to block resync requests, which are normally produced after receiving non-decrypted packets. The necessary synchronization is implemented for a graceful teardown: first the fallbacks are deployed, then the driver resources are released (it used to be possible to have a tls_dev_resync after tls_dev_del). A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback mode. It's used to skip the RX resync logic completely, as it becomes useless, and some objects may be released (for example, resync_async, which is allocated and freed by the driver). Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01net/tls: Replace TLS_RX_SYNC_RUNNING with RCUMaxim Mikityanskiy
RCU synchronization is guaranteed to finish in finite time, unlike a busy loop that polls a flag. This patch is a preparation for the bugfix in the next patch, where the same synchronize_net() call will also be used to sync with the TX datapath. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>