summaryrefslogtreecommitdiff
path: root/net/xfrm
AgeCommit message (Collapse)Author
2020-03-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-03-28 1) Use kmem_cache_zalloc() instead of kmem_cache_alloc() in xfrm_state_alloc(). From Huang Zijiang. 2) esp_output_fill_trailer() is the same in IPv4 and IPv6, so share this function to avoide code duplcation. From Raed Salem. 3) Add offload support for esp beet mode. From Xin Long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: Fix typo of SKB_SGO_CB_OFFSETCambda Zhu
The SKB_SGO_CB_OFFSET should be SKB_GSO_CB_OFFSET which means the offset of the GSO in skb cb. This patch fixes the typo. Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation") Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-03-27 1) Handle NETDEV_UNREGISTER for xfrm device to handle asynchronous unregister events cleanly. From Raed Salem. 2) Fix vti6 tunnel inter address family TX through bpf_redirect(). From Nicolas Dichtel. 3) Fix lenght check in verify_sec_ctx_len() to avoid a slab-out-of-bounds. From Xin Long. 4) Add a missing verify_sec_ctx_len check in xfrm_add_acquire to avoid a possible out-of-bounds to access. From Xin Long. 5) Use built-in RCU list checking of hlist_for_each_entry_rcu to silence false lockdep warning in __xfrm6_tunnel_spi_lookup when CONFIG_PROVE_RCU_LIST is enabled. From Madhuparna Bhowmik. 6) Fix a panic on esp offload when crypto is done asynchronously. From Xin Long. 7) Fix a skb memory leak in an error path of vti6_rcv. From Torsten Hilbrich. 8) Fix a race that can lead to a doulbe free in xfrm_policy_timer. From Xin Long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26xfrm: add prep for esp beet mode offloadXin Long
Like __xfrm_transport/mode_tunnel_prep(), this patch is to add __xfrm_mode_beet_prep() to fix the transport_header for gso segments, and reset skb mac_len, and pull skb data to the proto inside esp. This patch also fixes a panic, reported by ltp: # modprobe esp4_offload # runltp -f net_stress.ipsec_tcp [ 2452.780511] kernel BUG at net/core/skbuff.c:109! [ 2452.799851] Call Trace: [ 2452.800298] <IRQ> [ 2452.800705] skb_push.cold.98+0x14/0x20 [ 2452.801396] esp_xmit+0x17b/0x270 [esp4_offload] [ 2452.802799] validate_xmit_xfrm+0x22f/0x2e0 [ 2452.804285] __dev_queue_xmit+0x589/0x910 [ 2452.806264] __neigh_update+0x3d7/0xa50 [ 2452.806958] arp_process+0x259/0x810 [ 2452.807589] arp_rcv+0x18a/0x1c It was caused by the skb going to esp_xmit with a wrong transport header. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-03-24xfrm: policy: Fix doulbe free in xfrm_policy_timerYueHaibing
After xfrm_add_policy add a policy, its ref is 2, then xfrm_policy_timer read_lock xp->walk.dead is 0 .... mod_timer() xfrm_policy_kill policy->walk.dead = 1 .... del_timer(&policy->timer) xfrm_pol_put //ref is 1 xfrm_pol_put //ref is 0 xfrm_policy_destroy call_rcu xfrm_pol_hold //ref is 1 read_unlock xfrm_pol_put //ref is 0 xfrm_policy_destroy call_rcu xfrm_policy_destroy is called twice, which may leads to double free. Call Trace: RIP: 0010:refcount_warn_saturate+0x161/0x210 ... xfrm_policy_timer+0x522/0x600 call_timer_fn+0x1b3/0x5e0 ? __xfrm_decode_session+0x2990/0x2990 ? msleep+0xb0/0xb0 ? _raw_spin_unlock_irq+0x24/0x40 ? __xfrm_decode_session+0x2990/0x2990 ? __xfrm_decode_session+0x2990/0x2990 run_timer_softirq+0x5c5/0x10e0 Fix this by use write_lock_bh in xfrm_policy_kill. Fixes: ea2dea9dacc2 ("xfrm: remove policy lock when accessing policy->walk.dead") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-03-23Remove DST_HOSTDavid Laight
Previous changes to the IP routing code have removed all the tests for the DS_HOST route flag. Remove the flags and all the code that sets it. Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-04esp: remove the skb from the chain when it's enqueued in cryptd_wqXin Long
Xiumei found a panic in esp offload: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 RIP: 0010:esp_output_done+0x101/0x160 [esp4] Call Trace: ? esp_output+0x180/0x180 [esp4] cryptd_aead_crypt+0x4c/0x90 cryptd_queue_worker+0x6e/0xa0 process_one_work+0x1a7/0x3b0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 It was caused by that skb secpath is used in esp_output_done() after it's been released elsewhere. The tx path for esp offload is: __dev_queue_xmit()-> validate_xmit_skb_list()-> validate_xmit_xfrm()-> esp_xmit()-> esp_output_tail()-> aead_request_set_callback(esp_output_done) <--[1] crypto_aead_encrypt() <--[2] In [1], .callback is set, and in [2] it will trigger the worker schedule, later on a kernel thread will call .callback(esp_output_done), as the call trace shows. But in validate_xmit_xfrm(): skb_list_walk_safe(skb, skb2, nskb) { ... err = x->type_offload->xmit(x, skb2, esp_features); [esp_xmit] ... } When the err is -EINPROGRESS, which means this skb2 will be enqueued and later gets encrypted and sent out by .callback later in a kernel thread, skb2 should be removed fromt skb chain. Otherwise, it will get processed again outside validate_xmit_xfrm(), which could release skb secpath, and cause the panic above. This patch is to remove the skb from the chain when it's enqueued in cryptd_wq. While at it, remove the unnecessary 'if (!skb)' check. Fixes: 3dca3f38cfb8 ("xfrm: Separate ESP handling from segmentation for GRO packets.") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-02-28net: datagram: drop 'destructor' argument from several helpersPaolo Abeni
The only users for such argument are the UDP protocol and the UNIX socket family. We can safely reclaim the accounted memory directly from the UDP code and, after the previous patch, we can do scm stats accounting outside the datagram helpers. Overall this cleans up a bit some datagram-related helpers, and avoids an indirect call per packet in the UDP receive path. v1 -> v2: - call scm_stat_del() only when not peeking - Kirill - fix build issue with CONFIG_INET_ESPINTCP Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19xfrm: Use kmem_cache_zalloc() instead of kmem_cache_alloc() with flag GFP_ZERO.Huang Zijiang
Use kmem_cache_zalloc instead of manually setting kmem_cache_alloc with flag GFP_ZERO since kzalloc sets allocated memory to zero. Change in v2: add indation Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-02-13xfrm: interface: use icmp_ndo_send helperJason A. Donenfeld
Because xfrmi is calling icmp from network device context, it should use the ndo helper so that the rate limiting applies correctly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-12xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquireXin Long
Without doing verify_sec_ctx_len() check in xfrm_add_acquire(), it may be out-of-bounds to access uctx->ctx_str with uctx->ctx_len, as noticed by syz: BUG: KASAN: slab-out-of-bounds in selinux_xfrm_alloc_user+0x237/0x430 Read of size 768 at addr ffff8880123be9b4 by task syz-executor.1/11650 Call Trace: dump_stack+0xe8/0x16e print_address_description.cold.3+0x9/0x23b kasan_report.cold.4+0x64/0x95 memcpy+0x1f/0x50 selinux_xfrm_alloc_user+0x237/0x430 security_xfrm_policy_alloc+0x5c/0xb0 xfrm_policy_construct+0x2b1/0x650 xfrm_add_acquire+0x21d/0xa10 xfrm_user_rcv_msg+0x431/0x6f0 netlink_rcv_skb+0x15a/0x410 xfrm_netlink_rcv+0x6d/0x90 netlink_unicast+0x50e/0x6a0 netlink_sendmsg+0x8ae/0xd40 sock_sendmsg+0x133/0x170 ___sys_sendmsg+0x834/0x9a0 __sys_sendmsg+0x100/0x1e0 do_syscall_64+0xe5/0x660 entry_SYSCALL_64_after_hwframe+0x6a/0xdf So fix it by adding the missing verify_sec_ctx_len check there. Fixes: 980ebd25794f ("[IPSEC]: Sync series - acquire insert") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-02-12xfrm: fix uctx len check in verify_sec_ctx_lenXin Long
It's not sufficient to do 'uctx->len != (sizeof(struct xfrm_user_sec_ctx) + uctx->ctx_len)' check only, as uctx->len may be greater than nla_len(rt), in which case it will cause slab-out-of-bounds when accessing uctx->ctx_str later. This patch is to fix it by return -EINVAL when uctx->len > nla_len(rt). Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-02-04xfrm: handle NETDEV_UNREGISTER for xfrm deviceRaed Salem
This patch to handle the asynchronous unregister device event so the device IPsec offload resources could be cleanly released. Fixes: e4db5b61c572 ("xfrm: policy: remove pcpu policy cache") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-02-04treewide: remove redundant IS_ERR() before error code checkMasahiro Yamada
'PTR_ERR(p) == -E*' is a stronger condition than IS_ERR(p). Hence, IS_ERR(p) is unneeded. The semantic patch that generates this commit is as follows: // <smpl> @@ expression ptr; constant error_code; @@ -IS_ERR(ptr) && (PTR_ERR(ptr) == - error_code) +PTR_ERR(ptr) == - error_code // </smpl> Link: http://lkml.kernel.org/r/20200106045833.1725-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Acked-by: Stephen Boyd <sboyd@kernel.org> [drivers/clk/clk.c] Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> [GPIO] Acked-by: Wolfram Sang <wsa@the-dreams.de> [drivers/i2c] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [acpi/scan.c] Acked-by: Rob Herring <robh@kernel.org> Cc: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-01-21 1) Add support for TCP encapsulation of IKE and ESP messages, as defined by RFC 8229. Patchset from Sabrina Dubroca. Please note that there is a merge conflict in: net/unix/af_unix.c between commit: 3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo") from the net-next tree and commit: b50b0580d27b ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram") from the ipsec-next tree. The conflict can be solved as done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14net: xfrm: use skb_list_walk_safe helper for gso segmentsJason A. Donenfeld
This is converts xfrm segment iteration to use the new function, keeping the flow of the existing code as intact as possible. One case is very straight-forward, whereas the other case has some more subtle code that likes to peak at ->next and relink skbs. By keeping the variables the same as before, we can upgrade this code with minimal surgery required. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14xfrm: interface: do not confirm neighbor when do pmtu updateXu Wang
When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end, we should not call dst_confirm_neigh() as there is no two-way communication. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-01-14xfrm interface: fix packet tx through bpf_redirect()Nicolas Dichtel
With an ebpf program that redirects packets through a xfrm interface, packets are dropped because no dst is attached to skb. This could also be reproduced with an AF_PACKET socket, with the following python script (xfrm1 is a xfrm interface): import socket send_s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, 0) # scapy # p = IP(src='10.100.0.2', dst='10.200.0.1')/ICMP(type='echo-request') # raw(p) req = b'E\x00\x00\x1c\x00\x01\x00\x00@\x01e\xb2\nd\x00\x02\n\xc8\x00\x01\x08\x00\xf7\xff\x00\x00\x00\x00' send_s.sendto(req, ('xfrm1', 0x800, 0, 0)) It was also not possible to send an ip packet through an AF_PACKET socket because a LL header was expected. Let's remove those LL header constraints. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-12-09xfrm: add espintcp (RFC 8229)Sabrina Dubroca
TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented as a TCP ULP, overriding in particular the sendmsg and recvmsg operations. A Stream Parser is used to extract messages out of the TCP stream using the first 2 bytes as length marker. Received IKE messages are put on "ike_queue", waiting to be dequeued by the custom recvmsg implementation. Received ESP messages are sent to XFRM, like with UDP encapsulation. Some of this code is taken from the original submission by Herbert Xu. Currently, only IPv4 is supported, like for UDP encapsulation. Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-12-09xfrm: introduce xfrm_trans_queue_netSabrina Dubroca
This will be used by TCP encapsulation to write packets to the encap socket without holding the user socket's lock. Without this reinjection, we're already holding the lock of the user socket, and then try to lock the encap socket as well when we enqueue the encrypted packet. While at it, add a BUILD_BUG_ON like we usually do for skb->cb, since it's missing for struct xfrm_trans_cb. Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-11-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: "Another merge window, another pull full of stuff: 1) Support alternative names for network devices, from Jiri Pirko. 2) Introduce per-netns netdev notifiers, also from Jiri Pirko. 3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara Larsen. 4) Allow compiling out the TLS TOE code, from Jakub Kicinski. 5) Add several new tracepoints to the kTLS code, also from Jakub. 6) Support set channels ethtool callback in ena driver, from Sameeh Jubran. 7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED, SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long. 8) Add XDP support to mvneta driver, from Lorenzo Bianconi. 9) Lots of netfilter hw offload fixes, cleanups and enhancements, from Pablo Neira Ayuso. 10) PTP support for aquantia chips, from Egor Pomozov. 11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From Josh Hunt. 12) Add smart nagle to tipc, from Jon Maloy. 13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat Duvvuru. 14) Add a flow mask cache to OVS, from Tonghao Zhang. 15) Add XDP support to ice driver, from Maciej Fijalkowski. 16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak. 17) Support UDP GSO offload in atlantic driver, from Igor Russkikh. 18) Support it in stmmac driver too, from Jose Abreu. 19) Support TIPC encryption and auth, from Tuong Lien. 20) Introduce BPF trampolines, from Alexei Starovoitov. 21) Make page_pool API more numa friendly, from Saeed Mahameed. 22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni. 23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits) libbpf: Fix usage of u32 in userspace code mm: Implement no-MMU variant of vmalloc_user_node_flags slip: Fix use-after-free Read in slip_open net: dsa: sja1105: fix sja1105_parse_rgmii_delays() macvlan: schedule bc_work even if error enetc: add support Credit Based Shaper(CBS) for hardware offload net: phy: add helpers phy_(un)lock_mdio_bus mdio_bus: don't use managed reset-controller ax88179_178a: add ethtool_op_get_ts_info() mlxsw: spectrum_router: Fix use of uninitialized adjacency index mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels bpf: Simplify __bpf_arch_text_poke poke type handling bpf: Introduce BPF_TRACE_x helper for the tracing tests bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT bpf, testing: Add various tail call test cases bpf, x86: Emit patchable direct jump as tail call bpf: Constant map key tracking for prog array pokes bpf: Add poke dependency tracking for prog array maps bpf: Add initial poke descriptor table for jit images bpf: Move owner type, jited info into array auxiliary data ...
2019-11-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto updates from Herbert Xu: "API: - Add library interfaces of certain crypto algorithms for WireGuard - Remove the obsolete ablkcipher and blkcipher interfaces - Move add_early_randomness() out of rng_mutex Algorithms: - Add blake2b shash algorithm - Add blake2s shash algorithm - Add curve25519 kpp algorithm - Implement 4 way interleave in arm64/gcm-ce - Implement ciphertext stealing in powerpc/spe-xts - Add Eric Biggers's scalar accelerated ChaCha code for ARM - Add accelerated 32r2 code from Zinc for MIPS - Add OpenSSL/CRYPTOGRAMS poly1305 implementation for ARM and MIPS Drivers: - Fix entropy reading failures in ks-sa - Add support for sam9x60 in atmel - Add crypto accelerator for amlogic GXL - Add sun8i-ce Crypto Engine - Add sun8i-ss cryptographic offloader - Add a host of algorithms to inside-secure - Add NPCM RNG driver - add HiSilicon HPRE accelerator - Add HiSilicon TRNG driver" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (285 commits) crypto: vmx - Avoid weird build failures crypto: lib/chacha20poly1305 - use chacha20_crypt() crypto: x86/chacha - only unregister algorithms if registered crypto: chacha_generic - remove unnecessary setkey() functions crypto: amlogic - enable working on big endian kernel crypto: sun8i-ce - enable working on big endian crypto: mips/chacha - select CRYPTO_SKCIPHER, not CRYPTO_BLKCIPHER hwrng: ks-sa - Enable COMPILE_TEST crypto: essiv - remove redundant null pointer check before kfree crypto: atmel-aes - Change data type for "lastc" buffer crypto: atmel-tdes - Set the IV after {en,de}crypt crypto: sun4i-ss - fix big endian issues crypto: sun4i-ss - hide the Invalid keylen message crypto: sun4i-ss - use crypto_ahash_digestsize crypto: sun4i-ss - remove dependency on not 64BIT crypto: sun4i-ss - Fix 64-bit size_t warnings on sun4i-ss-hash.c MAINTAINERS: Add maintainer for HiSilicon SEC V2 driver crypto: hisilicon - add DebugFS for HiSilicon SEC Documentation: add DebugFS doc for HiSilicon SEC crypto: hisilicon - add SRIOV for HiSilicon SEC ...
2019-11-21net: Fix Kconfig indentation, continuedKrzysztof Kozlowski
Adjust indentation from spaces to tab (+optional two spaces) as in coding style. This fixes various indentation mixups (seven spaces, tab+one space, etc). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Lots of overlapping changes and parallel additions, stuff like that. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-13Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2019-11-13 1) Remove a unnecessary net_exit function from the xfrm interface. From Xin Long. 2) Assign xfrm4_udp_encap_rcv to a UDP socket only if xfrm is configured. From Alexey Dobriyan. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12xfrm: release device reference for invalid stateXiaodong Xu
An ESP packet could be decrypted in async mode if the input handler for this packet returns -EINPROGRESS in xfrm_input(). At this moment the device reference in skb is held. Later xfrm_input() will be invoked again to resume the processing. If the transform state is still valid it would continue to release the device reference and there won't be a problem; however if the transform state is not valid when async resumption happens, the packet will be dropped while the device reference is still being held. When the device is deleted for some reason and the reference to this device is not properly released, the kernel will keep logging like: unregister_netdevice: waiting for ppp2 to become free. Usage count = 1 The issue is observed when running IPsec traffic over a PPPoE device based on a bridge interface. By terminating the PPPoE connection on the server end for multiple times, the PPPoE device on the client side will eventually get stuck on the above warning message. This patch will check the async mode first and continue to release device reference in async resumption, before it is dropped due to invalid state. v2: Do not assign address family from outer_mode in the transform if the state is invalid v3: Release device reference in the error path instead of jumping to resume Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)") Signed-off-by: Xiaodong Xu <stid.smth@gmail.com> Reported-by: Bo Chen <chenborfc@163.com> Tested-by: Bo Chen <chenborfc@163.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-11-07xfrm: Fix memleak on xfrm state destroySteffen Klassert
We leak the page that we use to create skb page fragments when destroying the xfrm_state. Fix this by dropping a page reference if a page was assigned to the xfrm_state. Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Reported-by: JD <jdtxs00@gmail.com> Reported-by: Paul Wouters <paul@nohats.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-11-01crypto: skcipher - rename the crypto_blkcipher module and kconfig optionEric Biggers
Now that the blkcipher algorithm type has been removed in favor of skcipher, rename the crypto_blkcipher kernel module to crypto_skcipher, and rename the config options accordingly: CONFIG_CRYPTO_BLKCIPHER => CONFIG_CRYPTO_SKCIPHER CONFIG_CRYPTO_BLKCIPHER2 => CONFIG_CRYPTO_SKCIPHER2 Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-01crypto: skcipher - remove the "blkcipher" algorithm typeEric Biggers
Now that all "blkcipher" algorithms have been converted to "skcipher", remove the blkcipher algorithm type. The skcipher (symmetric key cipher) algorithm type was introduced a few years ago to replace both blkcipher and ablkcipher (synchronous and asynchronous block cipher). The advantages of skcipher include: - A much less confusing name, since none of these algorithm types have ever actually been for raw block ciphers, but rather for all length-preserving encryption modes including block cipher modes of operation, stream ciphers, and other length-preserving modes. - It unified blkcipher and ablkcipher into a single algorithm type which supports both synchronous and asynchronous implementations. Note, blkcipher already operated only on scatterlists, so the fact that skcipher does too isn't a regression in functionality. - Better type safety by using struct skcipher_alg, struct crypto_skcipher, etc. instead of crypto_alg, crypto_tfm, etc. - It sometimes simplifies the implementations of algorithms. Also, the blkcipher API was no longer being tested. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-10-01netfilter: drop bridge nf reset from nf_resetFlorian Westphal
commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-01xfrm: remove the unnecessary .net_exit for xfrmiXin Long
The xfrm_if(s) on each netns can be deleted when its xfrmi dev is deleted. xfrmi dev's removal can happen when: a. netns is being removed and all xfrmi devs will be deleted. b. rtnl_link_unregister(&xfrmi_link_ops) in xfrmi_fini() when xfrm_interface.ko is being unloaded. So there's no need to use xfrmi_exit_net() to clean any xfrm_if up. v1->v2: - Fix some changelog. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor overlapping changes in the btusb and ixgbe drivers. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-09-05 1) Several xfrm interface fixes from Nicolas Dichtel: - Avoid an interface ID corruption on changelink. - Fix wrong intterface names in the logs. - Fix a list corruption when changing network namespaces. - Fix unregistation of the underying phydev. 2) Fix a potential warning when merging xfrm_plocy nodes. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md modeHangbin Liu
In decode_session{4,6} there is a possibility that the skb dst dev is NULL, e,g, with tunnel collect_md mode, which will cause kernel crash. Here is what the code path looks like, for GRE: - ip6gre_tunnel_xmit - ip6gre_xmit_ipv6 - __gre6_xmit - ip6_tnl_xmit - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE - icmpv6_send - icmpv6_route_lookup - xfrm_decode_session_reverse - decode_session4 - oif = skb_dst(skb)->dev->ifindex; <-- here - decode_session6 - oif = skb_dst(skb)->dev->ifindex; <-- here The reason is __metadata_dst_init() init dst->dev to NULL by default. We could not fix it in __metadata_dst_init() as there is no dev supplied. On the other hand, the skb_dst(skb)->dev is actually not needed as we called decode_session{4,6} via xfrm_decode_session_reverse(), so oif is not used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif; So make a dst dev check here should be clean and safe. v4: No changes. v3: No changes. v2: fix the issue in decode_session{4,6} instead of updating shared dst dev in {ip_md, ip6}_tunnel_xmit. Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20xfrm: policy: avoid warning splat when merging nodesFlorian Westphal
syzbot reported a splat: xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57 Call Trace: xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline] xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline] xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023 xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139 xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182 xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574 xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670 xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477 xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] There is no reproducer, however, the warning can be reproduced by adding rules with ever smaller prefixes. The sanity check ("does the policy match the node") uses the prefix value of the node before its updated to the smaller value. To fix this, update the prefix earlier. The bug has no impact on tree correctness, this is only to prevent a false warning. Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-30net: Use skb_frag_off accessorsJonathan Lemon
Use accessor functions for skb fragment's page_offset instead of direct references, in preparation for bvec conversion. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-17xfrm interface: fix management of phydevNicolas Dichtel
With the current implementation, phydev cannot be removed: $ ip link add dummy type dummy $ ip link add xfrm1 type xfrm dev dummy if_id 1 $ ip l d dummy kernel:[77938.465445] unregister_netdevice: waiting for dummy to become free. Usage count = 1 Manage it like in ip tunnels, ie just keep the ifindex. Not that the side effect, is that the phydev is now optional. Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-17xfrm interface: fix list corruption for x-netnsNicolas Dichtel
dev_net(dev) is the netns of the device and xi->net is the link netns, where the device has been linked. changelink() must operate in the link netns to avoid a corruption of the xfrm lists. Note that xi->net and dev_net(xi->physdev) are always the same. Before the patch, the xfrmi lists may be corrupted and can later trigger a kernel panic. Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Reported-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-17xfrm interface: ifname may be wrong in logsNicolas Dichtel
The ifname is copied when the interface is created, but is never updated later. In fact, this property is used only in one error message, where the netdevice pointer is available, thus let's use it. Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-17xfrm interface: avoid corruption on changelinkNicolas Dichtel
The new parameters must not be stored in the netdev_priv() before validation, it may corrupt the interface. Note also that if data is NULL, only a memset() is done. $ ip link add xfrm1 type xfrm dev lo if_id 1 $ ip link add xfrm2 type xfrm dev lo if_id 2 $ ip link set xfrm1 type xfrm dev lo if_id 2 RTNETLINK answers: File exists $ ip -d link list dev xfrm1 5: xfrm1@lo: <NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/none 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 68 maxmtu 1500 xfrm if_id 0x2 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 => "if_id 0x2" Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2019-07-05 1) A lot of work to remove indirections from the xfrm code. From Florian Westphal. 2) Fix a WARN_ON with ipv6 that triggered because of a forgotten break statement. From Florian Westphal. 3) Remove xfrmi_init_net, it is not needed. From Li RongQing. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-07-05 1) Fix xfrm selector prefix length validation for inter address family tunneling. From Anirudh Gupta. 2) Fix a memleak in pfkey. From Jeremy Sowden. 3) Fix SA selector validation to allow empty selectors again. From Nicolas Dichtel. 4) Select crypto ciphers for xfrm_algo, this fixes some randconfig builds. From Arnd Bergmann. 5) Remove a duplicated assignment in xfrm_bydst_resize. From Cong Wang. 6) Fix a hlist corruption on hash rebuild. From Florian Westphal. 7) Fix a memory leak when creating xfrm interfaces. From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03xfrm interface: fix memory leak on creationNicolas Dichtel
The following commands produce a backtrace and return an error but the xfrm interface is created (in the wrong netns): $ ip netns add foo $ ip netns add bar $ ip -n foo netns set bar 0 $ ip -n foo link add xfrmi0 link-netnsid 0 type xfrm dev lo if_id 23 RTNETLINK answers: Invalid argument $ ip -n bar link ls xfrmi0 2: xfrmi0@lo: <NOARP,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/none 00:00:00:00:00:00 brd 00:00:00:00:00:00 Here is the backtrace: [ 79.879174] WARNING: CPU: 0 PID: 1178 at net/core/dev.c:8172 rollback_registered_many+0x86/0x3c1 [ 79.880260] Modules linked in: xfrm_interface nfsv3 nfs_acl auth_rpcgss nfsv4 nfs lockd grace sunrpc fscache button parport_pc parport serio_raw evdev pcspkr loop ext4 crc16 mbcache jbd2 crc32c_generic ide_cd_mod ide_gd_mod cdrom ata_$ eneric ata_piix libata scsi_mod 8139too piix psmouse i2c_piix4 ide_core 8139cp mii i2c_core floppy [ 79.883698] CPU: 0 PID: 1178 Comm: ip Not tainted 5.2.0-rc6+ #106 [ 79.884462] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 79.885447] RIP: 0010:rollback_registered_many+0x86/0x3c1 [ 79.886120] Code: 01 e8 d7 7d c6 ff 0f 0b 48 8b 45 00 4c 8b 20 48 8d 58 90 49 83 ec 70 48 8d 7b 70 48 39 ef 74 44 8a 83 d0 04 00 00 84 c0 75 1f <0f> 0b e8 61 cd ff ff 48 b8 00 01 00 00 00 00 ad de 48 89 43 70 66 [ 79.888667] RSP: 0018:ffffc900015ab740 EFLAGS: 00010246 [ 79.889339] RAX: ffff8882353e5700 RBX: ffff8882353e56a0 RCX: ffff8882353e5710 [ 79.890174] RDX: ffffc900015ab7e0 RSI: ffffc900015ab7e0 RDI: ffff8882353e5710 [ 79.891029] RBP: ffffc900015ab7e0 R08: ffffc900015ab7e0 R09: ffffc900015ab7e0 [ 79.891866] R10: ffffc900015ab7a0 R11: ffffffff82233fec R12: ffffc900015ab770 [ 79.892728] R13: ffffffff81eb7ec0 R14: ffff88822ed6cf00 R15: 00000000ffffffea [ 79.893557] FS: 00007ff350f31740(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 [ 79.894581] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 79.895317] CR2: 00000000006c8580 CR3: 000000022c272000 CR4: 00000000000006f0 [ 79.896137] Call Trace: [ 79.896464] unregister_netdevice_many+0x12/0x6c [ 79.896998] __rtnl_newlink+0x6e2/0x73b [ 79.897446] ? __kmalloc_node_track_caller+0x15e/0x185 [ 79.898039] ? pskb_expand_head+0x5f/0x1fe [ 79.898556] ? stack_access_ok+0xd/0x2c [ 79.899009] ? deref_stack_reg+0x12/0x20 [ 79.899462] ? stack_access_ok+0xd/0x2c [ 79.899927] ? stack_access_ok+0xd/0x2c [ 79.900404] ? __module_text_address+0x9/0x4f [ 79.900910] ? is_bpf_text_address+0x5/0xc [ 79.901390] ? kernel_text_address+0x67/0x7b [ 79.901884] ? __kernel_text_address+0x1a/0x25 [ 79.902397] ? unwind_get_return_address+0x12/0x23 [ 79.903122] ? __cmpxchg_double_slab.isra.37+0x46/0x77 [ 79.903772] rtnl_newlink+0x43/0x56 [ 79.904217] rtnetlink_rcv_msg+0x200/0x24c In fact, each time a xfrm interface was created, a netdev was allocated by __rtnl_newlink()/rtnl_create_link() and then another one by xfrmi_newlink()/xfrmi_create(). Only the second one was registered, it's why the previous commands produce a backtrace: dev_change_net_namespace() was called on a netdev with reg_state set to NETREG_UNINITIALIZED (the first one). CC: Lorenzo Colitti <lorenzo@google.com> CC: Benedict Wong <benedictwong@google.com> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Shannon Nelson <shannon.nelson@oracle.com> CC: Antony Antony <antony@phenome.org> CC: Eyal Birger <eyal.birger@gmail.com> Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces") Reported-by: Julien Floret <julien.floret@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-03xfrm: policy: fix bydst hlist corruption on hash rebuildFlorian Westphal
syzbot reported following spat: BUG: KASAN: use-after-free in __write_once_size include/linux/compiler.h:221 BUG: KASAN: use-after-free in hlist_del_rcu include/linux/rculist.h:455 BUG: KASAN: use-after-free in xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 Write of size 8 at addr ffff888095e79c00 by task kworker/1:3/8066 Workqueue: events xfrm_hash_rebuild Call Trace: __write_once_size include/linux/compiler.h:221 [inline] hlist_del_rcu include/linux/rculist.h:455 [inline] xfrm_hash_rebuild+0xa0d/0x1000 net/xfrm/xfrm_policy.c:1318 process_one_work+0x814/0x1130 kernel/workqueue.c:2269 Allocated by task 8064: __kmalloc+0x23c/0x310 mm/slab.c:3669 kzalloc include/linux/slab.h:742 [inline] xfrm_hash_alloc+0x38/0xe0 net/xfrm/xfrm_hash.c:21 xfrm_policy_init net/xfrm/xfrm_policy.c:4036 [inline] xfrm_net_init+0x269/0xd60 net/xfrm/xfrm_policy.c:4120 ops_init+0x336/0x420 net/core/net_namespace.c:130 setup_net+0x212/0x690 net/core/net_namespace.c:316 The faulting address is the address of the old chain head, free'd by xfrm_hash_resize(). In xfrm_hash_rehash(), chain heads get re-initialized without any hlist_del_rcu: for (i = hmask; i >= 0; i--) INIT_HLIST_HEAD(odst + i); Then, hlist_del_rcu() gets called on the about to-be-reinserted policy when iterating the per-net list of policies. hlist_del_rcu() will then make chain->first be nonzero again: static inline void __hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next; // address of next element in list struct hlist_node **pprev = n->pprev;// location of previous elem, this // can point at chain->first WRITE_ONCE(*pprev, next); // chain->first points to next elem if (next) next->pprev = pprev; Then, when we walk chainlist to find insertion point, we may find a non-empty list even though we're supposedly reinserting the first policy to an empty chain. To fix this first unlink all exact and inexact policies instead of zeroing the list heads. Add the commands equivalent to the syzbot reproducer to xfrm_policy.sh, without fix KASAN catches the corruption as it happens, SLUB poisoning detects it a bit later. Reported-by: syzbot+0165480d4ef07360eeda@syzkaller.appspotmail.com Fixes: 1548bc4e0512 ("xfrm: policy: delete inexact policies from inexact list on hash rebuild") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-02xfrm: remove a duplicated assignmentCong Wang
Fixes: 30846090a746 ("xfrm: policy: add sequence count to sync with hash resize") Cc: Florian Westphal <fw@strlen.de> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-07-01xfrm: remove get_mtu indirection from xfrm_typeFlorian Westphal
esp4_get_mtu and esp6_get_mtu are exactly the same, the only difference is a single sizeof() (ipv4 vs. ipv6 header). Merge both into xfrm_state_mtu() and remove the indirection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>