path: root/net
Age | Commit message | Author
2022-06-10 | wifi: mac80211: remove cipher scheme support | Johannes Berg
The only driver using this was iwlwifi, where we just removed the support because it was never really used. Remove the code from mac80211 as well. Change-Id: I1667417a5932315ee9d81f5c233c56a354923f09 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-06-10 | treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE | Thomas Gleixner
Based on the normalized pattern: netapp provides this source code under the gpl v2 license the gpl v2 license is available at https://opensource org/licenses/gpl-license php this software is provided by the copyright holders and contributors as is and any express or implied warranties including but not limited to the implied warranties of merchantability and fitness for a particular purpose are disclaimed in no event shall the copyright owner or contributors be liable for any direct indirect incidental special exemplary or consequential damages (including but not limited to procurement of substitute goods or services loss of use data or profits or business interruption) however caused and on any theory of liability whether in contract strict liability or tort (including negligence or otherwise) arising in any way out of the use of this software even if advised of the possibility of such damage extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference. Reviewed-by: Allison Randal <allison@lohutok.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 | treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_30.RULE ↵ | Thomas Gleixner
(part 2) Based on the normalized pattern: this program is free software you can redistribute it and/or modify it under the terms of the gnu general public license as published by the free software foundation version 2 this program is distributed as is without any warranty of any kind whether express or implied without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference. Reviewed-by: Allison Randal <allison@lohutok.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 | xfrm: no need to set DST_NOPOLICY in IPv4 | Eyal Birger
This is a cleanup patch following commit e6175a2ed1f1 ("xfrm: fix "disable_policy" flag use when arriving from different devices") which made DST_NOPOLICY no longer be used for inbound policy checks. On outbound the flag was set, but never used. As such, avoid setting it altogether and remove the nopolicy argument from rt_dst_alloc(). Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-06-09 | Merge tag 'ieee802154-for-net-next-2022-06-09' of ↵ | Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2022-06-09 This is a separate pull request for 6lowpan changes. We agreed with the bluetooth maintainers to switch the trees these changes are going into from bluetooth to ieee802154. Jukka Rissanen stepped down as a co-maintainer of 6lowpan (Thanks for the work!). Alexander is staying as maintainer. Alexander reworked the nhc_id lookup in 6lowpan to be way simpler. Moved the data structure from rb to an array, which is all we need in this case. * tag 'ieee802154-for-net-next-2022-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next: MAINTAINERS: Remove Jukka Rissanen as 6lowpan maintainer net: 6lowpan: constify lowpan_nhc structures net: 6lowpan: use array for find nhc id net: 6lowpan: remove const from scalars ==================== Link: https://lore.kernel.org/r/20220609202956.1512156-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev | Andrea Mayer
Commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") adds a new entry (flowi_l3mdev) in the common flow struct used for indicating the l3mdev index for later rule and table matching. The l3mdev_update_flow() has been adapted to properly set the flowi_l3mdev based on the flowi_oif/flowi_iif. In fact, when a valid flowi_iif is supplied to the l3mdev_update_flow(), this function can update the flowi_l3mdev entry only if it has not yet been set (i.e., the flowi_l3mdev entry is equal to 0). The SRv6 End.DT6 behavior in VRF mode leverages a VRF device in order to force the routing lookup into the associated routing table. This routing operation is performed by seg6_lookup_any_nexthop() preparing a flowi6 data structure used by ip6_route_input_lookup() which, in turn, (indirectly) invokes l3mdev_update_flow(). However, seg6_lookup_any_nexthop() does not initialize the new flowi_l3mdev entry which is filled with random garbage data. This prevents l3mdev_update_flow() from properly updating the flowi_l3mdev with the VRF index, and thus SRv6 End.DT6 (VRF mode)/DT46 behaviors are broken. This patch correctly initializes the flowi6 instance allocated and used by seg6_lookup_any_nexthop(). Specifically, the entire flowi6 instance is wiped out: in case new entries are added to flowi/flowi6 (as happened with the flowi_l3mdev entry), we should no longer have incorrectly initialized values. As a result of this operation, the value of flowi_l3mdev is also set to 0. The proposed fix can be tested easily. Starting from the commit referenced in the Fixes, selftests [1],[2] indicate that the SRv6 End.DT6 (VRF mode)/DT46 behaviors no longer work correctly. With this patch applied, those behaviors work properly again. 
[1] - tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh [2] - tools/testing/selftests/net/srv6_end_dt6_l3vpn_test.sh Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: Anton Makarov <am@3a-alliance.com> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220608091917.20345-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
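The failure mode described in the seg6 fix above (a newly added struct field left uninitialized by an older caller) can be reproduced in miniature in user space. The sketch below is only an illustration: `struct demo_flow`, `demo_update_flow()` and `demo_lookup_l3mdev()` are hypothetical stand-ins, not the kernel's `struct flowi6` or `l3mdev_update_flow()`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a flow key; not the real struct flowi6. */
struct demo_flow {
    int oif;
    int iif;
    int l3mdev;   /* field added later; older callers never set it */
};

/* Mimics the rule described above: only fill l3mdev if it is still 0. */
static void demo_update_flow(struct demo_flow *fl, int vrf_index)
{
    if (fl->l3mdev == 0)
        fl->l3mdev = vrf_index;
}

/* Wipe the whole structure first, then set only the fields we know about. */
static int demo_lookup_l3mdev(int iif, int vrf_index)
{
    struct demo_flow fl;

    memset(&fl, 0, sizeof(fl));
    fl.iif = iif;
    demo_update_flow(&fl, vrf_index);
    return fl.l3mdev;
}
```

Without the `memset()`, `fl.l3mdev` would hold whatever was on the stack, and the "only if still 0" rule could silently skip the update.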
2022-06-09 | net: add napi_get_frags_check() helper | Eric Dumazet
This is a follow up of commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") When/if we increase MAX_SKB_FRAGS, we better make sure the old bug will not come back. Adding a check in napi_get_frags() would be costly, even if using DEBUG_NET_WARN_ON_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: add debug checks in napi_consume_skb and __napi_alloc_skb() | Eric Dumazet
Commit 6454eca81eae ("net: Use lockdep_assert_in_softirq() in napi_consume_skb()") added a check in napi_consume_skb() which is a bit weak. napi_consume_skb() and __napi_alloc_skb() should only be used from BH context, not from hard irq or nmi context, otherwise we could have races. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: use DEBUG_NET_WARN_ON_ONCE() in skb_release_head_state() | Eric Dumazet
Remove this check from fast path unless CONFIG_DEBUG_NET=y Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | af_unix: use DEBUG_NET_WARN_ON_ONCE() | Eric Dumazet
Replace four WARN_ON() that have not triggered recently with DEBUG_NET_WARN_ON_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: use WARN_ON_ONCE() in sk_stream_kill_queues() | Eric Dumazet
sk_stream_kill_queues() has three checks which have been useful to detect kernel bugs in the past. However they are potentially a problem because they could flood the syslog. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: use WARN_ON_ONCE() in inet_sock_destruct() | Eric Dumazet
inet_sock_destruct() has four warnings which have been useful to point to kernel bugs in the past. However they are potentially a problem because they could flood the syslog. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: use DEBUG_NET_WARN_ON_ONCE() in dev_loopback_xmit() | Eric Dumazet
One check in dev_loopback_xmit() has not caught issues in the past. Keep it for CONFIG_DEBUG_NET=y builds only. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: use DEBUG_NET_WARN_ON_ONCE() in __release_sock() | Eric Dumazet
Check against skb dst in socket backlog has never triggered in past years. Keep the check only for CONFIG_DEBUG_NET=y builds. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
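The series of warning changes above relies on two ideas: warn at most once per call site so the log is not flooded, and compile rarely useful checks out unless a debug option is set. A rough user-space sketch of both follows. The `DEMO_*` macros and `DEMO_DEBUG_NET` are hypothetical names standing in for the kernel's `WARN_ON_ONCE()`, `DEBUG_NET_WARN_ON_ONCE()` and `CONFIG_DEBUG_NET`; the kernel macros are implemented differently.

```c
#include <assert.h>
#include <stdio.h>

static int demo_warnings_printed;   /* test hook, not part of any kernel API */

/* Report a violated condition at most once per call site. */
#define DEMO_WARN_ON_ONCE(cond)                                   \
    do {                                                          \
        static int demo_warned;                                   \
        if ((cond) && !demo_warned) {                             \
            demo_warned = 1;                                      \
            demo_warnings_printed++;                              \
            fprintf(stderr, "WARNING: %s at %s:%d\n",             \
                    #cond, __FILE__, __LINE__);                   \
        }                                                         \
    } while (0)

#ifdef DEMO_DEBUG_NET
#define DEMO_DEBUG_WARN_ON_ONCE(cond) DEMO_WARN_ON_ONCE(cond)
#else
/* Still type-check the condition, but never evaluate it at run time. */
#define DEMO_DEBUG_WARN_ON_ONCE(cond) ((void)sizeof(!(cond)))
#endif

/* A check worth keeping everywhere, and one kept for debug builds only. */
static void demo_kill_queues(int leftover_bytes)
{
    DEMO_WARN_ON_ONCE(leftover_bytes != 0);
    DEMO_DEBUG_WARN_ON_ONCE(leftover_bytes < 0);
}
```

Calling `demo_kill_queues()` repeatedly with a non-zero argument prints a single warning; building with `-DDEMO_DEBUG_NET` enables the second check as well.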
2022-06-09 | drop_monitor: adopt u64_stats_t | Eric Dumazet
As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type") we should use u64_stats_t and related accessors to avoid load/store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
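Load/store tearing means a reader observes half of an old 64-bit value and half of a new one. The sketch below is not how `u64_stats_t` is implemented; it only illustrates one generic remedy, a sequence counter that lets a reader detect a concurrent update and retry. All `demo_*` names are hypothetical, and the counter is split into two 32-bit halves to mimic a 32-bit CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit counter stored as two halves, as on a 32-bit CPU. */
struct demo_stats {
    _Atomic uint32_t seq;   /* odd while a write is in progress */
    _Atomic uint32_t lo;    /* low 32 bits of the counter */
    _Atomic uint32_t hi;    /* high 32 bits of the counter */
};

/* Single writer: bump the sequence, update both halves, bump again. */
static void demo_stats_add(struct demo_stats *s, uint64_t delta)
{
    uint64_t v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);

    v += delta;
    atomic_fetch_add(&s->seq, 1);
    atomic_store(&s->lo, (uint32_t)(v & 0xffffffffu));
    atomic_store(&s->hi, (uint32_t)(v >> 32));
    atomic_fetch_add(&s->seq, 1);
}

/* Reader: retry until the same even sequence is seen before and after. */
static uint64_t demo_stats_read(struct demo_stats *s)
{
    uint32_t start;
    uint64_t v;

    do {
        start = atomic_load(&s->seq);
        v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);
    } while ((start & 1u) || start != atomic_load(&s->seq));
    return v;
}
```

The sketch assumes a single writer per counter, which matches the per-CPU usage of such statistics; multiple writers would need additional serialization.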
2022-06-09 | devlink: adopt u64_stats_t | Eric Dumazet
As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type") we should use u64_stats_t and related accessors to avoid load/store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: adopt u64_stats_t in struct pcpu_sw_netstats | Eric Dumazet
As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type") we should use u64_stats_t and related accessors to avoid load/store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | ip6_tunnel: use dev_sw_netstats_rx_add() | Eric Dumazet
We have a convenient helper, let's use it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | sit: use dev_sw_netstats_rx_add() | Eric Dumazet
We have a convenient helper, let's use it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | vlan: adopt u64_stats_t | Eric Dumazet
As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type") we should use u64_stats_t and related accessors to avoid load/store tearing. Add READ_ONCE() when reading rx_errors & tx_dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: rename reference+tracking helpers | Jakub Kicinski
Netdev reference helpers have a dev_ prefix for historic reasons. Renaming the old helpers would be too much churn but we can rename the tracking ones which are relatively recent and should be the default for new code. Rename: dev_hold_track() -> netdev_hold() dev_put_track() -> netdev_put() dev_replace_track() -> netdev_ref_replace() Link: https://lore.kernel.org/r/20220608043955.919359-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX | Maxim Mikityanskiy
To embrace possible future optimizations of TLS, rename zerocopy sendfile definitions to more generic ones: * setsockopt: TLS_TX_ZEROCOPY_SENDFILE -> TLS_TX_ZEROCOPY_RO * sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX RO stands for readonly and emphasizes that the application shouldn't modify the data being transmitted with zerocopy to avoid potential disconnection. Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 | net: 6lowpan: constify lowpan_nhc structures | Alexander Aring
This patch constifies the lowpan_nhc declarations. Since we dropped the rb node data structure, there is no need for runtime manipulation of this structure. Signed-off-by: Alexander Aring <aahringo@redhat.com> Reviewed-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Link: https://lore.kernel.org/r/20220428030534.3220410-4-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2022-06-09 | net: 6lowpan: use array for find nhc id | Alexander Aring
This patch removes the completely overengineered and overthought rb data structure for looking up the nhc by nhcid. Instead we use the existing nhc next header array and iterate over it. It works for 1 byte values only. However, only 1 byte nhc id values are currently supported, and IANA does not specify values larger than 1 byte yet. If 2 byte values for nhc ids are specified, we can revisit this data structure and add support for them. Signed-off-by: Alexander Aring <aahringo@redhat.com> Reviewed-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Link: https://lore.kernel.org/r/20220428030534.3220410-3-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
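A lookup over a small constant array, as the 6lowpan commit above describes, might look roughly like the following. This is a sketch under assumptions: `struct demo_nhc` and the table contents are illustrative (the id/mask pairs follow the UDP and extension-header NHC bit patterns from RFC 6282), and the real `struct lowpan_nhc` has more members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor; the real struct lowpan_nhc has more members. */
struct demo_nhc {
    const char *name;
    uint8_t id;       /* 1-byte NHC id pattern */
    uint8_t idmask;   /* bits of the first byte that must match */
};

/* A const table: nothing needs to be inserted or rebalanced at run time. */
static const struct demo_nhc demo_nhcs[] = {
    { "udp", 0xf0, 0xf8 },
    { "ext", 0xe0, 0xf0 },
};

/* Linear scan: return the first entry whose masked id matches, or NULL. */
static const struct demo_nhc *demo_nhc_by_id(uint8_t first_byte)
{
    size_t i;

    for (i = 0; i < sizeof(demo_nhcs) / sizeof(demo_nhcs[0]); i++) {
        if ((first_byte & demo_nhcs[i].idmask) == demo_nhcs[i].id)
            return &demo_nhcs[i];
    }
    return NULL;
}
```

With only a handful of 1-byte ids, a linear scan is cheap, and the table can be `const`, which is what makes the follow-up constification possible.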
2022-06-09 | net: 6lowpan: remove const from scalars | Alexander Aring
The keyword const makes no sense for scalar types inside the lowpan_nhc structure. Most compilers will ignore it so we remove the keyword from the scalar types. Signed-off-by: Alexander Aring <aahringo@redhat.com> Reviewed-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Link: https://lore.kernel.org/r/20220428030534.3220410-2-aahringo@redhat.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2022-06-09 | Merge tag 'net-5.19-rc2' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - eth: amt: fix possible null-ptr-deref in amt_rcv() Previous releases - regressions: - tcp: use alloc_large_system_hash() to allocate table_perturb - af_unix: fix a data-race in unix_dgram_peer_wake_me() - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF Previous releases - always broken: - ipv6: fix signed integer overflow in __ip6_append_data - netfilter: - nat: really support inet nat without l3 address - nf_tables: memleak flow rule from commit path - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs - openvswitch: fix misuse of the cached connection on tuple changes - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred - eth: altera: fix refcount leak in altera_tse_mdio_create Misc: - add Quentin Monnet to bpftool maintainers" * tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) net: amd-xgbe: fix clang -Wformat warning tcp: use alloc_large_system_hash() to allocate table_perturb net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY net: dsa: mv88e6xxx: correctly report serdes link failure net: dsa: mv88e6xxx: fix BMSR error to be consistent with others net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete net: altera: Fix refcount leak in altera_tse_mdio_create net: openvswitch: fix misuse of the cached connection on tuple changes net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag ip_gre: test csum_start instead of transport header au1000_eth: stop using virt_to_bus() ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg ipv6: Fix signed integer overflow in __ip6_append_data nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION 
nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION net: ipv6: unexport __init-annotated seg6_hmac_init() net: xfrm: unexport __init-annotated xfrm4_protocol_init() net: mdio: unexport __init-annotated mdio_bus_init() ...
2022-06-09 | 9p: handling Rerror without copy_from_iter_full() | Al Viro
p9_client_zc_rpc()/p9_check_zc_errors() are playing fast and loose with copy_from_iter_full(). Reading from file is done by sending Tread request. Response consists of fixed-sized header (including the amount of data actually read) followed by the data itself. For zero-copy case we arrange the things so that the first 11 bytes of reply go into the fixed-sized buffer, with the rest going straight into the pages we want to read into. What makes the things inconvenient is that sglist describing what should go where has to be set *before* the reply arrives. As a result, if reply is an error, the things get interesting. On success we get size[4] Rread tag[2] count[4] data[count] For error layout varies depending upon the protocol variant - in original 9P and 9P2000 it's size[4] Rerror tag[2] len[2] error[len] in 9P2000.U size[4] Rerror tag[2] len[2] error[len] errno[4] in 9P2000.L size[4] Rlerror tag[2] errno[4] The last case is nice and simple - we have an 11-byte response that fits into the fixed-sized buffer we hoped to get an Rread into. In other two, though, we get a variable-length string spill into the pages we'd prepared for the data to be read. Had that been in fixed-sized buffer (which is actually 4K), we would've dealt with that the same way we handle non-zerocopy case. However, for zerocopy it doesn't end up there, so we need to copy it from those pages. The trouble is, by the time we get around to that, the references to pages in question are already dropped. As a result, p9_check_zc_errors() tries to get the data using copy_from_iter_full(). Unfortunately, the iov_iter it's trying to read from might *NOT* be capable of that. It is, after all, a data destination, not data source. In particular, if it's an ITER_PIPE one, copy_from_iter_full() will simply fail. In ->zc_request() itself we do have those pages and dealing with the problem in there would be a simple matter of memcpy_from_page() into the fixed-sized buffer. 
Moreover, it isn't hard to recognize the (rare) case when such copying is needed. That way we get rid of p9_check_zc_errors() entirely - p9_check_errors() can be used instead both for zero-copy and non-zero-copy cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
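The reply layouts listed in the 9p commit above explain why only the Rerror variants are awkward: their variable-length string does not fit in the 11-byte fixed part. The following sketch parses that fixed part and reports how much of the error string landed beyond it. It assumes little-endian fields and the message type numbers Rlerror = 7, Rerror = 107, Rread = 117; the `demo_*` names are hypothetical and this is not the kernel's parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_RLERROR = 7, DEMO_RERROR = 107, DEMO_RREAD = 117 };

struct demo_9p_hdr {
    uint32_t size;   /* total message length, including these 4 bytes */
    uint8_t  type;
    uint16_t tag;
};

static uint16_t demo_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t demo_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Parse the 11-byte fixed part of a reply. Returns the number of bytes of
 * an Rerror string that spilled past the fixed part, or 0 when the reply
 * fits (Rread header, Rlerror). The trailing errno[4] of 9P2000.U is not
 * counted here.
 */
static size_t demo_9p_spill(const uint8_t fixed[11], struct demo_9p_hdr *h)
{
    uint16_t len;

    h->size = demo_le32(fixed);
    h->type = fixed[4];
    h->tag  = demo_le16(fixed + 5);
    if (h->type != DEMO_RERROR)
        return 0;
    /* Rerror: len[2] sits at offset 7, so only 2 bytes of error[] fit. */
    len = demo_le16(fixed + 7);
    return len > 2 ? (size_t)len - 2 : 0;
}
```

A non-zero return value corresponds to the rare case the commit message mentions, where the remainder of the string has to be copied back out of the data pages.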
2022-06-08 | tcp: use alloc_large_system_hash() to allocate table_perturb | Muchun Song
On our server, there may be no high-order (>= 6) memory available, since we reserve lots of HugeTLB pages when booting. The system then panics. So use alloc_large_system_hash() to allocate table_perturb. Fixes: e9261476184b ("tcp: dynamically allocate the perturb table used by source ports") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 | net: openvswitch: fix misuse of the cached connection on tuple changes | Ilya Maximets
If packet headers changed, the cached nfct is no longer relevant for the packet and attempt to re-use it leads to the incorrect packet classification. This issue is causing broken connectivity in OpenStack deployments with OVS/OVN due to hairpin traffic being unexpectedly dropped. The setup has datapath flows with several conntrack actions and tuple changes between them: actions:ct(commit,zone=8,mark=0/0x1,nat(src)), set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)), set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)), ct(zone=8),recirc(0x4) After the first ct() action the packet headers are almost fully re-written. The next ct() tries to re-use the existing nfct entry and marks the packet as invalid, so it gets dropped later in the pipeline. Clearing the cached conntrack entry whenever packet tuple is changed to avoid the issue. The flow key should not be cleared though, because we should still be able to match on the ct_state if the recirculation happens after the tuple change but before the next ct() action. Cc: stable@vger.kernel.org Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Frode Nordahl <frode.nordahl@canonical.com> Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 | ip_gre: test csum_start instead of transport header | Willem de Bruijn
GRE with TUNNEL_CSUM will apply local checksum offload on CHECKSUM_PARTIAL packets. ipgre_xmit must validate csum_start after an optional skb_pull, else lco_csum may trigger an overflow. The original check was if (csum && skb_checksum_start(skb) < skb->data) return -EINVAL; This had false positives when skb_checksum_start is undefined: when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement was straightforward if (csum && skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_start(skb) < skb->data) return -EINVAL; But was eventually revised more thoroughly: - restrict the check to the only branch where needed, in an uncommon GRE path that uses header_ops and calls skb_pull. - test skb_transport_header, which is set along with csum_start in skb_partial_csum_set in the normal header_ops datapath. Turns out skbs can arrive in this branch without the transport header set, e.g., through BPF redirection. Revise the check back to check csum_start directly, and only if CHECKSUM_PARTIAL. Do leave the check in the updated location. Check field regardless of whether TUNNEL_CSUM is configured. Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/ Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gmail.com/T/#u Fixes: 8a0ed250f911 ("ip_gre: validate csum_start only on pull") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
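The final form of the check described in the ip_gre commit above (test csum_start directly, but only when the packet is CHECKSUM_PARTIAL) can be modelled in user space as follows. `struct demo_skb` and the `DEMO_*` constants are hypothetical simplifications of `struct sk_buff`; the one borrowed fact is that the checksum start is an offset from the buffer head.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum demo_ip_summed { DEMO_CHECKSUM_NONE, DEMO_CHECKSUM_PARTIAL };

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct demo_skb {
    unsigned char *head;            /* start of the buffer */
    unsigned char *data;            /* current start of packet data */
    enum demo_ip_summed ip_summed;
    size_t csum_start;              /* offset from head; valid only if PARTIAL */
};

/*
 * After pulling headers, reject packets whose checksum start now lies
 * before the data pointer. csum_start is undefined unless ip_summed is
 * CHECKSUM_PARTIAL, so the comparison is skipped in every other case.
 */
static int demo_validate_after_pull(const struct demo_skb *skb)
{
    if (skb->ip_summed == DEMO_CHECKSUM_PARTIAL &&
        skb->head + skb->csum_start < skb->data)
        return -EINVAL;
    return 0;
}
```

Guarding on `ip_summed` is what removes the false positives mentioned above, and testing the offset directly avoids depending on a transport header that may never have been set.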
2022-06-08 | Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf | Jakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf 2022-06-09 We've added 6 non-merge commits during the last 2 day(s) which contain a total of 8 files changed, 49 insertions(+), 15 deletions(-). The main changes are: 1) Fix an illegal copy_to_user() attempt seen by syzkaller through arm64 BPF JIT compiler, from Eric Dumazet. 2) Fix calling global functions from BPF_PROG_TYPE_EXT programs by using the correct program context type, from Toke Høiland-Jørgensen. 3) Fix XSK TX batching invalid descriptor handling, from Maciej Fijalkowski. 4) Fix potential integer overflows in multi-kprobe link code by using safer kvmalloc_array() allocation helpers, from Dan Carpenter. 5) Add Quentin as bpftool maintainer, from Quentin Monnet. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: MAINTAINERS: Add a maintainer for bpftool xsk: Fix handling of invalid descriptors in XSK TX batching API selftests/bpf: Add selftest for calling global functions from freplace bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs bpf: Use safer kvmalloc_array() where possible bpf, arm64: Clear prog->jited_len along prog->jited ==================== Link: https://lore.kernel.org/r/20220608234133.32265-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 | ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg | Wang Yufen
When len >= INT_MAX - transhdrlen, ulen = len + transhdrlen will overflow. To fix, we can follow what udpv6 does and subtract the transhdrlen from the max. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Link: https://lore.kernel.org/r/20220607120028.845916-2-wangyufen@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
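The "subtract the header length from the maximum" idea can be sketched as a bounds check performed before the addition, so the sum is never computed when it would not fit in an int. `demo_total_len()` is a hypothetical helper, the choice of `-EMSGSIZE` is an assumption of this sketch, and `transhdrlen` is assumed to be non-negative.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Compute len + transhdrlen into *ulen, refusing lengths that would
 * overflow a signed int. Assumes 0 <= transhdrlen <= INT_MAX.
 */
static int demo_total_len(size_t len, int transhdrlen, int *ulen)
{
    if (len > (size_t)(INT_MAX - transhdrlen))
        return -EMSGSIZE;
    *ulen = (int)len + transhdrlen;
    return 0;
}
```

Comparing against `INT_MAX - transhdrlen` keeps every intermediate value in range, whereas checking `len + transhdrlen > INT_MAX` after the fact would already have overflowed.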
2022-06-08 | ipv6: Fix signed integer overflow in __ip6_append_data | Wang Yufen
After the UBSAN overflow checks were resurrected, UBSAN reports this warning; fix it by changing the type of the variable [length] to size_t. UBSAN: signed-integer-overflow in net/ipv6/ip6_output.c:1489:19 2147479552 + 8567 cannot be represented in type 'int' CPU: 0 PID: 253 Comm: err Not tainted 5.16.0+ #1 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x214/0x230 show_stack+0x30/0x78 dump_stack_lvl+0xf8/0x118 dump_stack+0x18/0x30 ubsan_epilogue+0x18/0x60 handle_overflow+0xd0/0xf0 __ubsan_handle_add_overflow+0x34/0x44 __ip6_append_data.isra.48+0x1598/0x1688 ip6_append_data+0x128/0x260 udpv6_sendmsg+0x680/0xdd0 inet6_sendmsg+0x54/0x90 sock_sendmsg+0x70/0x88 ____sys_sendmsg+0xe8/0x368 ___sys_sendmsg+0x98/0xe0 __sys_sendmmsg+0xf4/0x3b8 __arm64_sys_sendmmsg+0x34/0x48 invoke_syscall+0x64/0x160 el0_svc_common.constprop.4+0x124/0x300 do_el0_svc+0x44/0xc8 el0_svc+0x3c/0x1e8 el0t_64_sync_handler+0x88/0xb0 el0t_64_sync+0x16c/0x170 Changes since v1: -Change the variable [length] type to unsigned, as Eric Dumazet suggested. Changes since v2: -Don't change exthdrlen type in ip6_make_skb, as Paolo Abeni suggested. Changes since v3: -Don't change ulen type in udpv6_sendmsg and l2tp_ip6_sendmsg, as Jakub Kicinski suggested. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Yufen <wangyufen@huawei.com> Link: https://lore.kernel.org/r/20220607120028.845916-1-wangyufen@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
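The two operands from the UBSAN report can be checked directly. The sketch below uses `__builtin_add_overflow()`, a GCC/Clang builtin (so it is compiler-specific), to show that the sum does not fit in an int, while the same addition in size_t is well defined. The `demo_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a + b cannot be represented in an int (GCC/Clang builtin). */
static int demo_int_add_overflows(int a, int b)
{
    int sum;

    return __builtin_add_overflow(a, b, &sum);
}

/* The same addition in an unsigned type wide enough for the result. */
static size_t demo_size_add(size_t a, size_t b)
{
    return a + b;
}
```

With the values from the report, 2147479552 + 8567 = 2147488119, which exceeds INT_MAX (2147483647) but is representable in size_t.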
2022-06-08 | net: ipv6: unexport __init-annotated seg6_hmac_init() | Masahiro Yamada
EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Hence, modules cannot use symbols annotated __init. The access to a freed symbol may end up with kernel panic. modpost used to detect it, but it has been broken for a decade. Recently, I fixed modpost so it started to warn it again, then this showed up in linux-next builds. There are two ways to fix it: - Remove __init - Remove EXPORT_SYMBOL I chose the latter for this case because the caller (net/ipv6/seg6.c) and the callee (net/ipv6/seg6_hmac.c) belong to the same module. It seems an internal function call in ipv6.ko. Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 | net: xfrm: unexport __init-annotated xfrm4_protocol_init() | Masahiro Yamada
EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Hence, modules cannot use symbols annotated __init. The access to a freed symbol may end up with kernel panic. modpost used to detect it, but it has been broken for a decade. Recently, I fixed modpost so it started to warn it again, then this showed up in linux-next builds. There are two ways to fix it: - Remove __init - Remove EXPORT_SYMBOL I chose the latter for this case because the only in-tree call-site, net/ipv4/xfrm4_policy.c is never compiled as modular. (CONFIG_XFRM is boolean) Fixes: 2f32b51b609f ("xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 | SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer() | Chuck Lever
To make the code easier to read, remove visual clutter by changing the declared type of @p. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2022-06-08 | SUNRPC: Clean up xdr_get_next_encode_buffer() | Chuck Lever
The value of @p is not used until the "location of the next item" is computed. Help human readers by moving its initial assignment to the paragraph where that value is used and by clarifying the antecedents in the documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.com> Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2022-06-08 | SUNRPC: Clean up xdr_commit_encode() | Chuck Lever
Both the kvec::iov_len field and the third parameter of memcpy() and memmove() are size_t. There's no reason for the implicit conversion from size_t to int and back. Change the type of @shift to make the code easier to read and understand. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2022-06-08 | SUNRPC: Optimize xdr_reserve_space() | Chuck Lever
Transitioning between encode buffers is quite infrequent. It happens about 1 time in 400 calls to xdr_reserve_space(), measured on NFSD with a typical build/test workload. Force the compiler to remove that code from xdr_reserve_space(), which is a hot path on both the server and the client. This change reduces the size of xdr_reserve_space() from 10 cache lines to 2 when compiled with -Os. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
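Keeping a rarely taken branch out of a hot function is commonly done by moving it into a separate function that the compiler is told not to inline. The sketch below shows that pattern with GCC/Clang attributes. `struct demo_xdr` and both functions are hypothetical and much simpler than the SUNRPC code; in particular, the slow path here simply abandons the rest of the current buffer instead of encoding across the boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoder state: a current buffer plus one spare buffer. */
struct demo_xdr {
    unsigned char *p, *end;      /* current position and end of buffer */
    unsigned char *next_buf;     /* spare buffer to switch to */
    size_t next_len;             /* bytes available in the spare buffer */
};

/* Rare path: kept out of line so the hot function stays small. */
static __attribute__((noinline, cold))
unsigned char *demo_next_buffer(struct demo_xdr *x, size_t nbytes)
{
    unsigned char *ret;

    if (nbytes > x->next_len)
        return NULL;
    ret = x->next_buf;
    x->p = ret + nbytes;
    x->end = x->next_buf + x->next_len;
    x->next_len = 0;             /* the spare buffer is now in use */
    return ret;
}

/* Hot path: a pointer bump in the common case. */
static inline unsigned char *demo_reserve(struct demo_xdr *x, size_t nbytes)
{
    unsigned char *ret = x->p;

    if (__builtin_expect((size_t)(x->end - x->p) < nbytes, 0))
        return demo_next_buffer(x, nbytes);
    x->p += nbytes;
    return ret;
}
```

The trade-off is an extra call on the rare path in exchange for a smaller instruction footprint on the common one.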
2022-06-08 | SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer() | Chuck Lever
I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up right at the end of the page array. xdr_get_next_encode_buffer() does not compute the value of xdr->end correctly: * The check to see if we're on the final available page in xdr->buf needs to account for the space consumed by @nbytes. * The new xdr->end value needs to account for the portion of @nbytes that is to be encoded into the previous buffer. Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2022-06-08 | xsk: Fix handling of invalid descriptors in XSK TX batching API | Maciej Fijalkowski
xdpxceiver run on an AF_XDP ZC enabled driver revealed a problem with the XSK Tx batching API. There is a test that checks how invalid Tx descriptors are handled by AF_XDP. Each valid descriptor is followed by an invalid one on the Tx side, whereas the Rx side expects to receive only a set of valid descriptors. In the current xsk_tx_peek_release_desc_batch() function, the number of available descriptors is hidden inside xskq_cons_peek_desc_batch(). This can be problematic in cases where invalid descriptors are present, because xskq_cons_peek_desc_batch() returns only a count of valid descriptors. This means that it is impossible to properly update the XSK ring state when calling xskq_cons_release_n(). To address this issue, pull out the contents of xskq_cons_peek_desc_batch() so that callers (currently only xsk_tx_peek_release_desc_batch()) will always be able to update the state of the ring properly, as the total count of entries is now available, and use this value as an argument in xskq_cons_release_n(). By doing so, xskq_cons_peek_desc_batch() can be dropped altogether. Fixes: 9349eb3a9d2a ("xsk: Introduce batched Tx descriptor interfaces") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220607142200.576735-1-maciej.fijalkowski@intel.com
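The core of the xsk fix is that a batch read has to report two numbers: how many descriptors were valid, and how many ring entries were consumed in total. A minimal sketch of such an interface follows. The descriptor layout, the validity rule and all `demo_*` names are assumptions for illustration, not the AF_XDP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_desc {
    uint64_t addr;
    uint32_t len;
};

/* Hypothetical validity rule: non-zero length that stays inside the pool. */
static int demo_desc_valid(const struct demo_desc *d, uint64_t pool_size)
{
    return d->len != 0 && d->addr < pool_size &&
           d->len <= pool_size - d->addr;
}

/*
 * Copy the valid descriptors among ring[0..avail) to out[] and return
 * their count. *consumed receives the number of ring entries examined,
 * valid or not, which is the amount the ring must be advanced by.
 */
static size_t demo_peek_batch(const struct demo_desc *ring, size_t avail,
                              uint64_t pool_size, struct demo_desc *out,
                              size_t *consumed)
{
    size_t i, nvalid = 0;

    for (i = 0; i < avail; i++) {
        if (demo_desc_valid(&ring[i], pool_size))
            out[nvalid++] = ring[i];
    }
    *consumed = avail;
    return nvalid;
}
```

If the caller released only the returned count of valid descriptors, the skipped invalid entries would remain in the ring, which is the mismatch the commit message describes.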
2022-06-08 netfilter: use get_random_u32 instead of prandom (Florian Westphal)
A bh might occur while updating the per-cpu rnd_state from user context, i.e. the local_out path.

  BUG: using smp_processor_id() in preemptible [00000000] code: nginx/2725
  caller is nft_ng_random_eval+0x24/0x54 [nft_numgen]
  Call Trace:
   check_preemption_disabled+0xde/0xe0
   nft_ng_random_eval+0x24/0x54 [nft_numgen]

Use the random driver instead; this also avoids the need for local prandom state. Moreover, prandom now uses the random driver since d4150779e60f ("random32: use real rng for non-deterministic randomness").

Based on an earlier patch from Pablo Neira.

Fixes: 6b2faee0ca91 ("netfilter: nft_meta: place prandom handling in a helper")
Fixes: 978d8f9055c3 ("netfilter: nft_numgen: add map lookups for numgen random operations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-07 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf (Jakub Kicinski)
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Fix NAT support for NFPROTO_INET without layer 3 address, from Florian Westphal.
2) Use kfree_rcu(ptr, rcu) variant in nf_tables clean_net path.
3) Use list to collect flowtable hooks to be deleted.
4) Initialize list of hook field in flowtable transaction.
5) Release hooks on error for flowtable updates.
6) Memleak in hardware offload rule commit and abort paths.
7) Early bail out in case device does not support hardware offload. This adds a new interface to net/core/flow_offload.c to check if the flow indirect block list is empty.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: bail out early if hardware offload is not supported
  netfilter: nf_tables: memleak flow rule from commit path
  netfilter: nf_tables: release new hooks on unsupported flowtable flags
  netfilter: nf_tables: always initialize flowtable hook list in transaction
  netfilter: nf_tables: delete flowtable hooks via transaction list
  netfilter: nf_tables: use kfree_rcu(ptr, rcu) to release hooks in clean_net path
  netfilter: nat: really support inet nat without l3 address
====================

Link: https://lore.kernel.org/r/20220606212055.98300-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-07 sunrpc: set cl_max_connect when cloning an rpc_clnt (Scott Mayhew)
If the initial attempt at trunking detection using the krb5i auth flavor fails with -EACCES, -NFS4ERR_CLID_INUSE, or -NFS4ERR_WRONGSEC, then the NFS client tries again using auth_sys, cloning the rpc_clnt in the process. If this second attempt at trunking detection succeeds, then the resulting nfs_client->cl_rpcclient winds up having cl_max_connect=0, and subsequent attempts to add additional transport connections to the rpc_clnt will fail with a message similar to the following being logged:

  [502044.312640] SUNRPC: reached max allowed number (0) did not add transport to server: 192.168.122.3

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Fixes: dc48e0abee24 ("SUNRPC enforce creation of no more than max_connect xprts")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
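The failure mode described here, a clone that starts with a connection limit of zero, can be sketched in plain userspace C. This is not the SUNRPC code: `struct toy_clnt`, `struct toy_create_args`, `toy_new_client()`, `toy_clone_client()` and `toy_add_xprt()` are hypothetical names for illustration, and the `copy_limit` flag exists only so the sketch can show both the buggy and the corrected behaviour side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an RPC client; not struct rpc_clnt. */
struct toy_clnt {
	unsigned int max_connect;	/* upper bound on transports */
	unsigned int num_xprts;		/* transports currently attached */
};

/* Hypothetical creation arguments, zero-initialised by callers. */
struct toy_create_args {
	unsigned int max_connect;
};

/* Create a client with one initial transport. */
static void toy_new_client(struct toy_clnt *clnt,
			   const struct toy_create_args *args)
{
	assert(clnt != NULL && args != NULL);
	clnt->max_connect = args->max_connect;
	clnt->num_xprts = 1;
}

/*
 * Clone a client. With copy_limit == 0 the limit is left at its
 * zero-initialised default, which models the bug; with copy_limit != 0
 * the parent's limit is carried over, which models the fix.
 */
static void toy_clone_client(struct toy_clnt *clone,
			     const struct toy_clnt *parent, int copy_limit)
{
	struct toy_create_args args = { 0 };

	if (copy_limit)
		args.max_connect = parent->max_connect;
	toy_new_client(clone, &args);
}

/* Try to attach one more transport; returns -1 once the limit is hit. */
static int toy_add_xprt(struct toy_clnt *clnt)
{
	if (clnt->num_xprts >= clnt->max_connect)
		return -1;
	clnt->num_xprts++;
	return 0;
}
```

With a parent limit of 4, a clone made without copying the limit rejects every additional transport, while a clone that inherits the limit accepts them until the bound is reached.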
2022-06-07 net: skb: use auto-generation to convert skb drop reason to string (Menglong Dong)
It is annoying to add new skb drop reasons to both 'enum skb_drop_reason' and TRACE_SKB_DROP_REASON in trace/event/skb.h, and it is easy to forget to add a new reason to TRACE_SKB_DROP_REASON.

TRACE_SKB_DROP_REASON is used to convert a numeric drop reason to a string. For now, the string we pass to user space is exactly the name in 'enum skb_drop_reason' minus its 'SKB_DROP_REASON_' prefix. Therefore, we can use 'auto-generation' to generate the strings for these drop reasons at build time.

The new source 'dropreason_str.c' will be auto-generated at build time and contains the string array 'const char * const drop_reasons[]'.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
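The idea of deriving the string table from the enum, so the two cannot drift apart, can be sketched in a few lines of C. Note that this sketch swaps in a different technique: the commit generates 'dropreason_str.c' at build time, whereas the example below uses an X-macro in a single translation unit. The reason list is a small illustrative subset, and every `toy_`/`TOY_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single source of truth: each reason is listed exactly once. */
#define TOY_DROP_REASONS(X)	\
	X(NOT_SPECIFIED)	\
	X(NO_SOCKET)		\
	X(PKT_TOO_SMALL)	\
	X(TCP_CSUM)

/* Expand the list into enum constants with a common prefix. */
#define TOY_ENUM_ENTRY(name) TOY_DROP_REASON_##name,
enum toy_drop_reason {
	TOY_DROP_REASONS(TOY_ENUM_ENTRY)
	TOY_DROP_REASON_MAX
};

/* Expand the same list into strings without the prefix. */
#define TOY_STR_ENTRY(name) [TOY_DROP_REASON_##name] = #name,
static const char * const toy_drop_reasons[] = {
	TOY_DROP_REASONS(TOY_STR_ENTRY)
};

/* Map a reason to its string; out-of-range values yield "UNKNOWN". */
static const char *toy_drop_reason_str(int reason)
{
	if (reason < 0 || reason >= TOY_DROP_REASON_MAX)
		return "UNKNOWN";
	return toy_drop_reasons[reason];
}

/* Reverse lookup; returns -1 when the string names no known reason. */
static int toy_drop_reason_from_str(const char *s)
{
	int i;

	assert(s != NULL);
	for (i = 0; i < TOY_DROP_REASON_MAX; i++)
		if (strcmp(toy_drop_reasons[i], s) == 0)
			return i;
	return -1;
}
```

Adding a reason to `TOY_DROP_REASONS` updates both the enum and the string array, which is the property the build-time generation in the commit is after.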
2022-06-07 af_unix: Fix a data-race in unix_dgram_peer_wake_me(). (Kuniyuki Iwashima)
unix_dgram_poll() calls unix_dgram_peer_wake_me() without `other`'s lock held and checks whether its receive queue is full. Here we need to use unix_recvq_full_lockless() instead of unix_recvq_full(); otherwise KCSAN will report a data-race.

Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220605232325.11804-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
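As a rough userspace analogy for a lockless "queue full" check, the sketch below reads the queue length and the backlog limit with C11 relaxed atomic loads, so that concurrent updates do not constitute a data race in the C memory model. This is an assumption-laden stand-in, not the kernel helper: the kernel uses its own annotations rather than C11 atomics, and `struct toy_queue`, `toy_enqueue()` and `toy_recvq_full_lockless()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue bookkeeping; not the kernel's socket structures. */
struct toy_queue {
	atomic_uint qlen;		/* entries currently queued */
	atomic_uint max_backlog;	/* limit the length is compared to */
};

/* Writer side: account for one more queued entry. */
static void toy_enqueue(struct toy_queue *q)
{
	assert(q != NULL);
	atomic_fetch_add_explicit(&q->qlen, 1, memory_order_relaxed);
}

/*
 * Reader side: no lock is taken. Each field is read once with a relaxed
 * atomic load, so the result may be slightly stale but the reads
 * themselves are well defined.
 */
static bool toy_recvq_full_lockless(struct toy_queue *q)
{
	unsigned int len = atomic_load_explicit(&q->qlen, memory_order_relaxed);
	unsigned int max = atomic_load_explicit(&q->max_backlog,
						memory_order_relaxed);

	return len > max;
}
```

A poll-style caller can tolerate a stale answer from such a check because it is only a hint for whether to sleep or wake; the authoritative check happens later under the appropriate lock.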
2022-06-06 netfilter: nf_tables: bail out early if hardware offload is not supported (Pablo Neira Ayuso)
If the user requests NFT_CHAIN_HW_OFFLOAD, then check whether either the device provides the .ndo_setup_tc interface or an indirect flow block has been registered. Otherwise, bail out early from the preparation phase. Moreover, validate that family == NFPROTO_NETDEV and that the hook is NF_NETDEV_INGRESS.

Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-06 netfilter: nf_tables: memleak flow rule from commit path (Pablo Neira Ayuso)
The abort path releases the flow rule object; the commit path, however, does not. Update the code to destroy these objects before releasing the transaction.

Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-06 netfilter: nf_tables: release new hooks on unsupported flowtable flags (Pablo Neira Ayuso)
Release the list of new hooks that are pending registration in case unsupported flowtable flags are provided.

Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>