summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2023-10-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. net/mac80211/key.c 02e0e426a2fb ("wifi: mac80211: fix error path key leak") 2a8b665e6bcc ("wifi: mac80211: remove key_mtx") 7d6904bf26b9 ("Merge wireless into wireless-next") https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/ti/Kconfig a602ee3176a8 ("net: ethernet: ti: Fix mixed module-builtin object") 98bdeae9502b ("net: cpmac: remove driver to prepare for platform removal") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19inet: lock the socket in ip_sock_set_tos()Eric Dumazet
Christoph Paasch reported a panic in TCP stack [1] Indeed, we should not call sk_dst_reset() without holding the socket lock, as __sk_dst_get() callers do not all rely on bare RCU. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 12bad6067 P4D 12bad6067 PUD 12bad5067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 2750 Comm: syz-executor.5 Not tainted 6.6.0-rc4-g7a5720a344e7 #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:tcp_get_metrics+0x118/0x8f0 net/ipv4/tcp_metrics.c:321 Code: c7 44 24 70 02 00 8b 03 89 44 24 48 c7 44 24 4c 00 00 00 00 66 c7 44 24 58 02 00 66 ba 02 00 b1 01 89 4c 24 04 4c 89 7c 24 10 <49> 8b 0f 48 8b 89 50 05 00 00 48 89 4c 24 30 33 81 00 02 00 00 69 RSP: 0018:ffffc90000af79b8 EFLAGS: 00010293 RAX: 000000000100007f RBX: ffff88812ae8f500 RCX: ffff88812b5f8f01 RDX: 0000000000000002 RSI: ffffffff8300f080 RDI: 0000000000000002 RBP: 0000000000000002 R08: 0000000000000003 R09: ffffffff8205eca0 R10: 0000000000000002 R11: ffff88812b5f8f00 R12: ffff88812a9e0580 R13: 0000000000000000 R14: ffff88812ae8fbd2 R15: 0000000000000000 FS: 00007f70a006b640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000012bad7003 CR4: 0000000000170ee0 Call Trace: <TASK> tcp_fastopen_cache_get+0x32/0x140 net/ipv4/tcp_metrics.c:567 tcp_fastopen_cookie_check+0x28/0x180 net/ipv4/tcp_fastopen.c:419 tcp_connect+0x9c8/0x12a0 net/ipv4/tcp_output.c:3839 tcp_v4_connect+0x645/0x6e0 net/ipv4/tcp_ipv4.c:323 __inet_stream_connect+0x120/0x590 net/ipv4/af_inet.c:676 tcp_sendmsg_fastopen+0x2d6/0x3a0 net/ipv4/tcp.c:1021 tcp_sendmsg_locked+0x1957/0x1b00 net/ipv4/tcp.c:1073 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1336 __sock_sendmsg+0x83/0xd0 net/socket.c:730 __sys_sendto+0x20a/0x2a0 net/socket.c:2194 __do_sys_sendto net/socket.c:2206 [inline] Fixes: e08d0b3d1723 ("inet: implement lockless IP_TOS") Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231018090014.345158-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-18netfilter: nf_tables: de-constify set commit ops function argumentFlorian Westphal
The set backend using this already has to work around this via ugly cast, don't spread this pattern. Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18net: treat possible_net_t net pointer as an RCU one and add read_pnet_rcu()Jiri Pirko
Make the net pointer stored in possible_net_t structure annotated as an RCU pointer. Change the access helpers to treat it as such. Introduce read_pnet_rcu() helper to allow caller to dereference the net pointer under RCU read lock. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-17Merge tag 'ipsec-2023-10-17' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2023-10-17 1) Fix a slab-use-after-free in xfrm_policy_inexact_list_reinsert. From Dong Chenchen. 2) Fix data-races in the xfrm interfaces dev->stats fields. From Eric Dumazet. 3) Fix a data-race in xfrm_gen_index. From Eric Dumazet. 4) Fix an inet6_dev refcount underflow. From Zhang Changzhong. 5) Check the return value of pskb_trim in esp_remove_trailer for esp4 and esp6. From Ma Ke. 6) Fix a data-race in xfrm_lookup_with_ifid. From Eric Dumazet. * tag 'ipsec-2023-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: fix a data-race in xfrm_lookup_with_ifid() net: ipv4: fix return value check in esp_remove_trailer net: ipv6: fix return value check in esp_remove_trailer xfrm6: fix inet6_dev refcount underflow problem xfrm: fix a data-race in xfrm_gen_index() xfrm: interface: use DEV_STATS_INC() net: xfrm: skip policies marked as dead while reinserting policies ==================== Link: https://lore.kernel.org/r/20231017083723.1364940-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-17tcp: fix excessive TLP and RACK timeouts from HZ roundingNeal Cardwell
We discovered from packet traces of slow loss recovery on kernels with the default HZ=250 setting (and min_rtt < 1ms) that after reordering, when receiving a SACKed sequence range, the RACK reordering timer was firing after about 16ms rather than the desired value of roughly min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the exact same issue. This commit fixes the TLP transmit timer and RACK reordering timer floor calculation to more closely match the intended 2ms floor even on kernels with HZ=250. It does this by adding in a new TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies, instead of the current approach of converting to jiffies and then adding th TCP_TIMEOUT_MIN value of 2 jiffies. Our testing has verified that on kernels with HZ=1000, as expected, this does not produce significant changes in behavior, but on kernels with the default HZ=250 the latency improvement can be large. For example, our tests show that for HZ=250 kernels at low RTTs this fix roughly halves the latency for the RACK reorder timer: instead of mostly firing at 16ms it mostly fires at 8ms. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: bb4d991a28cc ("tcp: adjust tail loss probe timeout") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231015174700.2206872-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-17Merge tag 'wireless-next-2023-10-16' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.7 The second pull request for v6.7, with only driver changes this time. We have now support for mt7925 PCIe and USB variants, few new features and of course some fixes. Major changes: mt76 - mt7925 support ath12k - read board data variant name from SMBIOS wfx - Remain-On-Channel (ROC) support * tag 'wireless-next-2023-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (109 commits) wifi: rtw89: mac: do bf_monitor only if WiFi 6 chips wifi: rtw89: mac: set bf_assoc capabilities according to chip gen wifi: rtw89: mac: set bfee_ctrl() according to chip gen wifi: rtw89: mac: add registers of MU-EDCA parameters for WiFi 7 chips wifi: rtw89: mac: generalize register of MU-EDCA switch according to chip gen wifi: rtw89: mac: update RTS threshold according to chip gen wifi: rtlwifi: simplify TX command fill callbacks wifi: hostap: remove unused ioctl function wifi: atmel: remove unused ioctl function wifi: rtw89: coex: add annotation __counted_by() to struct rtw89_btc_btf_set_mon_reg wifi: rtw89: coex: add annotation __counted_by() for struct rtw89_btc_btf_set_slot_table wifi: rtw89: add EHT radiotap in monitor mode wifi: rtw89: show EHT rate in debugfs wifi: rtw89: parse TX EHT rate selected by firmware from RA C2H report wifi: rtw89: Add EHT rate mask as parameters of RA H2C command wifi: rtw89: parse EHT information from RX descriptor and PPDU status packet wifi: radiotap: add bandwidth definition of EHT U-SIG wifi: rtlwifi: use convenient list_count_nodes() wifi: p54: Annotate struct p54_cal_database with __counted_by wifi: brcmfmac: fweh: Add __counted_by for struct brcmf_fweh_queue_item and use struct_size() ... ==================== Link: https://lore.kernel.org/r/20231016143822.880D8C433C8@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-10-16 We've added 90 non-merge commits during the last 25 day(s) which contain a total of 120 files changed, 3519 insertions(+), 895 deletions(-). The main changes are: 1) Add missed stats for kprobes to retrieve the number of missed kprobe executions and subsequent executions of BPF programs, from Jiri Olsa. 2) Add cgroup BPF sockaddr hooks for unix sockets. The use case is for systemd to reimplement the LogNamespace feature which allows running multiple instances of systemd-journald to process the logs of different services, from Daan De Meyer. 3) Implement BPF CPUv4 support for s390x BPF JIT, from Ilya Leoshkevich. 4) Improve BPF verifier log output for scalar registers to better disambiguate their internal state wrt defaults vs min/max values matching, from Andrii Nakryiko. 5) Extend the BPF fib lookup helpers for IPv4/IPv6 to support retrieving the source IP address with a new BPF_FIB_LOOKUP_SRC flag, from Martynas Pumputis. 6) Add support for open-coded task_vma iterator to help with symbolization for BPF-collected user stacks, from Dave Marchevsky. 7) Add libbpf getters for accessing individual BPF ring buffers which is useful for polling them individually, for example, from Martin Kelly. 8) Extend AF_XDP selftests to validate the SHARED_UMEM feature, from Tushar Vyavahare. 9) Improve BPF selftests cross-building support for riscv arch, from Björn Töpel. 10) Add the ability to pin a BPF timer to the same calling CPU, from David Vernet. 11) Fix libbpf's bpf_tracing.h macros for riscv to use the generic implementation of PT_REGS_SYSCALL_REGS() to access syscall arguments, from Alexandre Ghiti. 12) Extend libbpf to support symbol versioning for uprobes, from Hengqi Chen. 13) Fix bpftool's skeleton code generation to guarantee that ELF data is 8 byte aligned, from Ian Rogers. 14) Inherit system-wide cpu_mitigations_off() setting for Spectre v1/v4 security mitigations in BPF verifier, from Yafang Shao. 15) Annotate struct bpf_stack_map with __counted_by attribute to prepare BPF side for upcoming __counted_by compiler support, from Kees Cook. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (90 commits) bpf: Ensure proper register state printing for cond jumps bpf: Disambiguate SCALAR register state output in verifier logs selftests/bpf: Make align selftests more robust selftests/bpf: Improve missed_kprobe_recursion test robustness selftests/bpf: Improve percpu_alloc test robustness selftests/bpf: Add tests for open-coded task_vma iter bpf: Introduce task_vma open-coded iterator kfuncs selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c bpf: Don't explicitly emit BTF for struct btf_iter_num bpf: Change syscall_nr type to int in struct syscall_tp_t net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set bpf: Avoid unnecessary audit log for CPU security mitigations selftests/bpf: Add tests for cgroup unix socket address hooks selftests/bpf: Make sure mount directory exists documentation/bpf: Document cgroup unix socket address hooks bpftool: Add support for cgroup unix socket address hooks libbpf: Add support for cgroup unix socket address hooks bpf: Implement cgroup sockaddr hooks for unix sockets bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf bpf: Propagate modified uaddrlen from cgroup sockaddr programs ... ==================== Link: https://lore.kernel.org/r/20231016204803.30153-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16page_pool: fragment API support for 32-bit arch with 64-bit DMAYunsheng Lin
Currently page_pool_alloc_frag() is not supported in 32-bit arch with 64-bit DMA because of the overlap issue between pp_frag_count and dma_addr_upper in 'struct page' for those arches, which seems to be quite common, see [1], which means driver may need to handle it when using fragment API. It is assumed that the combination of the above arch with an address space >16TB does not exist, as all those arches have 64b equivalent, it seems logical to use the 64b version for a system with a large address space. It is also assumed that dma address is page aligned when we are dma mapping a page aligned buffer, see [2]. That means we're storing 12 bits of 0 at the lower end for a dma address, we can reuse those bits for the above arches to support 32b+12b, which is 16TB of memory. If we make a wrong assumption, a warning is emitted so that user can report to us. 1. https://lore.kernel.org/all/20211117075652.58299-1-linyunsheng@huawei.com/ 2. https://lore.kernel.org/all/20230818145145.4b357c89@kernel.org/ Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Guillaume Tucker <guillaume.tucker@collabora.com> CC: Matthew Wilcox <willy@infradead.org> CC: Linux-MM <linux-mm@kvack.org> Link: https://lore.kernel.org/r/20231013064827.61135-2-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16Merge tag 'for-net-2023-10-13' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix race when opening vhci device - Avoid memcmp() out of bounds warning - Correctly bounds check and pad HCI_MON_NEW_INDEX name - Fix using memcmp when comparing keys - Ignore error return for hci_devcd_register() in btrtl - Always check if connection is alive before deleting - Fix a refcnt underflow problem for hci_conn * tag 'for-net-2023-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_sock: Correctly bounds check and pad HCI_MON_NEW_INDEX name Bluetooth: avoid memcmp() out of bounds warning Bluetooth: hci_sock: fix slab oob read in create_monitor_event Bluetooth: btrtl: Ignore error return for hci_devcd_register() Bluetooth: hci_event: Fix coding style Bluetooth: hci_event: Fix using memcmp when comparing keys Bluetooth: Fix a refcnt underflow problem for hci_conn Bluetooth: hci_sync: always check if connection is alive before deleting Bluetooth: Reject connection with the device which has same BD_ADDR Bluetooth: hci_event: Ignore NULL link key Bluetooth: ISO: Fix invalid context error Bluetooth: vhci: Fix race when opening vhci device ==================== Link: https://lore.kernel.org/r/20231014031336.1664558-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16net: stub tcp_gro_complete if CONFIG_INET=nJacob Keller
A few networking drivers including bnx2x, bnxt, qede, and idpf call tcp_gro_complete as part of offloading TCP GRO. The function is only defined if CONFIG_INET is true, since its TCP specific and is meaningless if the kernel lacks IP networking support. The combination of trying to use the complex network drivers with CONFIG_NET but not CONFIG_INET is rather unlikely in practice: most use cases are going to need IP networking. The tcp_gro_complete function just sets some data in the socket buffer for use in processing the TCP packet in the event that the GRO was offloaded to the device. If the kernel lacks TCP support, such setup will simply go unused. The bnx2x, bnxt, and qede drivers wrap their TCP offload support in CONFIG_INET checks and skip handling on such kernels. The idpf driver did not check CONFIG_INET and thus fails to link if the kernel is configured with CONFIG_NET=y, CONFIG_IDPF=(m|y), and CONFIG_INET=n. While checking CONFIG_INET does allow the driver to bypass significantly more instructions in the event that we know TCP networking isn't supported, the configuration is unlikely to be used widely. Rather than require driver authors to care about this, stub the tcp_gro_complete function when CONFIG_INET=n. This allows drivers to be left as-is. It does mean the idpf driver will perform slightly more work than strictly necessary when CONFIG_INET=n, since it will still execute some of the skb setup in idpf_rx_rsc. However, that work would be performed in the case where CONFIG_INET=y anyways. I did not change the existing drivers, since they appear to wrap a significant portion of code when CONFIG_INET=n. There is little benefit in trashing these drivers just to unwrap and remove the CONFIG_INET check. Using a stub for tcp_gro_complete is still beneficial, as it means future drivers no longer need to worry about this case of CONFIG_NET=y and CONFIG_INET=n, which should reduce noise from buildbots that check such a configuration. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20231013185502.1473541-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16tcp: Set pingpong threshold via sysctlHaiyang Zhang
TCP pingpong threshold is 1 by default. But some applications, like SQL DB may prefer a higher pingpong threshold to activate delayed acks in quick ack mode for better performance. The pingpong threshold and related code were changed to 3 in the year 2019 in: commit 4a41f453bedf ("tcp: change pingpong threshold to 3") And reverted to 1 in the year 2022 in: commit 4d8f24eeedc5 ("Revert "tcp: change pingpong threshold to 3"") There is no single value that fits all applications. Add net.ipv4.tcp_pingpong_thresh sysctl tunable, so it can be tuned for optimal performance based on the application needs. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/1697056244-21888-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16net, sched: Add tcf_set_drop_reason for {__,}tcf_classifyDaniel Borkmann
Add an initial user for the newly added tcf_set_drop_reason() helper to set the drop reason for internal errors leading to TC_ACT_SHOT inside {__,}tcf_classify(). Right now this only adds a very basic SKB_DROP_REASON_TC_ERROR as a generic fallback indicator to mark drop locations. Where needed, such locations can be converted to more specific codes, for example, when hitting the reclassification limit, etc. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Victor Nogueira <victor@mojatatu.com> Link: https://lore.kernel.org/r/20231009092655.22025-2-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16net, sched: Make tc-related drop reason more flexibleDaniel Borkmann
Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason. Victor kicked-off an initial proposal to make this more flexible by disambiguating verdict from return code by moving the verdict into struct tcf_result and letting tcf_classify() return a negative error. If hit, then two new drop reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual error codes would have required to attach to tcf_classify via kprobe/kretprobe to more deeply debug skb and the returned error. In order to make the kfree_skb_reason() in sch_handle_{ingress,egress}() more extensible, it can be addressed in a more straight forward way, that is: Instead of placing the verdict into struct tcf_result, we can just put the drop reason in there, which does not require changes throughout various classful schedulers given the existing verdict logic can stay as is. Then, SKB_DROP_REASON_TC_ERROR{,_*} can be added to the enum skb_drop_reason to disambiguate between an error or an intentional drop. New drop reason error codes can be added successively to the tc code base. For internal error locations which have not yet been annotated with a SKB_DROP_REASON_TC_ERROR{,_*}, the fallback is SKB_DROP_REASON_TC_INGRESS and SKB_DROP_REASON_TC_EGRESS, respectively. Generic errors could be marked with a SKB_DROP_REASON_TC_ERROR code until they are converted to more specific ones if it is found that they would be useful for troubleshooting. While drop reasons have infrastructure for subsystem specific error codes which are currently used by mac80211 and ovs, Jakub mentioned that it is preferred for tc to use the enum skb_drop_reason core codes given it is a better fit and currently the tooling support is better, too. With regards to the latter: [...] I think Alastair (bpftrace) is working on auto-prettifying enums when bpftrace outputs maps. So we can do something like: $ bpftrace -e 'tracepoint:skb:kfree_skb { @[args->reason] = count(); }' Attaching 1 probe... ^C @[SKB_DROP_REASON_TC_INGRESS]: 2 @[SKB_CONSUMED]: 34 ^^^^^^^^^^^^ names!! Auto-magically. [...] Add a small helper tcf_set_drop_reason() which can be used to set the drop reason into the tcf_result. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Victor Nogueira <victor@mojatatu.com> Link: https://lore.kernel.org/netdev/20231006063233.74345d36@kernel.org Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20231009092655.22025-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16ipv4: add new arguments to udp_tunnel_dst_lookup()Beniamino Galvani
We want to make the function more generic so that it can be used by other UDP tunnel implementations such as geneve and vxlan. To do that, add the following arguments: - source and destination UDP port; - ifindex of the output interface, needed by vxlan; - the tos, because in some cases it is not taken from struct ip_tunnel_info (for example, when it's inherited from the inner packet); - the dst cache, because not all tunnel types (e.g. vxlan) want to use the one from struct ip_tunnel_info. With these parameters, the function no longer needs the full struct ip_tunnel_info as argument and we can pass only the relevant part of it (struct ip_tunnel_key). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16ipv4: remove "proto" argument from udp_tunnel_dst_lookup()Beniamino Galvani
The function is now UDP-specific, the protocol is always IPPROTO_UDP. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-16ipv4: rename and move ip_route_output_tunnel()Beniamino Galvani
At the moment ip_route_output_tunnel() is used only by bareudp. Ideally, other UDP tunnel implementations should use it, but to do so the function needs to accept new parameters that are specific for UDP tunnels, such as the ports. Prepare for these changes by renaming the function to udp_tunnel_dst_lookup() and move it to file net/ipv4/udp_tunnel_core.c. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-15vsock: check for MSG_ZEROCOPY support on sendArseniy Krasnov
This feature totally depends on transport, so if transport doesn't support it, return error. Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13Bluetooth: hci_sock: Correctly bounds check and pad HCI_MON_NEW_INDEX nameKees Cook
The code pattern of memcpy(dst, src, strlen(src)) is almost always wrong. In this case it is wrong because it leaves memory uninitialized if it is less than sizeof(ni->name), and overflows ni->name when longer. Normally strtomem_pad() could be used here, but since ni->name is a trailing array in struct hci_mon_new_index, compilers that don't support -fstrict-flex-arrays=3 can't tell how large this array is via __builtin_object_size(). Instead, open-code the helper and use sizeof() since it will work correctly. Additionally mark ni->name as __nonstring since it appears to not be a %NUL terminated C string. Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Edward AD <twuufnxlz@gmail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Fixes: 18f547f3fc07 ("Bluetooth: hci_sock: fix slab oob read in create_monitor_event") Link: https://lore.kernel.org/lkml/202310110908.F2639D3276@keescook/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-10-13tcp: allow again tcp_disconnect() when threads are waitingPaolo Abeni
As reported by Tom, .NET and applications build on top of it rely on connect(AF_UNSPEC) to async cancel pending I/O operations on TCP socket. The blamed commit below caused a regression, as such cancellation can now fail. As suggested by Eric, this change addresses the problem explicitly causing blocking I/O operation to terminate immediately (with an error) when a concurrent disconnect() is executed. Instead of tracking the number of threads blocked on a given socket, track the number of disconnect() issued on such socket. If such counter changes after a blocking operation releasing and re-acquiring the socket lock, error out the current operation. Fixes: 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting") Reported-by: Tom Deseyn <tdeseyn@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886305 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/f3b95e47e3dbed840960548aebaa8d954372db41.1697008693.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13tls: use fixed size for tls_offload_context_{tx,rx}.driver_stateSabrina Dubroca
driver_state is a flex array, but is always allocated by the tls core to a fixed size (TLS_DRIVER_STATE_SIZE_{TX,RX}). Simplify the code by making that size explicit so that sizeof(struct tls_offload_context_{tx,rx}) works. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: store iv directly within cipher_contextSabrina Dubroca
TLS_MAX_IV_SIZE + TLS_MAX_SALT_SIZE is 20B, we don't get much benefit in cipher_context's size and can simplify the init code a bit. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: rename MAX_IV_SIZE to TLS_MAX_IV_SIZESabrina Dubroca
It's defined in include/net/tls.h, avoid using an overly generic name. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: store rec_seq directly within cipher_contextSabrina Dubroca
TLS_MAX_REC_SEQ_SIZE is 8B, we don't get anything by using kmalloc. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: kernel/bpf/verifier.c 829955981c55 ("bpf: Fix verifier log for async callback return values") a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12wifi: radiotap: add bandwidth definition of EHT U-SIGPing-Ke Shih
Define EHT U-SIG bandwidth used by radiotap according to Table 36-28 "U-SIG field of an EHT MU PPDU" in 802.11be (D3.0). Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-2-pkshih@realtek.com
2023-10-10netfilter: cleanup struct nft_tableGeorge Guo
Add comments for nlpid, family, udlen and udata in struct nft_table, and afinfo is no longer a member of struct nft_table, so remove the comment for it. Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: conntrack: simplify nf_conntrack_alter_replyFlorian Westphal
nf_conntrack_alter_reply doesn't do helper reassignment anymore. Remove the comments that make this claim. Furthermore, remove dead code from the function and place ot in nf_conntrack.h. Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10net: macsec: indicate next pn update when offloadingRadu Pirea (NXP OSS)
Indicate next PN update using update_pn flag in macsec_context. Offloaded MACsec implementations does not know whether or not the MACSEC_SA_ATTR_PN attribute was passed for an SA update and assume that next PN should always updated, but this is not always true. The PN can be reset to its initial value using the following command: $ ip macsec set macsec0 tx sa 0 off #octeontx2-pf case Or, the update PN command will succeed even if the driver does not support PN updates. $ ip macsec set macsec0 tx sa 0 pn 1 on #mscc phy driver case Comparing the initial PN with the new PN value is not a solution. When the user updates the PN using its initial value the command will succeed, even if the driver does not support it. Like this: $ ip macsec add macsec0 tx sa 0 pn 1 on key 00 \ ead3664f508eb06c40ac7104cdae4ce5 $ ip macsec set macsec0 tx sa 0 pn 1 on #mlx5 case Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10tcp: record last received ipv6 flowlabelDavid Morley
In order to better estimate whether a data packet has been retransmitted or is the result of a TLP, we save the last received ipv6 flowlabel. To make space for this field we resize the "ato" field in inet_connection_sock as the current value of TCP_DELACK_MAX can be fully contained in 8 bits and add a compile_time_assert ensuring this field is the required size. v2: addressed kernel bot feedback about dccp_delack_timer() v3: addressed build error introduced by commit bbf80d713fe7 ("tcp: derive delack_max from rto_min") Signed-off-by: David Morley <morleyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Tested-by: David Morley <morleyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-09bpf: Derive source IP addr via bpf_*_fib_lookup()Martynas Pumputis
Extend the bpf_fib_lookup() helper by making it to return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when a public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-06Merge tag 'wireless-next-2023-10-06' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.7 The first pull request for v6.7, with both stack and driver changes. We have a big change how locking is handled in cfg80211 and mac80211 which removes several locks and hopefully simplifies the locking overall. In drivers rtw89 got MCC support and smaller features to other active drivers but nothing out of ordinary. Major changes: cfg80211 - remove wdev mutex, use the wiphy mutex instead - annotate iftype_data pointer with sparse - first kunit tests, for element defrag - remove unused scan_width support mac80211 - major locking rework, remove several locks like sta_mtx, key_mtx etc. and use the wiphy mutex instead - remove unused shifted rate support - support antenna control in frame injection (requires driver support) - convert RX_DROP_UNUSABLE to more detailed reason codes rtw89 - TDMA-based multi-channel concurrency (MCC) support iwlwifi - support set_antenna() operation - support frame injection antenna control ath12k - WCN7850: enable 320 MHz channels in 6 GHz band - WCN7850: hardware rfkill support - WCN7850: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to make scan faster ath11k - add chip id board name while searching board-2.bin * tag 'wireless-next-2023-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (272 commits) wifi: rtlwifi: remove unreachable code in rtl92d_dm_check_edca_turbo() wifi: rtw89: debug: txpwr table supports Wi-Fi 7 chips wifi: rtw89: debug: show txpwr table according to chip gen wifi: rtw89: phy: set TX power RU limit according to chip gen wifi: rtw89: phy: set TX power limit according to chip gen wifi: rtw89: phy: set TX power offset according to chip gen wifi: rtw89: phy: set TX power by rate according to chip gen wifi: rtw89: mac: get TX power control register according to chip gen wifi: rtlwifi: use unsigned long for rtl_bssid_entry timestamp wifi: rtlwifi: fix EDCA limit set by BT coexistence wifi: rt2x00: fix MT7620 low RSSI issue wifi: rtw89: refine bandwidth 160MHz uplink OFDMA performance wifi: rtw89: refine uplink trigger based control mechanism wifi: rtw89: 8851b: update TX power tables to R34 wifi: rtw89: 8852b: update TX power tables to R35 wifi: rtw89: 8852c: update TX power tables to R67 wifi: rtw89: regd: configure Thailand in regulation type wifi: mac80211: add back SPDX identifier wifi: mac80211: fix ieee80211_drop_unencrypted_mgmt return type/value wifi: rtlwifi: cleanup few rtlxxxx_set_hw_reg() routines ... ==================== Link: https://lore.kernel.org/r/87jzrz6bvw.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-06Merge wireless into wireless-nextJohannes Berg
Resolve several conflicts, mostly between changes/fixes in wireless and the locking rework in wireless-next. One of the conflicts actually shows a bug in wireless that we'll want to fix separately. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org>
2023-10-06flow_offload: Annotate struct flow_action_entry with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct flow_action_entry. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-06nexthop: Annotate struct nh_group with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nh_group. Cc: David Ahern <dsahern@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-06nexthop: Annotate struct nh_notifier_grp_info with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nh_notifier_grp_info. Cc: David Ahern <dsahern@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: netdev@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-05nexthop: Annotate struct nh_notifier_res_table_info with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nh_notifier_res_table_info. Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231003231818.work.883-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05nexthop: Annotate struct nh_res_table with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nh_res_table. Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231003231813.work.042-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05net_sched: export pfifo_fast prio2band[]Eric Dumazet
pfifo_fast prio2band[] is renamed to sch_default_prio2band[] and exported because we want to share it in FQ. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Dave Taht <dave.taht@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05net: mana: Fix oversized sge0 for GSO packetsHaiyang Zhang
Handle the case when GSO SKB linear length is too large. MANA NIC requires GSO packets to put only the header part to SGE0, otherwise the TX queue may stop at the HW level. So, use 2 SGEs for the skb linear part which contains more than the packet header. Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-04tcp: fix quick-ack counting to count actual ACKs of new dataNeal Cardwell
This commit fixes quick-ack counting so that it only considers that a quick-ack has been provided if we are sending an ACK that newly acknowledges data. The code was erroneously using the number of data segments in outgoing skbs when deciding how many quick-ack credits to remove. This logic does not make sense, and could cause poor performance in request-response workloads, like RPC traffic, where requests or responses can be multi-segment skbs. When a TCP connection decides to send N quick-acks, that is to accelerate the cwnd growth of the congestion control module controlling the remote endpoint of the TCP connection. That quick-ack decision is purely about the incoming data and outgoing ACKs. It has nothing to do with the outgoing data or the size of outgoing data. And in particular, an ACK only serves the intended purpose of allowing the remote congestion control to grow the congestion window quickly if the ACK is ACKing or SACKing new data. The fix is simple: only count packets as serving the goal of the quickack mechanism if they are ACKing/SACKing new data. We can tell whether this is the case by checking inet_csk_ack_scheduled(), since we schedule an ACK exactly when we are ACKing/SACKing new data. Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.") Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04page_pool: fix documentation typosRandy Dunlap
Correct grammar for better readability. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231001003846.29541-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04net: appletalk: remove cops supportGreg Kroah-Hartman
The COPS Appletalk support is very old, never said to actually work properly, and the firmware code for the devices are under a very suspect license. Remove it all to clear up the license issue, if it is still needed and actually used by anyone, we can add it back later once the license is cleared up. Reported-by: Prarit Bhargava <prarit@redhat.com> Cc: jschlst@samba.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20230927090029.44704-2-gregkh@linuxfoundation.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge tag 'wireless-2023-09-27' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Quite a collection of fixes this time, really too many to list individually. Many stack fixes, even rfkill (found by simulation and the new eevdf scheduler)! Also a bigger maintainers file cleanup, to remove old and redundant information. * tag 'wireless-2023-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (32 commits) wifi: iwlwifi: mvm: Fix incorrect usage of scan API wifi: mac80211: Create resources for disabled links wifi: cfg80211: avoid leaking stack data into trace wifi: mac80211: allow transmitting EAPOL frames with tainted key wifi: mac80211: work around Cisco AP 9115 VHT MPDU length wifi: cfg80211: Fix 6GHz scan configuration wifi: mac80211: fix potential key leak wifi: mac80211: fix potential key use-after-free wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling wifi: brcmfmac: Replace 1-element arrays with flexible arrays wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet wifi: rtw88: rtw8723d: Fix MAC address offset in EEPROM rfkill: sync before userspace visibility/changes wifi: mac80211: fix mesh id corruption on 32 bit systems wifi: cfg80211: add missing kernel-doc for cqm_rssi_work wifi: cfg80211: fix cqm_config access race wifi: iwlwifi: mvm: Fix a memory corruption issue wifi: iwlwifi: Ensure ack flag is properly cleared. wifi: iwlwifi: dbg_ini: fix structure packing iwlwifi: mvm: handle PS changes in vif_cfg_changed ... ==================== Link: https://lore.kernel.org/r/20230927095835.25803-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-03net: dsa: notify drivers of MAC address changes on user portsVladimir Oltean
In some cases, drivers may need to veto the changing of a MAC address on a user port. Such is the case with KSZ9477 when it offloads a HSR device, because it programs the MAC address of multiple ports to a shared hardware register. Those ports need to have equal MAC addresses for the lifetime of the HSR offload. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03net: dsa: propagate extack to ds->ops->port_hsr_join()Vladimir Oltean
Drivers can provide meaningful error messages which state a reason why they can't perform an offload, and dsa_slave_changeupper() already has the infrastructure to propagate these over netlink rather than printing to the kernel log. So pass the extack argument and modify the xrs700x driver's port_hsr_join() prototype. Also take the opportunity and use the extack for the 2 -EOPNOTSUPP cases from xrs700x_hsr_join(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03ipv6: mark address parameters of udp_tunnel6_xmit_skb() as constBeniamino Galvani
The function doesn't modify the addresses passed as input, mark them as 'const' to make that clear. Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/20230924153014.786962-1-b.galvani@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-03ipv4/fib: send notify when delete source address routesHangbin Liu
After deleting an interface address in fib_del_ifaddr(), the function scans the fib_info list for stray entries and calls fib_flush() and fib_table_flush(). Then the stray entries will be deleted silently and no RTM_DELROUTE notification will be sent. This lack of notification can make routing daemons, or monitor like `ip monitor route` miss the routing changes. e.g. + ip link add dummy1 type dummy + ip link add dummy2 type dummy + ip link set dummy1 up + ip link set dummy2 up + ip addr add 192.168.5.5/24 dev dummy1 + ip route add 7.7.7.0/24 dev dummy2 src 192.168.5.5 + ip -4 route 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 + ip monitor route + ip addr del 192.168.5.5/24 dev dummy1 Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5 Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5 As Ido reminded, fib_table_flush() isn't only called when an address is deleted, but also when an interface is deleted or put down. The lack of notification in these cases is deliberate. And commit 7c6bb7d2faaf ("net/ipv6: Add knob to skip DELROUTE message on device down") introduced a sysctl to make IPv6 behave like IPv4 in this regard. So we can't send the route delete notify blindly in fib_table_flush(). To fix this issue, let's add a new flag in "struct fib_info" to track the deleted prefer source address routes, and only send notify for them. After update: + ip monitor route + ip addr del 192.168.5.5/24 dev dummy1 Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5 Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5 Deleted 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5 Suggested-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230922075508.848925-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-02net: mana: Annotate struct hwc_dma_buf with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct hwc_dma_buf. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Long Li <longli@microsoft.com> Cc: Ajay Sharma <sharmaajay@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230922172858.3822653-9-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>