summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2023-10-13net: fix IPSTATS_MIB_OUTFORWDATAGRAMS increment after fragment checkHeng Guo
Reproduce environment: network with 3 VM linuxs is connected as below: VM1<---->VM2(latest kernel 6.5.0-rc7)<---->VM3 VM1: eth0 ip: 192.168.122.207 MTU 1800 VM2: eth0 ip: 192.168.122.208, eth1 ip: 192.168.123.224 MTU 1500 VM3: eth0 ip: 192.168.123.240 MTU 1800 Reproduce: VM1 send 1600 bytes UDP data to VM3 using tools scapy with flags='DF'. scapy command: send(IP(dst="192.168.123.240",flags='DF')/UDP()/str('0'*1600),count=1, inter=1.000000) Result: Before IP data is sent. ---------------------------------------------------------------------- root@qemux86-64:~# cat /proc/net/snmp Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss Ip: 1 64 6 0 2 2 0 0 2 4 0 0 0 0 0 0 0 0 0 ...... root@qemux86-64:~# ---------------------------------------------------------------------- After IP data is sent. ---------------------------------------------------------------------- root@qemux86-64:~# cat /proc/net/snmp Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss Ip: 1 64 7 0 2 2 0 0 2 5 0 0 0 0 0 0 0 1 0 ...... root@qemux86-64:~# ---------------------------------------------------------------------- ForwDatagrams is always keeping 2 without increment. Issue description and patch: ip_exceeds_mtu() in ip_forward() drops this IP datagram because skb len (1600 sending by scapy) is over MTU(1500 in VM2) if "DF" is set. According to RFC 4293 "3.2.3. IP Statistics Tables", +-------+------>------+----->-----+----->-----+ | InForwDatagrams (6) | OutForwDatagrams (6) | | V +->-+ OutFragReqds | InNoRoutes | | (packets) / (local packet (3) | | | IF is that of the address | +--> OutFragFails | and may not be the receiving IF) | | (packets) the IPSTATS_MIB_OUTFORWDATAGRAMS should be counted before fragment check. The existing implementation, instead, would incease the counter after fragment check: ip_exceeds_mtu() in ipv4 and ip6_pkt_too_big() in ipv6. So do patch to move IPSTATS_MIB_OUTFORWDATAGRAMS counter to ip_forward() for ipv4 and ip6_forward() for ipv6. Test result with patch: Before IP data is sent. ---------------------------------------------------------------------- root@qemux86-64:~# cat /proc/net/snmp Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss Ip: 1 64 6 0 2 2 0 0 2 4 0 0 0 0 0 0 0 0 0 ...... root@qemux86-64:~# ---------------------------------------------------------------------- After IP data is sent. ---------------------------------------------------------------------- root@qemux86-64:~# cat /proc/net/snmp Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqdss Ip: 1 64 7 0 2 3 0 0 2 5 0 0 0 0 0 0 0 1 0 ...... root@qemux86-64:~# ---------------------------------------------------------------------- ForwDatagrams is updated from 2 to 3. Reviewed-by: Filip Pudak <filip.pudak@windriver.com> Signed-off-by: Heng Guo <heng.guo@windriver.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231011015137.27262-1-heng.guo@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13tls: use fixed size for tls_offload_context_{tx,rx}.driver_stateSabrina Dubroca
driver_state is a flex array, but is always allocated by the tls core to a fixed size (TLS_DRIVER_STATE_SIZE_{TX,RX}). Simplify the code by making that size explicit so that sizeof(struct tls_offload_context_{tx,rx}) works. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: validate crypto_info in a separate helperSabrina Dubroca
Simplify do_tls_setsockopt_conf a bit. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: remove tls_context argument from tls_set_device_offloadSabrina Dubroca
It's not really needed since we end up refetching it as tls_ctx. We can also remove the NULL check, since we have already dereferenced ctx in do_tls_setsockopt_conf. While at it, fix up the reverse xmas tree ordering. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: remove tls_context argument from tls_set_sw_offloadSabrina Dubroca
It's not really needed since we end up refetching it as tls_ctx. We can also remove the NULL check, since we have already dereferenced ctx in do_tls_setsockopt_conf. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: add a helper to allocate/initialize offload_ctx_txSabrina Dubroca
Simplify tls_set_device_offload a bit. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: also use init_prot_info in tls_set_device_offloadSabrina Dubroca
Most values are shared. Nonce size turns out to be equal to IV size for all offloadable ciphers. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: move tls_prot_info initialization out of tls_set_sw_offloadSabrina Dubroca
Simplify tls_set_sw_offload, and allow reuse for the tls_device code. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: extract context alloc/initialization out of tls_set_sw_offloadSabrina Dubroca
Simplify tls_set_sw_offload a bit. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: store iv directly within cipher_contextSabrina Dubroca
TLS_MAX_IV_SIZE + TLS_MAX_SALT_SIZE is 20B, we don't get much benefit in cipher_context's size and can simplify the init code a bit. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: rename MAX_IV_SIZE to TLS_MAX_IV_SIZESabrina Dubroca
It's defined in include/net/tls.h, avoid using an overly generic name. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: store rec_seq directly within cipher_contextSabrina Dubroca
TLS_MAX_REC_SEQ_SIZE is 8B, we don't get anything by using kmalloc. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: drop unnecessary cipher_type checks in tls offloadSabrina Dubroca
We should never reach tls_device_reencrypt, tls_enc_record, or tls_enc_skb with a cipher_type that can't be offloaded. Replace those checks with a DEBUG_NET_WARN_ON_ONCE, and use cipher_desc instead of hard-coding offloadable cipher types. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13tls: get salt using crypto_info_salt in tls_enc_skbSabrina Dubroca
I skipped this conversion in my previous series. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-13net: Handle bulk delete policy in bridge driverAmit Cohen
The merge commit 92716869375b ("Merge branch 'br-flush-filtering'") added support for FDB flushing in bridge driver. The following patches will extend VXLAN driver to support FDB flushing as well. The netlink message for bulk delete is shared between the drivers. With the existing implementation, there is no way to prevent user from flushing with attributes that are not supported per driver. For example, when VNI will be added, user will not get an error for flush FDB entries in bridge with VNI, although this attribute is not relevant for bridge. As preparation for support of FDB flush in VXLAN driver, move the policy to be handled in bridge driver, later a new policy for VXLAN will be added in VXLAN driver. Do not pass 'vid' as part of ndo_fdb_del_bulk(), as this field is relevant only for bridge. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: kernel/bpf/verifier.c 829955981c55 ("bpf: Fix verifier log for async callback return values") a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12net: gso_test: fix build with gcc-12 and earlierFlorian Westphal
gcc 12 errors out with: net/core/gso_test.c:58:48: error: initializer element is not constant 58 | .segs = (const unsigned int[]) { gso_size }, This version isn't old (2022), so switch to preprocessor-bsaed constant instead of 'static const int'. Cc: Willem de Bruijn <willemb@google.com> Reported-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com> Closes: https://lore.kernel.org/netdev/79fbe35c-4dd1-4f27-acb2-7a60794bc348@linux.vnet.ibm.com/ Fixes: 1b4fa28a8b07 ("net: parametrize skb_segment unit test to expand coverage") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231012120901.10765-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12nfc: nci: assert requested protocol is validJeremy Cline
The protocol is used in a bit mask to determine if the protocol is supported. Assert the provided protocol is less than the maximum defined so it doesn't potentially perform a shift-out-of-bounds and provide a clearer error for undefined protocols vs unsupported ones. Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reported-and-tested-by: syzbot+0839b78e119aae1fec78@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0839b78e119aae1fec78 Signed-off-by: Jeremy Cline <jeremy@jcline.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231009200054.82557-1-jeremy@jcline.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12af_packet: Fix fortified memcpy() without flex array.Kuniyuki Iwashima
Sergei Trofimovich reported a regression [0] caused by commit a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname()."). It introduced a flex array sll_addr_flex in struct sockaddr_ll as a union-ed member with sll_addr to work around the fortified memcpy() check. However, a userspace program uses a struct that has struct sockaddr_ll in the middle, where a flex array is illegal to exist. include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t' 24 | __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex); | ^~~~~~~~~~~~~~~~~~~~ To fix the regression, let's go back to the first attempt [1] telling memcpy() the actual size of the array. Reported-by: Sergei Trofimovich <slyich@gmail.com> Closes: https://github.com/NixOS/nixpkgs/pull/252587#issuecomment-1741733002 [0] Link: https://lore.kernel.org/netdev/20230720004410.87588-3-kuniyu@amazon.com/ [1] Fixes: a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname().") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20231009153151.75688-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-11Merge tag 'nf-next-23-10-10' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter updates for next First 5 patches, from Phil Sutter, clean up nftables dumpers to use the context buffer in the netlink_callback structure rather than a kmalloc'd buffer. Patch 6, from myself, zaps dead code and replaces the helper function with a small inlined helper. Patch 7, also from myself, removes another pr_debug and replaces it with the existing nf_log-based debug helpers. Last patch, from George Guo, gets nft_table comments back in sync with the structure members. * tag 'nf-next-23-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: cleanup struct nft_table netfilter: conntrack: prefer tcp_error_log to pr_debug netfilter: conntrack: simplify nf_conntrack_alter_reply netfilter: nf_tables: Don't allocate nft_rule_dump_ctx netfilter: nf_tables: Carry s_idx in nft_rule_dump_ctx netfilter: nf_tables: Carry reset flag in nft_rule_dump_ctx netfilter: nf_tables: Drop pointless memset when dumping rules netfilter: nf_tables: Always allocate nft_rule_dump_ctx ==================== Link: https://lore.kernel.org/r/20231010145343.12551-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: tcp: fix crashes trying to free half-baked MTU probesJakub Kicinski
tcp_stream_alloc_skb() initializes the skb to use tcp_tsorted_anchor which is a union with the destructor. We need to clean that TCP-iness up before freeing. Fixes: 736013292e3c ("tcp: let tcp_mtu_probe() build headless packets") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231010173651.3990234-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: expand skb_segment unit test with frag_list coverageWillem de Bruijn
Expand the test with these variants that use skb frag_list: - GSO_TEST_FRAG_LIST: frag_skb length is gso_size - GSO_TEST_FRAG_LIST_PURE: same, data exclusively in frag skbs - GSO_TEST_FRAG_LIST_NON_UNIFORM: frag_skb length may vary - GSO_TEST_GSO_BY_FRAGS: frag_skb length defines gso_size, i.e., segs may have varying sizes. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: parametrize skb_segment unit test to expand coverageWillem de Bruijn
Expand the test with variants - GSO_TEST_NO_GSO: payload size less than or equal to gso_size - GSO_TEST_FRAGS: payload in both linear and page frags - GSO_TEST_FRAGS_PURE: payload exclusively in page frags - GSO_TEST_GSO_PARTIAL: produce one gso segment of multiple of gso_size, plus optionally one non-gso trailer segment Define a test struct that encodes the input gso skb and output segs. Input in terms of linear and fragment lengths. Output as length of each segment. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: add skb_segment kunit testWillem de Bruijn
Add unit testing for skb segment. This function is exercised by many different code paths, such as GSO_PARTIAL or GSO_BY_FRAGS, linear (with or without head_frag), frags or frag_list skbs, etc. It is infeasible to manually run tests that cover all code paths when making changes. The long and complex function also makes it hard to establish through analysis alone that a patch has no unintended side-effects. Add code coverage through kunit regression testing. Introduce kunit infrastructure for tests under net/core, and add this first test. This first skb_segment test exercises a simple case: a linear skb. Follow-on patches will parametrize the test and add more variants. Tested: Built and ran the test with make ARCH=um mrproper ./tools/testing/kunit/kunit.py run \ --kconfig_add CONFIG_NET=y \ --kconfig_add CONFIG_DEBUG_KERNEL=y \ --kconfig_add CONFIG_DEBUG_INFO=y \ --kconfig_add=CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y \ net_core_gso Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net/smc: Fix pos miscalculation in statisticsNils Hoppmann
SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) will calculate wrong bucket positions for payloads of exactly 4096 bytes and (1 << (m + 12)) bytes, with m == SMC_BUF_MAX - 1. Intended bucket distribution: Assume l == size of payload, m == SMC_BUF_MAX - 1. Bucket 0 : 0 < l <= 2^13 Bucket n, 1 <= n <= m-1 : 2^(n+12) < l <= 2^(n+13) Bucket m : l > 2^(m+12) Current solution: _pos = fls64((l) >> 13) [...] _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m For l == 4096, _pos == -1, but should be _pos == 0. For l == (1 << (m + 12)), _pos == m, but should be _pos == m - 1. In order to avoid special treatment of these corner cases, the calculation is adjusted. The new solution first subtracts the length by one, and then calculates the correct bucket by shifting accordingly, i.e. _pos = fls64((l - 1) >> 13), l > 0. This not only fixes the issues named above, but also makes the whole bucket assignment easier to follow. Same is done for SMC_STAT_RMB_SIZE_SUB(_smc_stats, _tech, k, _len), where the calculation of the bucket position is similar to the one named above. Fixes: e0e4b8fa5338 ("net/smc: Add SMC statistics support") Suggested-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Nils Hoppmann <niho@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net/core: Introduce netdev_core_stats_inc()Yajun Deng
Although there is a kfree_skb_reason() helper function that can be used to find the reason why this skb is dropped, but most callers didn't increase one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. For the users, people are more concerned about why the dropped in ip is increasing. Introduce netdev_core_stats_inc() for trace the caller of dev_core_stats_*_inc(). Also, add __code to netdev_core_stats_alloc(), as it's called with small probability. And add noinline make sure netdev_core_stats_inc was never inlined. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: dsa: remove dsa_port_phylink_validate()Russell King (Oracle)
As all drivers now provide phylink capabilities (including MAC), the if() condition in dsa_port_phylink_validate() will always be true. We will always use the generic validator, which phylink will call itself if the .validate method isn't populated. Thus, there is now no need to implement the .validate method, so this implementation can be removed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-10Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-10-11 We've added 14 non-merge commits during the last 5 day(s) which contain a total of 12 files changed, 398 insertions(+), 104 deletions(-). The main changes are: 1) Fix s390 JIT backchain issues in the trampoline code generation which previously clobbered the caller's backchain, from Ilya Leoshkevich. 2) Fix zero-size allocation warning in xsk sockets when the configured ring size was close to SIZE_MAX, from Andrew Kanner. 3) Fixes for bpf_mprog API that were found when implementing support in the ebpf-go library along with selftests, from Daniel Borkmann and Lorenz Bauer. 4) Fix riscv JIT to properly sign-extend the return register in programs. This fixes various test_progs selftests on riscv, from Björn Töpel. 5) Fix verifier log for async callback return values where the allowed range was displayed incorrectly, from David Vernet. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: s390/bpf: Fix unwinding past the trampoline s390/bpf: Fix clobbering the caller's backchain in the trampoline selftests/bpf: Add testcase for async callback return value failure bpf: Fix verifier log for async callback return values xdp: Fix zero-size allocation warning in xskq_create() riscv, bpf: Track both a0 (RISC-V ABI) and a5 (BPF) return values riscv, bpf: Sign-extend return values selftests/bpf: Make seen_tc* variable tests more robust selftests/bpf: Test query on empty mprog and pass revision into attach selftests/bpf: Adapt assert_mprog_count to always expect 0 count selftests/bpf: Test bpf_mprog query API via libbpf and raw syscall bpf: Refuse unused attributes in bpf_prog_{attach,detach} bpf: Handle bpf_mprog_query with NULL entry bpf: Fix BPF_PROG_QUERY last field check ==================== Link: https://lore.kernel.org/r/20231010223610.3984-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10ethtool: Fix mod state of verbose no_mask bitsetKory Maincent
A bitset without mask in a _SET request means we want exactly the bits in the bitset to be set. This works correctly for compact format but when verbose format is parsed, ethnl_update_bitset32_verbose() only sets the bits present in the request bitset but does not clear the rest. The commit 6699170376ab fixes this issue by clearing the whole target bitmap before we start iterating. The solution proposed brought an issue with the behavior of the mod variable. As the bitset is always cleared the old val will always differ to the new val. Fix it by adding a new temporary variable which save the state of the old bitmap. Fixes: 6699170376ab ("ethtool: fix application of verbose no_mask bitset") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231009133645.44503-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10Merge tag 'linux-can-fixes-for-6.6-20231009' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2023-10-09 Lukas Magel's patch for the CAN ISO-TP protocol fixes the TX state detection and wait behavior. John Watts contributes a patch to only show the sun4i_can Kconfig option on ARCH_SUNXI. A patch by Miquel Raynal fixes the soft-reset workaround for Renesas SoCs in the sja1000 driver. Markus Schneider-Pargmann's patch for the tcan4x5x m_can glue driver fixes the id2 register for the tcan4553. 2 patches by Haibo Chen fix the flexcan stop mode for the imx93 SoC. * tag 'linux-can-fixes-for-6.6-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: tcan4x5x: Fix id2_register for tcan4553 can: flexcan: remove the auto stop mode for IMX93 can: sja1000: Always restart the Tx queue after an overrun arm64: dts: imx93: add the Flex-CAN stop mode by GPR can: sun4i_can: Only show Kconfig if ARCH_SUNXI is set can: isotp: isotp_sendmsg(): fix TX state detection and wait behavior ==================== Link: https://lore.kernel.org/r/20231009085256.693378-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10net: nfc: fix races in nfc_llcp_sock_get() and nfc_llcp_sock_get_sn()Eric Dumazet
Sili Luo reported a race in nfc_llcp_sock_get(), leading to UAF. Getting a reference on the socket found in a lookup while holding a lock should happen before releasing the lock. nfc_llcp_sock_get_sn() has a similar problem. Finally nfc_llcp_recv_snl() needs to make sure the socket found by nfc_llcp_sock_from_sn() does not disappear. Fixes: 8f50020ed9b8 ("NFC: LLCP late binding") Reported-by: Sili Luo <rootlab@huawei.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20231009123110.3735515-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10mctp: perform route lookups under a RCU read-side lockJeremy Kerr
Our current route lookups (mctp_route_lookup and mctp_route_lookup_null) traverse the net's route list without the RCU read lock held. This means the route lookup is subject to preemption, resulting in an potential grace period expiry, and so an eventual kfree() while we still have the route pointer. Add the proper read-side critical section locks around the route lookups, preventing premption and a possible parallel kfree. The remaining net->mctp.routes accesses are already under a rcu_read_lock, or protected by the RTNL for updates. Based on an analysis from Sili Luo <rootlab@huawei.com>, where introducing a delay in the route lookup could cause a UAF on simultaneous sendmsg() and route deletion. Reported-by: Sili Luo <rootlab@huawei.com> Fixes: 889b7da23abf ("mctp: Add initial routing framework") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/29c4b0e67dc1bf3571df3982de87df90cae9b631.1696837310.git.jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10appletalk: remove ipddp driverArnd Bergmann
After the cops driver is removed, ipddp is now the only CONFIG_DEV_APPLETALK but as far as I can tell, this also has no users and can be removed, making appletalk support purely based on ethertalk, using ethernet hardware. Link: https://lore.kernel.org/netdev/e490dd0c-a65d-4acf-89c6-c06cb48ec880@app.fastmail.com/ Link: https://lore.kernel.org/netdev/9cac4fbd-9557-b0b8-54fa-93f0290a6fb8@schmorgal.com/ Cc: Doug Brown <doug@schmorgal.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20231009141139.1766345-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10netfilter: conntrack: prefer tcp_error_log to pr_debugFlorian Westphal
pr_debug doesn't provide any information other than that a packet did not match existing state but also was found to not create a new connection. Replaces this with tcp_error_log, which will also dump packets' content so one can see if this is a stray FIN or RST. Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: conntrack: simplify nf_conntrack_alter_replyFlorian Westphal
nf_conntrack_alter_reply doesn't do helper reassignment anymore. Remove the comments that make this claim. Furthermore, remove dead code from the function and place ot in nf_conntrack.h. Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: nf_tables: Don't allocate nft_rule_dump_ctxPhil Sutter
Since struct netlink_callback::args is not used by rule dumpers anymore, use it to hold nft_rule_dump_ctx. Add a build-time check to make sure it won't ever exceed the available space. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: nf_tables: Carry s_idx in nft_rule_dump_ctxPhil Sutter
In order to move the context into struct netlink_callback's scratch area, the latter must be unused first. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: nf_tables: Carry reset flag in nft_rule_dump_ctxPhil Sutter
This relieves the dump callback from having to check nlmsg_type upon each call and instead performs the check once in .start callback. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: nf_tables: Drop pointless memset when dumping rulesPhil Sutter
None of the dump callbacks uses netlink_callback::args beyond the first element, no need to zero the data. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10netfilter: nf_tables: Always allocate nft_rule_dump_ctxPhil Sutter
It will move into struct netlink_callback's scratch area later, just put nf_tables_dump_rules_start in shape to reduce churn later. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-10net/smc: Fix dependency of SMC on ISMGerd Bayer
When the SMC protocol is built into the kernel proper while ISM is configured to be built as module, linking the kernel fails due to unresolved dependencies out of net/smc/smc_ism.o to ism_get_smcd_ops, ism_register_client, and ism_unregister_client as reported via the linux-next test automation (see link). This however is a bug introduced a while ago. Correct the dependency list in ISM's and SMC's Kconfig to reflect the dependencies that are actually inverted. With this you cannot build a kernel with CONFIG_SMC=y and CONFIG_ISM=m. Either ISM needs to be 'y', too - or a 'n'. That way, SMC can still be configured on non-s390 architectures that do not have (nor need) an ISM driver. Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/linux-next/d53b5b50-d894-4df8-8969-fd39e63440ae@infradead.org/ Co-developed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20231006125847.1517840-1-gbayer@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10tcp: change data receiver flowlabel after one dupDavid Morley
This commit changes the data receiver repath behavior to occur after receiving a single duplicate. This can help recover ACK connectivity quicker if a TLP was sent along a nonworking path. For instance, consider the case where we have an initially nonworking forward path and reverse path and subsequently switch to only working forward paths. Before this patch we would have the following behavior. +---------+--------+--------+----------+----------+----------+ | Event | For FL | Rev FL | FP Works | RP Works | Data Del | +---------+--------+--------+----------+----------+----------+ | Initial | A | 1 | N | N | 0 | +---------+--------+--------+----------+----------+----------+ | TLP | A | 1 | N | N | 0 | +---------+--------+--------+----------+----------+----------+ | RTO 1 | B | 1 | Y | N | 1 | +---------+--------+--------+----------+----------+----------+ | RTO 2 | C | 1 | Y | N | 2 | +---------+--------+--------+----------+----------+----------+ | RTO 3 | D | 2 | Y | Y | 3 | +---------+--------+--------+----------+----------+----------+ This patch gets rid of at least RTO 3, avoiding additional unnecessary repaths of a working forward path to a (potentially) nonworking one. In addition, this commit changes the behavior to avoid repathing upon rx of duplicate data if the local endpoint is in CA_Loss (in which case the RTOs will already be changing the outgoing flowlabel). Signed-off-by: David Morley <morleyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Tested-by: David Morley <morleyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-10tcp: record last received ipv6 flowlabelDavid Morley
In order to better estimate whether a data packet has been retransmitted or is the result of a TLP, we save the last received ipv6 flowlabel. To make space for this field we resize the "ato" field in inet_connection_sock as the current value of TCP_DELACK_MAX can be fully contained in 8 bits and add a compile_time_assert ensuring this field is the required size. v2: addressed kernel bot feedback about dccp_delack_timer() v3: addressed build error introduced by commit bbf80d713fe7 ("tcp: derive delack_max from rto_min") Signed-off-by: David Morley <morleyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Tested-by: David Morley <morleyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-09net: refine debug info in skb_checksum_help()Eric Dumazet
syzbot uses panic_on_warn. This means that the skb_dump() I added in the blamed commit are not even called. Rewrite this so that we get the needed skb dump before syzbot crashes. Fixes: eeee4b77dc52 ("net: add more debug info in skb_checksum_help()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231006173355.2254983-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-09xdp: Fix zero-size allocation warning in xskq_create()Andrew Kanner
Syzkaller reported the following issue: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2807 at mm/vmalloc.c:3247 __vmalloc_node_range (mm/vmalloc.c:3361) Modules linked in: CPU: 0 PID: 2807 Comm: repro Not tainted 6.6.0-rc2+ #12 Hardware name: Generic DT based system unwind_backtrace from show_stack (arch/arm/kernel/traps.c:258) show_stack from dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) dump_stack_lvl from __warn (kernel/panic.c:633 kernel/panic.c:680) __warn from warn_slowpath_fmt (./include/linux/context_tracking.h:153 kernel/panic.c:700) warn_slowpath_fmt from __vmalloc_node_range (mm/vmalloc.c:3361 (discriminator 3)) __vmalloc_node_range from vmalloc_user (mm/vmalloc.c:3478) vmalloc_user from xskq_create (net/xdp/xsk_queue.c:40) xskq_create from xsk_setsockopt (net/xdp/xsk.c:953 net/xdp/xsk.c:1286) xsk_setsockopt from __sys_setsockopt (net/socket.c:2308) __sys_setsockopt from ret_fast_syscall (arch/arm/kernel/entry-common.S:68) xskq_get_ring_size() uses struct_size() macro to safely calculate the size of struct xsk_queue and q->nentries of desc members. But the syzkaller repro was able to set q->nentries with the value initially taken from copy_from_sockptr() high enough to return SIZE_MAX by struct_size(). The next PAGE_ALIGN(size) is such case will overflow the size_t value and set it to 0. This will trigger WARN_ON_ONCE in vmalloc_user() -> __vmalloc_node_range(). The issue is reproducible on 32-bit arm kernel. Fixes: 9f78bf330a66 ("xsk: support use vaddr as ring") Reported-by: syzbot+fae676d3cf469331fc89@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000c84b4705fb31741e@google.com/T/ Reported-by: syzbot+b132693e925cbbd89e26@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000e20df20606ebab4f@google.com/T/ Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+fae676d3cf469331fc89@syzkaller.appspotmail.com Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://syzkaller.appspot.com/bug?extid=fae676d3cf469331fc89 Link: https://lore.kernel.org/bpf/20231007075148.1759-1-andrew.kanner@gmail.com
2023-10-06net: sock_dequeue_err_skb() optimizationEric Dumazet
Exit early if the list is empty. Some applications using TCP zerocopy are calling recvmsg( ... MSG_ERRQUEUE) and hit this case quite often, probably because busy polling only deals with sk_receive_queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231005114504.642589-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-06Merge tag 'wireless-next-2023-10-06' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.7 The first pull request for v6.7, with both stack and driver changes. We have a big change how locking is handled in cfg80211 and mac80211 which removes several locks and hopefully simplifies the locking overall. In drivers rtw89 got MCC support and smaller features to other active drivers but nothing out of ordinary. Major changes: cfg80211 - remove wdev mutex, use the wiphy mutex instead - annotate iftype_data pointer with sparse - first kunit tests, for element defrag - remove unused scan_width support mac80211 - major locking rework, remove several locks like sta_mtx, key_mtx etc. and use the wiphy mutex instead - remove unused shifted rate support - support antenna control in frame injection (requires driver support) - convert RX_DROP_UNUSABLE to more detailed reason codes rtw89 - TDMA-based multi-channel concurrency (MCC) support iwlwifi - support set_antenna() operation - support frame injection antenna control ath12k - WCN7850: enable 320 MHz channels in 6 GHz band - WCN7850: hardware rfkill support - WCN7850: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to make scan faster ath11k - add chip id board name while searching board-2.bin * tag 'wireless-next-2023-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (272 commits) wifi: rtlwifi: remove unreachable code in rtl92d_dm_check_edca_turbo() wifi: rtw89: debug: txpwr table supports Wi-Fi 7 chips wifi: rtw89: debug: show txpwr table according to chip gen wifi: rtw89: phy: set TX power RU limit according to chip gen wifi: rtw89: phy: set TX power limit according to chip gen wifi: rtw89: phy: set TX power offset according to chip gen wifi: rtw89: phy: set TX power by rate according to chip gen wifi: rtw89: mac: get TX power control register according to chip gen wifi: rtlwifi: use unsigned long for rtl_bssid_entry timestamp wifi: rtlwifi: fix EDCA limit set by BT coexistence wifi: rt2x00: fix MT7620 low RSSI issue wifi: rtw89: refine bandwidth 160MHz uplink OFDMA performance wifi: rtw89: refine uplink trigger based control mechanism wifi: rtw89: 8851b: update TX power tables to R34 wifi: rtw89: 8852b: update TX power tables to R35 wifi: rtw89: 8852c: update TX power tables to R67 wifi: rtw89: regd: configure Thailand in regulation type wifi: mac80211: add back SPDX identifier wifi: mac80211: fix ieee80211_drop_unencrypted_mgmt return type/value wifi: rtlwifi: cleanup few rtlxxxx_set_hw_reg() routines ... ==================== Link: https://lore.kernel.org/r/87jzrz6bvw.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-06devlink: Hold devlink lock on health reporter dump getMoshe Shemesh
Devlink health dump get callback should take devlink lock as any other devlink callback. Otherwise, since devlink_mutex was removed, this callback is not protected from a race of the reporter being destroyed while handling the callback. Add devlink lock to the callback and to any call for devlink_health_do_dump(). This should be safe as non of the drivers dump callback implementation takes devlink lock. As devlink lock is added to any callback of dump, the reporter dump_lock is now redundant and can be removed. Fixes: d3efc2a6a6d8 ("net: devlink: remove devlink_mutex") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/1696510216-189379-1-git-send-email-moshe@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-06Merge tag 'linux-can-next-for-6.7-20231005' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2023-10-05 The first patch is by Miquel Raynal and fixes a comment in the sja1000 driver. Vincent Mailhol contributes 2 patches that fix W=1 compiler warnings in the etas_es58x driver. Jiapeng Chong's patch removes an unneeded NULL pointer check before dev_put() in the CAN raw protocol. A patch by Justin Stittreplaces a strncpy() by strscpy() in the peak_pci sja1000 driver. The next 5 patches are by me and fix the can_restart() handler and replace BUG_ON()s in the CAN dev helpers with proper error handling. The last 27 patches are also by me and target the at91_can driver. First a new helper function is introduced, the at91_can driver is cleaned up and updated to use the rx-offload helper. * tag 'linux-can-next-for-6.7-20231005' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (37 commits) can: at91_can: switch to rx-offload implementation can: at91_can: at91_alloc_can_err_skb() introduce new function can: at91_can: at91_irq_err_line(): send error counters with state change can: at91_can: at91_irq_err_line(): make use of can_change_state() and can_bus_off() can: at91_can: at91_irq_err_line(): take reg_sr into account for bus off can: at91_can: at91_irq_err_line(): make use of can_state_get_by_berr_counter() can: at91_can: at91_irq_err(): rename to at91_irq_err_line() can: at91_can: at91_irq_err_frame(): move next to at91_irq_err() can: at91_can: at91_irq_err_frame(): call directly from IRQ handler can: at91_can: at91_poll_err(): increase stats even if no quota left or OOM can: at91_can: at91_poll_err(): fold in at91_poll_err_frame() can: at91_can: add CAN transceiver support can: at91_can: at91_open(): forward request_irq()'s return value in case or an error can: at91_can: at91_chip_start(): don't disable IRQs twice can: at91_can: at91_set_bittiming(): demote register output to debug level can: at91_can: rename struct at91_priv::{tx_next,tx_echo} to {tx_head,tx_tail} can: at91_can: at91_setup_mailboxes(): update comments can: at91_can: add more register definitions can: at91_can: MCR Register: convert to FIELD_PREP() can: at91_can: MSR Register: convert to FIELD_PREP() ... ==================== Link: https://lore.kernel.org/r/20231005195812.549776-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-06Merge wireless into wireless-nextJohannes Berg
Resolve several conflicts, mostly between changes/fixes in wireless and the locking rework in wireless-next. One of the conflicts actually shows a bug in wireless that we'll want to fix separately. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org>