summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2022-09-15mptcp: account memory allocation in mptcp_nl_cmd_add_addr() to userThomas Haller
Now that non-root users can configure MPTCP endpoints, account the memory allocation to the user. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15mptcp: allow privileged operations from user namespacesThomas Haller
GENL_ADMIN_PERM checks that the user has CAP_NET_ADMIN in the initial namespace by calling netlink_capable(). Instead, use GENL_UNS_ADMIN_PERM which uses netlink_ns_capable(). This checks that the caller has CAP_NET_ADMIN in the current user namespace. See also commit 4a92602aa1cd ("openvswitch: allow management from inside user namespaces") which introduced this mechanism. See also commit 5617c6cd6f84 ("nl80211: Allow privileged operations from user namespaces") which introduced this for nl80211. Signed-off-by: Thomas Haller <thaller@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15mptcp: add do_check_data_fin to replace copiedGeliang Tang
This patch adds a new bool variable 'do_check_data_fin' to replace the original int variable 'copied' in __mptcp_push_pending(), check it to determine whether to call __mptcp_check_send_data_fin(). Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15mptcp: add mptcp_for_each_subflow_safe helperMatthieu Baerts
Similar to mptcp_for_each_subflow(): this is clearer now that the _safe version is used in multiple places. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-15can: raw: add CAN XL supportOliver Hartkopp
Enable CAN_RAW sockets to read and write CAN XL frames analogue to the CAN FD extension (new CAN_RAW_XL_FRAMES sockopt). A CAN XL network interface is capable to handle Classical CAN, CAN FD and CAN XL frames. When CAN_RAW_XL_FRAMES is enabled, the CAN_RAW socket checks whether the addressed CAN network interface is capable to handle the provided CAN frame. In opposite to the fixed number of bytes for - CAN frames (CAN_MTU = sizeof(struct can_frame)) - CAN FD frames (CANFD_MTU = sizeof(struct can_frame)) the number of bytes when reading/writing CAN XL frames depends on the number of data bytes. For efficiency reasons the length of the struct canxl_frame is truncated to the needed size for read/write operations. This leads to a calculated size of CANXL_HDR_SIZE + canxl_frame::len which is enforced on write() operations and guaranteed on read() operations. NB: Valid length values are 1 .. 2048 (CANXL_MIN_DLEN .. CANXL_MAX_DLEN). Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20220912170725.120748-8-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-15can: canxl: update CAN infrastructure for CAN XL framesOliver Hartkopp
- add new ETH_P_CANXL ethernet protocol type - update skb checks for CAN XL - add alloc_canxl_skb() which now needs a data length parameter - introduce init_can_skb_reserve() to reduce code duplication Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20220912170725.120748-6-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-15can: set CANFD_FDF flag in all CAN FD frame structuresOliver Hartkopp
To simplify the testing in user space all struct canfd_frame's provided by the CAN subsystem of the Linux kernel now have the CANFD_FDF flag set in canfd_frame::flags. NB: Handcrafted ETH_P_CANFD frames introduced via PF_PACKET socket might not set this bit correctly. During the check for sufficient headroom in PF_PACKET sk_buffs the uninitialized CAN sk_buff data structures are filled. In the case of a CAN FD frame the CANFD_FDF flag is set accordingly. As the CAN frame content is already zero initialized in alloc_canfd_skb() the obsolete initialization of cf->flags in the CTU CAN FD driver has been removed as it would overwrite the already set CANFD_FDF flag. Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20220912170725.120748-4-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-15can: skb: unify skb CAN frame identification helpersOliver Hartkopp
Replace open coded checks for sk_buffs containing Classical CAN and CAN FD frame structures as a preparation for CAN XL support. With the added length check the unintended processing of CAN XL frames having the CANXL_XLF bit set can be suppressed even when the skb->len fits to non CAN XL frames. The CAN_RAW socket needs a rework to use these helpers. Therefore the use of these helpers is postponed to the CAN_RAW CAN XL integration. The J1939 protocol gets a check for Classical CAN frames too. Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20220912170725.120748-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-15batman-adv: remove unused struct definitionsMarek Lindner
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2022-09-14Bluetooth: hci_sync: allow advertise when scan without RPAZhengping Jiang
Address resolution will be paused during active scan to allow any advertising reports reach the host. If LL privacy is enabled, advertising will rely on the controller to generate new RPA. If host is not using RPA, there is no need to stop advertising during active scan because there is no need to generate RPA in the controller. Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-09-14Bluetooth: avoid hci_dev_test_and_set_flag() in mgmt_init_hdev()Tetsuo Handa
syzbot is again reporting attempt to cancel uninitialized work at mgmt_index_removed() [1], for setting of HCI_MGMT flag from mgmt_init_hdev() from hci_mgmt_cmd() from hci_sock_sendmsg() can race with testing of HCI_MGMT flag from mgmt_index_removed() from hci_sock_bind() due to lack of serialization via hci_dev_lock(). Since mgmt_init_hdev() is called with mgmt_chan_list_lock held, we can safely split hci_dev_test_and_set_flag() into hci_dev_test_flag() and hci_dev_set_flag(). Thus, in order to close this race, set HCI_MGMT flag after INIT_DELAYED_WORK() completed. This is a local fix based on mgmt_chan_list_lock. Lack of serialization via hci_dev_lock() might be causing different race conditions somewhere else. But a global fix based on hci_dev_lock() should deserve a future patch. Link: https://syzkaller.appspot.com/bug?extid=844c7bf1b1aa4119c5de Reported-by: syzbot+844c7bf1b1aa4119c5de@syzkaller.appspotmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 3f2893d3c142986a ("Bluetooth: don't try to cancel uninitialized works at mgmt_index_removed()") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-09-13mptcp: fix fwd memory accounting on coalescePaolo Abeni
The intel bot reported a memory accounting related splat: [ 240.473094] ------------[ cut here ]------------ [ 240.478507] page_counter underflow: -4294828518 nr_pages=4294967290 [ 240.485500] WARNING: CPU: 2 PID: 14986 at mm/page_counter.c:56 page_counter_cancel+0x96/0xc0 [ 240.570849] CPU: 2 PID: 14986 Comm: mptcp_connect Tainted: G S 5.19.0-rc4-00739-gd24141fe7b48 #1 [ 240.581637] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017 [ 240.590600] RIP: 0010:page_counter_cancel+0x96/0xc0 [ 240.596179] Code: 00 00 00 45 31 c0 48 89 ef 5d 4c 89 c6 41 5c e9 40 fd ff ff 4c 89 e2 48 c7 c7 20 73 39 84 c6 05 d5 b1 52 04 01 e8 e7 95 f3 01 <0f> 0b eb a9 48 89 ef e8 1e 25 fc ff eb c3 66 66 2e 0f 1f 84 00 00 [ 240.615639] RSP: 0018:ffffc9000496f7c8 EFLAGS: 00010082 [ 240.621569] RAX: 0000000000000000 RBX: ffff88819c9c0120 RCX: 0000000000000000 [ 240.629404] RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff5200092deeb [ 240.637239] RBP: ffff88819c9c0120 R08: 0000000000000001 R09: ffff888366527a2b [ 240.645069] R10: ffffed106cca4f45 R11: 0000000000000001 R12: 00000000fffffffa [ 240.652903] R13: ffff888366536118 R14: 00000000fffffffa R15: ffff88819c9c0000 [ 240.660738] FS: 00007f3786e72540(0000) GS:ffff888366500000(0000) knlGS:0000000000000000 [ 240.669529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 240.675974] CR2: 00007f966b346000 CR3: 0000000168cea002 CR4: 00000000003706e0 [ 240.683807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 240.691641] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 240.699468] Call Trace: [ 240.702613] <TASK> [ 240.705413] page_counter_uncharge+0x29/0x80 [ 240.710389] drain_stock+0xd0/0x180 [ 240.714585] refill_stock+0x278/0x580 [ 240.718951] __sk_mem_reduce_allocated+0x222/0x5c0 [ 240.729248] __mptcp_update_rmem+0x235/0x2c0 [ 240.734228] __mptcp_move_skbs+0x194/0x6c0 [ 240.749764] mptcp_recvmsg+0xdfa/0x1340 [ 240.763153] inet_recvmsg+0x37f/0x500 [ 240.782109] sock_read_iter+0x24a/0x380 [ 240.805353] new_sync_read+0x420/0x540 [ 240.838552] vfs_read+0x37f/0x4c0 [ 240.842582] ksys_read+0x170/0x200 [ 240.864039] do_syscall_64+0x5c/0x80 [ 240.872770] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 240.878526] RIP: 0033:0x7f3786d9ae8e [ 240.882805] Code: c0 e9 b6 fe ff ff 50 48 8d 3d 6e 18 0a 00 e8 89 e8 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28 [ 240.902259] RSP: 002b:00007fff7be81e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 240.910533] RAX: ffffffffffffffda RBX: 0000000000002000 RCX: 00007f3786d9ae8e [ 240.918368] RDX: 0000000000002000 RSI: 00007fff7be87ec0 RDI: 0000000000000005 [ 240.926206] RBP: 0000000000000005 R08: 00007f3786e6a230 R09: 00007f3786e6a240 [ 240.934046] R10: fffffffffffff288 R11: 0000000000000246 R12: 0000000000002000 [ 240.941884] R13: 00007fff7be87ec0 R14: 00007fff7be87ec0 R15: 0000000000002000 [ 240.949741] </TASK> [ 240.952632] irq event stamp: 27367 [ 240.956735] hardirqs last enabled at (27366): [<ffffffff81ba50ea>] mem_cgroup_uncharge_skmem+0x6a/0x80 [ 240.966848] hardirqs last disabled at (27367): [<ffffffff81b8fd42>] refill_stock+0x282/0x580 [ 240.976017] softirqs last enabled at (27360): [<ffffffff83a4d8ef>] mptcp_recvmsg+0xaf/0x1340 [ 240.985273] softirqs last disabled at (27364): [<ffffffff83a4d30c>] __mptcp_move_skbs+0x18c/0x6c0 [ 240.994872] ---[ end trace 0000000000000000 ]--- After commit d24141fe7b48 ("mptcp: drop SK_RECLAIM_* macros"), if rmem_fwd_alloc become negative, mptcp_rmem_uncharge() can try to reclaim a negative amount of pages, since the expression: reclaimable >= PAGE_SIZE will evaluate to true for any negative value of the int 'reclaimable': 'PAGE_SIZE' is an unsigned long and the negative integer will be promoted to a (very large) unsigned long value. Still after the mentioned commit, kfree_skb_partial() in mptcp_try_coalesce() will reclaim most of just released fwd memory, so that following charging of the skb delta size will lead to negative fwd memory values. At that point a racing recvmsg() can trigger the splat. Address the issue switching the order of the memory accounting operations. The fwd memory can still transiently reach negative values, but that will happen in an atomic scope and no code path could touch/use such value. Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: d24141fe7b48 ("mptcp: drop SK_RECLAIM_* macros") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20220906180404.1255873-1-matthieu.baerts@tessares.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-12Merge tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - Fix SUNRPC call completion races with call_decode() that trigger a WARN_ON() - NFSv4.0 cannot support open-by-filehandle and NFS re-export - Revert "SUNRPC: Remove unreachable error condition" to allow handling of error conditions - Update suid/sgid mode bits after ALLOCATE and DEALLOCATE * tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: Remove unreachable error condition" NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 SUNRPC: Fix call completion races with call_decode()
2022-09-10bpf: Add support for writing to nf_conn:markDaniel Xu
Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/ebca06dea366e3e7e861c12f375a548cc4c61108.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-10bpf: Use 0 instead of NOT_INIT for btf_struct_access() writesDaniel Xu
Returning a bpf_reg_type only makes sense in the context of a BPF_READ. For writes, prefer to explicitly return 0 for clarity. Note that is non-functional change as it just so happened that NOT_INIT == 0. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/01772bc1455ae16600796ac78c6cc9fff34f95ff.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-09bpf: Invoke cgroup/connect{4,6} programs for unprivileged ICMP pingYiFei Zhu
Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This adds the hook invocation to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW. Logic is mirrored from UDP sockets where the hook is invoked during pre_connect, after a check for suficiently sized addr_len. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/5764914c252fad4cd134fb6664c6ede95f409412.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-09net: core: fix flow symmetric hashLudovic Cintrat
__flow_hash_consistentify() wrongly swaps ipv4 addresses in few cases. This function is indirectly used by __skb_get_hash_symmetric(), which is used to fanout packets in AF_PACKET. Intrusion detection systems may be impacted by this issue. __flow_hash_consistentify() computes the addresses difference then swaps them if the difference is negative. In few cases src - dst and dst - src are both negative. The following snippet mimics __flow_hash_consistentify(): ``` #include <stdio.h> #include <stdint.h> int main(int argc, char** argv) { int diffs_d, diffd_s; uint32_t dst = 0xb225a8c0; /* 178.37.168.192 --> 192.168.37.178 */ uint32_t src = 0x3225a8c0; /* 50.37.168.192 --> 192.168.37.50 */ uint32_t dst2 = 0x3325a8c0; /* 51.37.168.192 --> 192.168.37.51 */ diffs_d = src - dst; diffd_s = dst - src; printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n", src, dst, diffs_d, diffs_d, diffd_s, diffd_s); diffs_d = src - dst2; diffd_s = dst2 - src; printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n", src, dst2, diffs_d, diffs_d, diffd_s, diffd_s); return 0; } ``` Results: src:3225a8c0 dst:b225a8c0, \ diff(s-d)=-2147483648(0x80000000) \ diff(d-s)=-2147483648(0x80000000) src:3225a8c0 dst:3325a8c0, \ diff(s-d)=-16777216(0xff000000) \ diff(d-s)=16777216(0x1000000) In the first case the addresses differences are always < 0, therefore __flow_hash_consistentify() always swaps, thus dst->src and src->dst packets have differents hashes. Fixes: c3f8324188fa8 ("net: Add full IPv6 addresses to flow_keys") Signed-off-by: Ludovic Cintrat <ludovic.cintrat@gatewatcher.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: openvswitch: fix repeated words in commentsJilin Yuan
Delete the redundant word 'is'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller
Florian Westhal says: ==================== netfilter: bugfixes for net The following set contains four netfilter patches for your *net* tree. When there are multiple Contact headers in a SIP message its possible the next headers won't be found because the SIP helper confuses relative and absolute offsets in the message. From Igor Ryzhov. Make the nft_concat_range self-test support socat, this makes the selftest pass on my test VM, from myself. nf_conntrack_irc helper can be tricked into opening a local port forward that the client never requested by embedding a DCC message in a PING request sent to the client. Fix from David Leadbeater. Both have been broken since the kernel 2.6.x days. The 'osf' match might indicate success while it could not find anything, broken since 5.2 . Fix from Pablo Neira. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_vlan: get rid of tcf_vlan_walker and tcf_vlan_searchZhengchao Shao
tcf_vlan_walker() and tcf_vlan_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_tunnel_key: get rid of tunnel_key_walker and tunnel_key_searchZhengchao Shao
tunnel_key_walker() and tunnel_key_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_skbmod: get rid of tcf_skbmod_walker and tcf_skbmod_searchZhengchao Shao
tcf_skbmod_walker() and tcf_skbmod_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_skbedit: get rid of tcf_skbedit_walker and tcf_skbedit_searchZhengchao Shao
tcf_skbedit_walker() and tcf_skbedit_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_simple: get rid of tcf_simp_walker and tcf_simp_searchZhengchao Shao
tcf_simp_walker() and tcf_simp_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_sample: get rid of tcf_sample_walker and tcf_sample_searchZhengchao Shao
tcf_sample_walker() and tcf_sample_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_police: get rid of tcf_police_walker and tcf_police_searchZhengchao Shao
tcf_police_walker() and tcf_police_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_pedit: get rid of tcf_pedit_walker and tcf_pedit_searchZhengchao Shao
tcf_pedit_walker() and tcf_pedit_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_nat: get rid of tcf_nat_walker and tcf_nat_searchZhengchao Shao
tcf_nat_walker() and tcf_nat_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_mpls: get rid of tcf_mpls_walker and tcf_mpls_searchZhengchao Shao
tcf_mpls_walker() and tcf_mpls_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_mirred: get rid of tcf_mirred_walker and tcf_mirred_searchZhengchao Shao
tcf_mirred_walker() and tcf_mirred_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_ipt: get rid of tcf_ipt_walker/tcf_xt_walker and ↵Zhengchao Shao
tcf_ipt_search/tcf_xt_search tcf_ipt_walker()/tcf_xt_walker() and tcf_ipt_search()/tcf_xt_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_ife: get rid of tcf_ife_walker and tcf_ife_searchZhengchao Shao
tcf_ife_walker() and tcf_ife_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_gate: get rid of tcf_gate_walker and tcf_gate_searchZhengchao Shao
tcf_gate_walker() and tcf_gate_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_gact: get rid of tcf_gact_walker and tcf_gact_searchZhengchao Shao
tcf_gact_walker() and tcf_gact_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_ctinfo: get rid of tcf_ctinfo_walker and tcf_ctinfo_searchZhengchao Shao
tcf_ctinfo_walker() and tcf_ctinfo_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_ct: get rid of tcf_ct_walker and tcf_ct_searchZhengchao Shao
tcf_ct_walker() and tcf_ct_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_csum: get rid of tcf_csum_walker and tcf_csum_searchZhengchao Shao
tcf_csum_walker() and tcf_csum_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_connmark: get rid of tcf_connmark_walker and tcf_connmark_searchZhengchao Shao
tcf_connmark_walker() and tcf_connmark_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_bpf: get rid of tcf_bpf_walker and tcf_bpf_searchZhengchao Shao
tcf_bpf_walker() and tcf_bpf_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act_api: implement generic walker and search for tc actionZhengchao Shao
Being able to get tc_action_net by using net_id stored in tc_action_ops and execute the generic walk/search function, add __tcf_generic_walker() and __tcf_idr_search() helpers. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09net: sched: act: move global static variable net_id to tc_action_opsZhengchao Shao
Each tc action module has a corresponding net_id, so put net_id directly into the structure tc_action_ops. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller
Florian Westphal says: ==================== The following set contains changes for your *net-next* tree: - make conntrack ignore packets that are delayed (containing data already acked). The current behaviour to flag them as INVALID causes more harm than good, let them pass so peer can send an immediate ACK for the most recent sequence number. - make conntrack recognize when both peers have sent 'invalid' FINs: This helps cleaning out stale connections faster for those cases where conntrack is no longer in sync with the actual connection state. - Now that DECNET is gone, we don't need to reserve space for DECNET related information. - compact common 'find a free port number for the new inbound connection' code and move it to a helper, then cap number of tries the new helper will make until it gives up. - replace various instances of strlcpy with strscpy, from Wolfram Sang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
drivers/net/ethernet/freescale/fec.h 7d650df99d52 ("net: fec: add pm_qos support on imx6q platform") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-08Revert "SUNRPC: Remove unreachable error condition"Dan Aloni
This reverts commit efe57fd58e1cb77f9186152ee12a8aa4ae3348e0. The assumption that it is impossible to return an ERR pointer from rpc_run_task() no longer holds due to commit 25cf32ad5dba ("SUNRPC: Handle allocation failure in rpc_new_task()"). Fixes: 25cf32ad5dba ('SUNRPC: Handle allocation failure in rpc_new_task()') Fixes: efe57fd58e1c ('SUNRPC: Remove unreachable error condition') Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-09-08sch_sfb: Also store skb len before calling child enqueueToke Høiland-Jørgensen
Cong Wang noticed that the previous fix for sch_sfb accessing the queued skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue function was also calling qdisc_qstats_backlog_inc() after enqueue, which reads the pkt len from the skb cb field. Fix this by also storing the skb len, and using the stored value to increment the backlog after enqueueing. Fixes: 9efd23297cca ("sch_sfb: Don't assume the skb is still around after enqueueing to child") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-07selftests/bpf: Add tests for kfunc returning a memory pointerBenjamin Tissoires
We add 2 new kfuncs that are following the RET_PTR_TO_MEM capability from the previous commit. Then we test them in selftests: the first tests are testing valid case, and are not failing, and the later ones are actually preventing the program to be loaded because they are wrong. To work around that, we mark the failing ones as not autoloaded (with SEC("?tc")), and we manually enable them one by one, ensuring the verifier rejects them. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-8-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07selftests/bpf: add test for accessing ctx from syscall program typeBenjamin Tissoires
We need to also export the kfunc set to the syscall program type, and then add a couple of eBPF programs that are testing those calls. The first one checks for valid access, and the second one is OK from a static analysis point of view but fails at run time because we are trying to access outside of the allocated memory. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-5-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-07net/smc: Fix possible access to freed memory in link clearYacan Liu
After modifying the QP to the Error state, all RX WR would be completed with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not wait for it is done, but destroy the QP and free the link group directly. So there is a risk that accessing the freed memory in tasklet context. Here is a crash example: BUG: unable to handle page fault for address: ffffffff8f220860 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S OE 5.10.0-0607+ #23 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> _raw_spin_lock_irqsave+0x30/0x40 mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib] smc_wr_rx_tasklet_fn+0x56/0xa0 [smc] tasklet_action_common.isra.21+0x66/0x100 __do_softirq+0xd5/0x29c asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x37/0x40 irq_exit_rcu+0x9d/0xa0 sysvec_call_function_single+0x34/0x80 asm_sysvec_call_function_single+0x12/0x20 Fixes: bd4ad57718cc ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR") Signed-off-by: Yacan Liu <liuyacan@corp.netease.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07netfilter: nat: avoid long-running port range loopFlorian Westphal
Looping a large port range takes too long. Instead select a random offset within [ntohs(exp->saved_proto.tcp.port), 65535] and try 128 ports. This is a rehash of an erlier patch to do the same, but generalized to handle other helpers as well. Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20210920204439.13179-2-Cole.Dishington@alliedtelesis.co.nz/ Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07netfilter: nat: move repetitive nat port reserve loop to a helperFlorian Westphal
Almost all nat helpers reserve an expecation port the same way: Try the port inidcated by the peer, then move to next port if that port is already in use. We can squash this into a helper. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>