summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2020-06-25netfilter: ip6tables: Split ip6t_unregister_table() into pre_exit and exit ↵David Wilder
helpers. The pre_exit will un-register the underlying hook and .exit will do the table freeing. The netns core does an unconditional synchronize_rcu after the pre_exit hooks insuring no packets are in flight that have picked up the pointer before completing the un-register. Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-25netfilter: iptables: Add a .pre_exit hook in all iptable_foo.c.David Wilder
Using new helpers ipt_unregister_table_pre_exit() and ipt_unregister_table_exit(). Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-25netfilter: iptables: Split ipt_unregister_table() into pre_exit and exit ↵David Wilder
helpers. The pre_exit will un-register the underlying hook and .exit will do the table freeing. The netns core does an unconditional synchronize_rcu after the pre_exit hooks insuring no packets are in flight that have picked up the pointer before completing the un-register. Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-25netfilter: Add MODULE_DESCRIPTION entries to kernel modulesRob Gill
The user tool modinfo is used to get information on kernel modules, including a description where it is available. This patch adds a brief MODULE_DESCRIPTION to netfilter kernel modules (descriptions taken from Kconfig file or code comments) Signed-off-by: Rob Gill <rrobgill@protonmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-25netfilter: ipset: fix unaligned atomic accessRussell King
When using ip_set with counters and comment, traffic causes the kernel to panic on 32-bit ARM: Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>] Unhandled fault: alignment exception (0x221) at 0xea08133c PC is at ip_set_match_extensions+0xe0/0x224 [ip_set] The problem occurs when we try to update the 64-bit counters - the faulting address above is not 64-bit aligned. The problem occurs due to the way elements are allocated, for example: set->dsize = ip_set_elem_len(set, tb, 0, 0); map = ip_set_alloc(sizeof(*map) + elements * set->dsize); If the element has a requirement for a member to be 64-bit aligned, and set->dsize is not a multiple of 8, but is a multiple of four, then every odd numbered elements will be misaligned - and hitting an atomic64_add() on that element will cause the kernel to panic. ip_set_elem_len() must return a size that is rounded to the maximum alignment of any extension field stored in the element. This change ensures that is the case. Fixes: 95ad1f4a9358 ("netfilter: ipset: Fix extension alignment") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-24dsa: Allow forwarding of redirected IGMP trafficDaniel Mack
The driver for Marvell switches puts all ports in IGMP snooping mode which results in all IGMP/MLD frames that ingress on the ports to be forwarded to the CPU only. The bridge code in the kernel can then interpret these frames and act upon them, for instance by updating the mdb in the switch to reflect multicast memberships of stations connected to the ports. However, the IGMP/MLD frames must then also be forwarded to other ports of the bridge so external IGMP queriers can track membership reports, and external multicast clients can receive query reports from foreign IGMP queriers. Currently, this is impossible as the EDSA tagger sets offload_fwd_mark on the skb when it unwraps the tagged frames, and that will make the switchdev layer prevent the skb from egressing on any other port of the same switch. To fix that, look at the To_CPU code in the DSA header and make forwarding of the frame possible for trapped IGMP packets. Introduce some #defines for the frame types to make the code a bit more comprehensive. This was tested on a Marvell 88E6352 variant. Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: add a flag to avoid refreshing fdb when changing/addingNikolay Aleksandrov
When we modify or create a new fdb entry sometimes we want to avoid refreshing its activity in order to track it properly. One example is when a mac is received from EVPN multi-homing peer by FRR, which doesn't want to change local activity accounting. It makes it static and sets a flag to track its activity. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: add option to allow activity notifications for any fdb entriesNikolay Aleksandrov
This patch adds the ability to notify about activity of any entries (static, permanent or ext_learn). EVPN multihoming peers need it to properly and efficiently handle mac sync (peer active/locally active). We add a new NFEA_ACTIVITY_NOTIFY attribute which is used to dump the current activity state and to control if static entries should be monitored at all. We use 2 bits - one to activate fdb entry tracking (disabled by default) and the second to denote that an entry is inactive. We need the second bit in order to avoid multiple notifications of inactivity. Obviously this makes no difference for dynamic entries since at the time of inactivity they get deleted, while the tracked non-dynamic entries get the inactive bit set and get a notification. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: neighbor: add fdb extended attributeNikolay Aleksandrov
Add an attribute to NDA which will contain all future fdb-specific attributes in order to avoid polluting the NDA namespace with e.g. bridge or vxlan specific attributes. The attribute is called NDA_FDB_EXT_ATTRS and the structure would look like: [NDA_FDB_EXT_ATTRS] = { [NFEA_xxx] } Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24net: bridge: fdb_add_entry takes ndm as argumentNikolay Aleksandrov
We can just pass ndm as an argument instead of its fields separately. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24openvswitch: take into account de-fragmentation/gso_size in ↵Lorenzo Bianconi
execute_check_pkt_len ovs connection tracking module performs de-fragmentation on incoming fragmented traffic. Take info account if traffic has been de-fragmented in execute_check_pkt_len action otherwise we will perform the wrong nested action considering the original packet size. This issue typically occurs if ovs-vswitchd adds a rule in the pipeline that requires connection tracking (e.g. OVN stateful ACLs) before execute_check_pkt_len action. Moreover take into account GSO fragment size for GSO packet in execute_check_pkt_len routine Fixes: 4d5ec89fc8d14 ("net: openvswitch: Add a new action check_pkt_len") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-24Bluetooth: Don't restart scanning if pausedAbhishek Pandit-Subedi
When restarting LE scanning, check if it's currently paused before enabling passive scanning. Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-06-24bpf: Add SO_KEEPALIVE and related options to bpf_setsockoptDmitry Yakunin
This patch adds support of SO_KEEPALIVE flag and TCP related options to bpf_setsockopt() routine. This is helpful if we want to enable or tune TCP keepalive for applications which don't do it in the userspace code. v3: - update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>) v4: - update kernel-doc in tools too (Alexei Starovoitov) - add test to selftests (Alexei Starovoitov) Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
2020-06-24tcp: Expose tcp_sock_set_keepidle_lockedDmitry Yakunin
This is preparation for usage in bpf_setsockopt. v2: - remove redundant EXPORT_SYMBOL (Alexei Starovoitov) Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200620153052.9439-2-zeil@yandex-team.ru
2020-06-24sock: Move sock_valbool_flag to headerDmitry Yakunin
This is preparation for usage in bpf_setsockopt. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200620153052.9439-1-zeil@yandex-team.ru
2020-06-24xfrm: policy: match with both mark and mask on user interfacesXin Long
In commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"), it would take 'priority' to make a policy unique, and allow duplicated policies with different 'priority' to be added, which is not expected by userland, as Tobias reported in strongswan. To fix this duplicated policies issue, and also fix the issue in commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"), when doing add/del/get/update on user interfaces, this patch is to change to look up a policy with both mark and mask by doing: mark.v == pol->mark.v && mark.m == pol->mark.m and leave the check: (mark & pol->mark.m) == pol->mark.v for tx/rx path only. As the userland expects an exact mark and mask match to manage policies. v1->v2: - make xfrm_policy_mark_match inline and fix the changelog as Tobias suggested. Fixes: 295fae568885 ("xfrm: Allow user space manipulation of SPD mark") Fixes: ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list") Reported-by: Tobias Brunner <tobias@strongswan.org> Tested-by: Tobias Brunner <tobias@strongswan.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-06-24xfrm: introduce oseq-may-wrap flagPetr Vaněk
RFC 4303 in section 3.3.3 suggests to disable anti-replay for manually distributed ICVs in which case the sender does not need to monitor or reset the counter. However, the sender still increments the counter and when it reaches the maximum value, the counter rolls over back to zero. This patch introduces new extra_flag XFRM_SA_XFLAG_OSEQ_MAY_WRAP which allows sequence number to cycle in outbound packets if set. This flag is used only in legacy and bmp code, because esn should not be negotiated if anti-replay is disabled (see note in 3.3.3 section). Signed-off-by: Petr Vaněk <pv@excello.cz> Acked-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-06-23net: Do not clear the sock TX queue in sk_set_socket()Tariq Toukan
Clearing the sock TX queue in sk_set_socket() might cause unexpected out-of-order transmit when called from sock_orphan(), as outstanding packets can pick a different TX queue and bypass the ones already queued. This is undesired in general. More specifically, it breaks the in-order scheduling property guarantee for device-offloaded TLS sockets. Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it explicitly only where needed. Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23dn_route_rcv: remove redundant dev null checkGaurav Singh
dev cannot be NULL here since its already being accessed before. Remove the redundant null check. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23dcb_doit: remove redundant skb checkGaurav Singh
skb cannot be NULL here since its already being accessed before: sock_net(skb->sk). Remove the redundant null check. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: ipv6: Use struct_size() helper and kcalloc()Gustavo A. R. Silva
Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. Also, remove unnecessary function ipv6_rpl_srh_alloc_size() and replace kzalloc() with kcalloc(), which has a 2-factor argument form for multiplication. This code was detected with the help of Coccinelle and, audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: ethtool: Handle missing cable test TDR parametersAndrew Lunn
A last minute change put the TDR cable test parameters into a nest. The validation is not sufficient, resulting in an oops if the nest is missing. Set default values first, then update them if the nest is provided. Fixes: f2bc8ad31a7f ("net: ethtool: Allow PHY cable test TDR data to configured") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23udp: move gro declarations to net/udp.hEric Dumazet
This removes following warnings : CC net/ipv4/udp_offload.o net/ipv4/udp_offload.c:504:17: warning: no previous prototype for 'udp4_gro_receive' [-Wmissing-prototypes] 504 | struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb) | ^~~~~~~~~~~~~~~~ net/ipv4/udp_offload.c:584:29: warning: no previous prototype for 'udp4_gro_complete' [-Wmissing-prototypes] 584 | INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) | ^~~~~~~~~~~~~~~~~ CHECK net/ipv6/udp_offload.c net/ipv6/udp_offload.c:115:16: warning: symbol 'udp6_gro_receive' was not declared. Should it be static? net/ipv6/udp_offload.c:148:29: warning: symbol 'udp6_gro_complete' was not declared. Should it be static? CC net/ipv6/udp_offload.o net/ipv6/udp_offload.c:115:17: warning: no previous prototype for 'udp6_gro_receive' [-Wmissing-prototypes] 115 | struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) | ^~~~~~~~~~~~~~~~ net/ipv6/udp_offload.c:148:29: warning: no previous prototype for 'udp6_gro_complete' [-Wmissing-prototypes] 148 | INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) | ^~~~~~~~~~~~~~~~~ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: move tcp gro declarations to net/tcp.hEric Dumazet
This patch removes following (C=1 W=1) warnings for CONFIG_RETPOLINE=y : net/ipv4/tcp_offload.c:306:16: warning: symbol 'tcp4_gro_receive' was not declared. Should it be static? net/ipv4/tcp_offload.c:306:17: warning: no previous prototype for 'tcp4_gro_receive' [-Wmissing-prototypes] net/ipv4/tcp_offload.c:319:29: warning: symbol 'tcp4_gro_complete' was not declared. Should it be static? net/ipv4/tcp_offload.c:319:29: warning: no previous prototype for 'tcp4_gro_complete' [-Wmissing-prototypes] CHECK net/ipv6/tcpv6_offload.c net/ipv6/tcpv6_offload.c:16:16: warning: symbol 'tcp6_gro_receive' was not declared. Should it be static? net/ipv6/tcpv6_offload.c:29:29: warning: symbol 'tcp6_gro_complete' was not declared. Should it be static? CC net/ipv6/tcpv6_offload.o net/ipv6/tcpv6_offload.c:16:17: warning: no previous prototype for 'tcp6_gro_receive' [-Wmissing-prototypes] 16 | struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb) | ^~~~~~~~~~~~~~~~ net/ipv6/tcpv6_offload.c:29:29: warning: no previous prototype for 'tcp6_gro_complete' [-Wmissing-prototypes] 29 | INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff) | ^~~~~~~~~~~~~~~~~ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23tcp: move ipv4_specific to tcp include fileEric Dumazet
Declare ipv4_specific once, in tcp.h were it belongs. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23tcp: move ipv6_specific declaration to remove a warningEric Dumazet
ipv6_specific should be declared in tcp include files, not mptcp. This removes the following warning : CHECK net/ipv6/tcp_ipv6.c net/ipv6/tcp_ipv6.c:78:42: warning: symbol 'ipv6_specific' was not declared. Should it be static? Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23audit: log nftables configuration change eventsRichard Guy Briggs
iptables, ip6tables, arptables and ebtables table registration, replacement and unregistration configuration events are logged for the native (legacy) iptables setsockopt api, but not for the nftables netlink api which is used by the nft-variant of iptables in addition to nftables itself. Add calls to log the configuration actions in the nftables netlink api. This uses the same NETFILTER_CFG record format but overloads the table field. type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.878:162) : table=?:0;?:0 family=unspecified entries=2 op=nft_register_gen pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld ... type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.878:162) : table=firewalld:1;?:0 family=inet entries=0 op=nft_register_table pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld ... type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.911:163) : table=firewalld:1;filter_FORWARD:85 family=inet entries=8 op=nft_register_chain pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld ... type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.911:163) : table=firewalld:1;filter_FORWARD:85 family=inet entries=101 op=nft_register_rule pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld ... type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.911:163) : table=firewalld:1;__set0:87 family=inet entries=87 op=nft_register_setelem pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld ... type=NETFILTER_CFG msg=audit(2020-05-28 17:46:41.911:163) : table=firewalld:1;__set0:87 family=inet entries=0 op=nft_register_set pid=396 subj=system_u:system_r:firewalld_t:s0 comm=firewalld For further information please see issue https://github.com/linux-audit/audit-kernel/issues/124 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-23bonding/xfrm: use real_dev instead of slave_devJarod Wilson
Rather than requiring every hw crypto capable NIC driver to do a check for slave_dev being set, set real_dev in the xfrm layer and xso init time, and then override it in the bonding driver as needed. Then NIC drivers can always use real_dev, and at the same time, we eliminate the use of a variable name that probably shouldn't have been used in the first place, particularly given recent current events. CC: Boris Pismenny <borisp@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> CC: Leon Romanovsky <leon@kernel.org> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23ipv6: fib6: avoid indirect calls from fib6_rule_lookupBrian Vazquez
It was reported that a considerable amount of cycles were spent on the expensive indirect calls on fib6_rule_lookup. This patch introduces an inline helper called pol_route_func that uses the indirect_call_wrappers to avoid the indirect calls. This patch saves around 50ns per call. Performance was measured on the receiver by checking the amount of syncookies that server was able to generate under a synflood load. Traffic was generated using trafgen[1] which was pushing around 1Mpps on a single queue. Receiver was using only one rx queue which help to create a bottle neck and make the experiment rx-bounded. These are the syncookies generated over 10s from the different runs: Whithout the patch: TcpExtSyncookiesSent 3553749 0.0 TcpExtSyncookiesSent 3550895 0.0 TcpExtSyncookiesSent 3553845 0.0 TcpExtSyncookiesSent 3541050 0.0 TcpExtSyncookiesSent 3539921 0.0 TcpExtSyncookiesSent 3557659 0.0 TcpExtSyncookiesSent 3526812 0.0 TcpExtSyncookiesSent 3536121 0.0 TcpExtSyncookiesSent 3529963 0.0 TcpExtSyncookiesSent 3536319 0.0 With the patch: TcpExtSyncookiesSent 3611786 0.0 TcpExtSyncookiesSent 3596682 0.0 TcpExtSyncookiesSent 3606878 0.0 TcpExtSyncookiesSent 3599564 0.0 TcpExtSyncookiesSent 3601304 0.0 TcpExtSyncookiesSent 3609249 0.0 TcpExtSyncookiesSent 3617437 0.0 TcpExtSyncookiesSent 3608765 0.0 TcpExtSyncookiesSent 3620205 0.0 TcpExtSyncookiesSent 3601895 0.0 Without the patch the average is 354263 pkt/s or 2822 ns/pkt and with the patch the average is 360738 pkt/s or 2772 ns/pkt which gives an estimate of 50 ns per packet. [1] http://netsniff-ng.org/ Changelog since v1: - Change ordering in the ICW (Paolo Abeni) Cc: Luigi Rizzo <lrizzo@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23net: ethtool: add missing string for NETIF_F_GSO_TUNNEL_REMCSUMAlexander Lobakin
Commit e585f2363637 ("udp: Changes to udp_offload to support remote checksum offload") added new GSO type and a corresponding netdev feature, but missed Ethtool's 'netdev_features_strings' table. Give it a name so it will be exposed to userspace and become available for manual configuration. v3: - decouple from "netdev_features_strings[] cleanup" series; - no functional changes. v2: - don't split the "Fixes:" tag across lines; - no functional changes. Fixes: e585f2363637 ("udp: Changes to udp_offload to support remote checksum offload") Signed-off-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23bridge: mrp: Validate when setting the port roleHoratiu Vultur
This patch adds specific checks for primary(0x0) and secondary(0x1) when setting the port role. For any other value the function 'br_mrp_set_port_role' will return -EINVAL. Fixes: 20f6a05ef63594 ("bridge: mrp: Rework the MRP netlink interface") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23Bluetooth: add a mutex lock to avoid UAF in do_enale_setLihong Kou
In the case we set or free the global value listen_chan in different threads, we can encounter the UAF problems because the method is not protected by any lock, add one to avoid this bug. BUG: KASAN: use-after-free in l2cap_chan_close+0x48/0x990 net/bluetooth/l2cap_core.c:730 Read of size 8 at addr ffff888096950000 by task kworker/1:102/2868 CPU: 1 PID: 2868 Comm: kworker/1:102 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events do_enable_set Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1fb/0x318 lib/dump_stack.c:118 print_address_description+0x74/0x5c0 mm/kasan/report.c:374 __kasan_report+0x149/0x1c0 mm/kasan/report.c:506 kasan_report+0x26/0x50 mm/kasan/common.c:641 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135 l2cap_chan_close+0x48/0x990 net/bluetooth/l2cap_core.c:730 do_enable_set+0x660/0x900 net/bluetooth/6lowpan.c:1074 process_one_work+0x7f5/0x10f0 kernel/workqueue.c:2264 worker_thread+0xbbc/0x1630 kernel/workqueue.c:2410 kthread+0x332/0x350 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Allocated by task 2870: save_stack mm/kasan/common.c:72 [inline] set_track mm/kasan/common.c:80 [inline] __kasan_kmalloc+0x118/0x1c0 mm/kasan/common.c:515 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529 kmem_cache_alloc_trace+0x221/0x2f0 mm/slab.c:3551 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] l2cap_chan_create+0x50/0x320 net/bluetooth/l2cap_core.c:446 chan_create net/bluetooth/6lowpan.c:640 [inline] bt_6lowpan_listen net/bluetooth/6lowpan.c:959 [inline] do_enable_set+0x6a4/0x900 net/bluetooth/6lowpan.c:1078 process_one_work+0x7f5/0x10f0 kernel/workqueue.c:2264 worker_thread+0xbbc/0x1630 kernel/workqueue.c:2410 kthread+0x332/0x350 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Freed by task 2870: save_stack mm/kasan/common.c:72 [inline] set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:337 [inline] __kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476 kasan_slab_free+0xe/0x10 mm/kasan/common.c:485 __cache_free mm/slab.c:3426 [inline] kfree+0x10d/0x220 mm/slab.c:3757 l2cap_chan_destroy net/bluetooth/l2cap_core.c:484 [inline] kref_put include/linux/kref.h:65 [inline] l2cap_chan_put+0x170/0x190 net/bluetooth/l2cap_core.c:498 do_enable_set+0x66c/0x900 net/bluetooth/6lowpan.c:1075 process_one_work+0x7f5/0x10f0 kernel/workqueue.c:2264 worker_thread+0xbbc/0x1630 kernel/workqueue.c:2410 kthread+0x332/0x350 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 The buggy address belongs to the object at ffff888096950000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 0 bytes inside of 2048-byte region [ffff888096950000, ffff888096950800) The buggy address belongs to the page: page:ffffea00025a5400 refcount:1 mapcount:0 mapping:ffff8880aa400e00 index:0x0 flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea00027d1548 ffffea0002397808 ffff8880aa400e00 raw: 0000000000000000 ffff888096950000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88809694ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88809694ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888096950000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888096950080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888096950100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Reported-by: syzbot+96414aa0033c363d8458@syzkaller.appspotmail.com Signed-off-by: Lihong Kou <koulihong@huawei.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-06-22mptcp: drop sndr_key in mptcp_syn_optionsGeliang Tang
In RFC 8684, we don't need to send sndr_key in SYN package anymore, so drop it. Fixes: cc7972ea1932 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/core/devlink.c: remove new uninitialized_var() usageStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22tcindex_change: Remove redundant null checkGaurav Singh
arg cannot be NULL since its already being dereferenced before. Remove the redundant NULL check. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22ethtool: Fix check in ethtool_rx_flow_rule_createGaurav Singh
Fix check in ethtool_rx_flow_rule_create Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator") Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22hsr: avoid to create proc file after unregisterTaehee Yoo
When an interface is being deleted, "/proc/net/dev_snmp6/<interface name>" is deleted. The function for this is addrconf_ifdown() in the addrconf_notify() and it is called by notification, which is NETDEV_UNREGISTER. But, if NETDEV_CHANGEMTU is triggered after NETDEV_UNREGISTER, this proc file will be created again. This recreated proc file will be deleted by netdev_wati_allrefs(). Before netdev_wait_allrefs() is called, creating a new HSR interface routine can be executed and It tries to create a proc file but it will find an un-deleted proc file. At this point, it warns about it. To avoid this situation, it can use ->dellink() instead of ->ndo_uninit() to release resources because ->dellink() is called before NETDEV_UNREGISTER. So, a proc file will not be recreated. Test commands ip link add dummy0 type dummy ip link add dummy1 type dummy ip link set dummy0 mtu 1300 #SHELL1 while : do ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1 done #SHELL2 while : do ip link del hsr0 done Splat looks like: [ 9888.980852][ T2752] proc_dir_entry 'dev_snmp6/hsr0' already registered [ 9888.981797][ C2] WARNING: CPU: 2 PID: 2752 at fs/proc/generic.c:372 proc_register+0x2d5/0x430 [ 9888.981798][ C2] Modules linked in: hsr dummy veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6x [ 9888.981814][ C2] CPU: 2 PID: 2752 Comm: ip Tainted: G W 5.8.0-rc1+ #616 [ 9888.981815][ C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 9888.981816][ C2] RIP: 0010:proc_register+0x2d5/0x430 [ 9888.981818][ C2] Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 65 01 00 00 49 8b b5 e0 00 00 00 48 89 ea 40 [ 9888.981819][ C2] RSP: 0018:ffff8880628dedf0 EFLAGS: 00010286 [ 9888.981821][ C2] RAX: dffffc0000000008 RBX: ffff888028c69170 RCX: ffffffffaae09a62 [ 9888.981822][ C2] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f75ac [ 9888.981823][ C2] RBP: ffff888028c693f4 R08: ffffed100d9401bd R09: ffffed100d9401bd [ 9888.981824][ C2] R10: ffffffffaddf406f R11: 0000000000000001 R12: ffff888028c69308 [ 9888.981825][ C2] R13: ffff8880663584c8 R14: dffffc0000000000 R15: ffffed100518d27e [ 9888.981827][ C2] FS: 00007f3876b3b0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 9888.981828][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9888.981829][ C2] CR2: 00007f387601a8c0 CR3: 000000004101a002 CR4: 00000000000606e0 [ 9888.981830][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9888.981831][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9888.981832][ C2] Call Trace: [ 9888.981833][ C2] ? snmp6_seq_show+0x180/0x180 [ 9888.981834][ C2] proc_create_single_data+0x7c/0xa0 [ 9888.981835][ C2] snmp6_register_dev+0xb0/0x130 [ 9888.981836][ C2] ipv6_add_dev+0x4b7/0xf60 [ 9888.981837][ C2] addrconf_notify+0x684/0x1ca0 [ 9888.981838][ C2] ? __mutex_unlock_slowpath+0xd0/0x670 [ 9888.981839][ C2] ? kasan_unpoison_shadow+0x30/0x40 [ 9888.981840][ C2] ? wait_for_completion+0x250/0x250 [ 9888.981841][ C2] ? inet6_ifinfo_notify+0x100/0x100 [ 9888.981842][ C2] ? dropmon_net_event+0x227/0x410 [ 9888.981843][ C2] ? notifier_call_chain+0x90/0x160 [ 9888.981844][ C2] ? inet6_ifinfo_notify+0x100/0x100 [ 9888.981845][ C2] notifier_call_chain+0x90/0x160 [ 9888.981846][ C2] register_netdevice+0xbe5/0x1070 [ ... ] Reported-by: syzbot+1d51c8b74efa4c44adeb@syzkaller.appspotmail.com Fixes: e0a4b99773d3 ("hsr: use upper/lower device infrastructure") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net: core: try to runtime-resume detached device in __dev_openHeiner Kallweit
A netdevice may be marked as detached because the parent is runtime-suspended and not accessible whilst interface or link is down. An example are PCI network devices that go into PCI D3hot, see e.g. __igc_shutdown() or rtl8169_net_suspend(). If netdevice is down and marked as detached we can only open it if we runtime-resume it before __dev_open() calls netif_device_present(). Therefore, if netdevice is detached, try to runtime-resume the parent and only return with an error if it's still detached. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22devlink: Add support for board.serial_number to info_get cb.Vasundhara Volam
Board serial number is a serial number, often available in PCI *Vital Product Data*. Also, update devlink-info.rst documentation file. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22xfrm: bail early on slave pass over skbJarod Wilson
This is prep work for initial support of bonding hardware encryption pass-through support. The bonding driver will fill in the slave_dev pointer, and we use that to know not to skb_push() again on a given skb that was already processed on the bond device. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Support setting hardware address of port functionParav Pandit
PCI PF and VF devlink port can manage the function represented by a devlink port. Allow users to set port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:00:00:00:00:00 $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55 $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:55 Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Support querying hardware address of port functionParav Pandit
PCI PF and VF devlink port can manage the function represented by a devlink port. Enable users to query port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:66 $ devlink port show pci/0000:06:00.0/2 -jp { "port": { "pci/0000:06:00.0/2": { "type": "eth", "netdev": "enp6s0pf0vf1", "flavour": "pcivf", "pfnum": 0, "vfnum": 1, "function": { "hw_addr": "00:11:22:33:44:66" } } } } Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22net/devlink: Prepare devlink port functions to fill extackParav Pandit
Prepare devlink port related functions to optionally fill up the extack information which will be used in subsequent patch by port function attribute(s). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22bpf: Set map_btf_{name, id} for all map typesAndrey Ignatov
Set map_btf_name and map_btf_id for all map types so that map fields can be accessed by bpf programs. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/a825f808f22af52b018dbe82f1c7d29dab5fc978.1592600985.git.rdna@fb.com
2020-06-22bpf: Rename bpf_htab to bpf_shtab in sock_mapAndrey Ignatov
There are two different `struct bpf_htab` in bpf code in the following files: - kernel/bpf/hashtab.c - net/core/sock_map.c It makes it impossible to find proper btf_id by name = "bpf_htab" and kind = BTF_KIND_STRUCT what is needed to support access to map ptr so that bpf program can access `struct bpf_htab` fields. To make it possible one of the struct-s should be renamed, sock_map.c looks like a better candidate for rename since it's specialized version of hashtab. Rename it to bpf_shtab ("sh" stands for Sock Hash). Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/c006a639e03c64ca50fc87c4bb627e0bfba90f4e.1592600985.git.rdna@fb.com
2020-06-22Bluetooth: Disconnect if E0 is used for Level 4Luiz Augusto von Dentz
E0 is not allowed with Level 4: BLUETOOTH CORE SPECIFICATION Version 5.2 | Vol 3, Part C page 1319: '128-bit equivalent strength for link and encryption keys required using FIPS approved algorithms (E0 not allowed, SAFER+ not allowed, and P-192 not allowed; encryption key not shortened' SC enabled: > HCI Event: Read Remote Extended Features (0x23) plen 13 Status: Success (0x00) Handle: 256 Page: 1/2 Features: 0x0b 0x00 0x00 0x00 0x00 0x00 0x00 0x00 Secure Simple Pairing (Host Support) LE Supported (Host) Secure Connections (Host Support) > HCI Event: Encryption Change (0x08) plen 4 Status: Success (0x00) Handle: 256 Encryption: Enabled with AES-CCM (0x02) SC disabled: > HCI Event: Read Remote Extended Features (0x23) plen 13 Status: Success (0x00) Handle: 256 Page: 1/2 Features: 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 Secure Simple Pairing (Host Support) LE Supported (Host) > HCI Event: Encryption Change (0x08) plen 4 Status: Success (0x00) Handle: 256 Encryption: Enabled with E0 (0x01) [May 8 20:23] Bluetooth: hci0: Invalid security: expect AES but E0 was used < HCI Command: Disconnect (0x01|0x0006) plen 3 Handle: 256 Reason: Authentication Failure (0x05) Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-06-22Bluetooth: use configured params for ext advAlain Michaud
When the extended advertisement feature is enabled, a hardcoded min and max interval of 0x8000 is used. This patch fixes this issue by using the configured min/max value. This was validated by setting min/max in main.conf and making sure the right setting is applied: < HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036) plen 25 #93 [hci0] 10.953011 … Min advertising interval: 181.250 msec (0x0122) Max advertising interval: 181.250 msec (0x0122) … Signed-off-by: Alain Michaud <alainm@chromium.org> Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Reviewed-by: Daniel Winkler <danielwinkler@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-06-22xprtrdma: Fix handling of RDMA_ERROR repliesChuck Lever
The RPC client currently doesn't handle ERR_CHUNK replies correctly. rpcrdma_complete_rqst() incorrectly passes a negative number to xprt_complete_rqst() as the number of bytes copied. Instead, set task->tk_status to the error value, and return zero bytes copied. In these cases, return -EIO rather than -EREMOTEIO. The RPC client's finite state machine doesn't know what to do with -EREMOTEIO. Additional clean ups: - Don't double-count RDMA_ERROR replies - Remove a stale comment Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@kernel.vger.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22xprtrdma: Clean up disconnectChuck Lever
1. Ensure that only rpcrdma_cm_event_handler() modifies ep->re_connect_status to avoid racy changes to that field. 2. Ensure that xprt_force_disconnect() is invoked only once as a transport is closed or destroyed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-22xprtrdma: Clean up synopsis of rpcrdma_flush_disconnect()Chuck Lever
Refactor: Pass struct rpcrdma_xprt instead of an IB layer object. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>