summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2019-08-24net/core/skmsg: Delete an unnecessary check before the function call ↵Markus Elfring
“consume_skb” The consume_skb() function performs also input parameter validation. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md modeHangbin Liu
In decode_session{4,6} there is a possibility that the skb dst dev is NULL, e,g, with tunnel collect_md mode, which will cause kernel crash. Here is what the code path looks like, for GRE: - ip6gre_tunnel_xmit - ip6gre_xmit_ipv6 - __gre6_xmit - ip6_tnl_xmit - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE - icmpv6_send - icmpv6_route_lookup - xfrm_decode_session_reverse - decode_session4 - oif = skb_dst(skb)->dev->ifindex; <-- here - decode_session6 - oif = skb_dst(skb)->dev->ifindex; <-- here The reason is __metadata_dst_init() init dst->dev to NULL by default. We could not fix it in __metadata_dst_init() as there is no dev supplied. On the other hand, the skb_dst(skb)->dev is actually not needed as we called decode_session{4,6} via xfrm_decode_session_reverse(), so oif is not used by: fl4->flowi4_oif = reverse ? skb->skb_iif : oif; So make a dst dev check here should be clean and safe. v4: No changes. v3: No changes. v2: fix the issue in decode_session{4,6} instead of updating shared dst dev in {ip_md, ip6}_tunnel_xmit. Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24ipv4/icmp: fix rt dst dev null pointer dereferenceHangbin Liu
In __icmp_send() there is a possibility that the rt->dst.dev is NULL, e,g, with tunnel collect_md mode, which will cause kernel crash. Here is what the code path looks like, for GRE: - ip6gre_tunnel_xmit - ip6gre_xmit_ipv4 - __gre6_xmit - ip6_tnl_xmit - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE - icmp_send - net = dev_net(rt->dst.dev); <-- here The reason is __metadata_dst_init() init dst->dev to NULL by default. We could not fix it in __metadata_dst_init() as there is no dev supplied. On the other hand, the reason we need rt->dst.dev is to get the net. So we can just try get it from skb->dev when rt->dst.dev is NULL. v4: Julian Anastasov remind skb->dev also could be NULL. We'd better still use dst.dev and do a check to avoid crash. v3: No changes. v2: fix the issue in __icmp_send() instead of updating shared dst dev in {ip_md, ip6}_tunnel_xmit. Fixes: c8b34e680a09 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24openvswitch: Fix log message in ovs conntrackYi-Hung Wei
Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24Merge branch 'ieee802154-for-davem-2019-08-24' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2019-08-24 An update from ieee802154 for your *net* tree. Yue Haibing fixed two bugs discovered by KASAN in the hwsim driver for ieee802154 and Colin Ian King cleaned up a redundant variable assignment. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2019-08-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix verifier precision tracking with BPF-to-BPF calls, from Alexei. 2) Fix a use-after-free in prog symbol exposure, from Daniel. 3) Several s390x JIT fixes plus BE related fixes in BPF kselftests, from Ilya. 4) Fix memory leak by unpinning XDP umem pages in error path, from Ivan. 5) Fix a potential use-after-free on flow dissector detach, from Jakub. 6) Fix bpftool to close prog fd after showing metadata, from Quentin. 7) BPF kselftest config and TEST_PROGS_EXTENDED fixes, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0Ilya Leoshkevich
test_select_reuseport fails on s390 due to verifier rejecting test_select_reuseport_kern.o with the following message: ; data_check.eth_protocol = reuse_md->eth_protocol; 18: (69) r1 = *(u16 *)(r6 +22) invalid bpf_context access off=22 size=2 This is because on big-endian machines casts from __u32 to __u16 are generated by referencing the respective variable as __u16 with an offset of 2 (as opposed to 0 on little-endian machines). The verifier already has all the infrastructure in place to allow such accesses, it's just that they are not explicitly enabled for eth_protocol field. Enable them for eth_protocol field by using bpf_ctx_range instead of offsetof. Ditto for ip_protocol, bind_inany and len, since they already allow narrowing, and the same problem can arise when working with them. Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-24flow_dissector: Fix potential use-after-free on BPF_PROG_DETACHJakub Sitnicki
Call to bpf_prog_put(), with help of call_rcu(), queues an RCU-callback to free the program once a grace period has elapsed. The callback can run together with new RCU readers that started after the last grace period. New RCU readers can potentially see the "old" to-be-freed or already-freed pointer to the program object before the RCU update-side NULLs it. Reorder the operations so that the RCU update-side resets the protected pointer before the end of the grace period after which the program will be freed. Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-23drop_monitor: Make timestamps y2038 safeIdo Schimmel
Timestamps are currently communicated to user space as 'struct timespec', which is not considered y2038 safe since it uses a 32-bit signed value for seconds. Fix this while the API is still not part of any official kernel release by using 64-bit nanoseconds timestamps instead. Fixes: ca30707dee2b ("drop_monitor: Add packet alert mode") Fixes: 5e58109b1ea4 ("drop_monitor: Add support for packet alert mode for hardware drops") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23net/rds: Whitelist rdma_cookie and rx_tstamp for usercopyDag Moxnes
Add the RDMA cookie and RX timestamp to the usercopy whitelist. After the introduction of hardened usercopy whitelisting (https://lwn.net/Articles/727322/), a warning is displayed when the RDMA cookie or RX timestamp is copied to userspace: kernel: WARNING: CPU: 3 PID: 5750 at mm/usercopy.c:81 usercopy_warn+0x8e/0xa6 [...] kernel: Call Trace: kernel: __check_heap_object+0xb8/0x11b kernel: __check_object_size+0xe3/0x1bc kernel: put_cmsg+0x95/0x115 kernel: rds_recvmsg+0x43d/0x620 [rds] kernel: sock_recvmsg+0x43/0x4a kernel: ___sys_recvmsg+0xda/0x1e6 kernel: ? __handle_mm_fault+0xcae/0xf79 kernel: __sys_recvmsg+0x51/0x8a kernel: SyS_recvmsg+0x12/0x1c kernel: do_syscall_64+0x79/0x1ae When the whitelisting feature was introduced, the memory for the RDMA cookie and RX timestamp in RDS was not added to the whitelist, causing the warning above. Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com> Tested-by: Jenny <jenny.x.xu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idevSabrina Dubroca
Currently, ipv6_find_idev returns NULL when ipv6_add_dev fails, ignoring the specific error value. This results in addrconf_add_dev returning ENOBUFS in all cases, which is unfortunate in cases such as: # ip link add dummyX type dummy # ip link set dummyX mtu 1200 up # ip addr add 2000::/64 dev dummyX RTNETLINK answers: No buffer space available Commit a317a2f19da7 ("ipv6: fail early when creating netdev named all or default") introduced error returns in ipv6_add_dev. Before that, that function would simply return NULL for all failures. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23net: ipv6: fix listify ip6_rcv_finish in case of forwardingXin Long
We need a similar fix for ipv6 as Commit 0761680d5215 ("net: ipv4: fix listify ip_rcv_finish in case of forwarding") does for ipv4. This issue can be reprocuded by syzbot since Commit 323ebb61e32b ("net: use listified RX for handling GRO_NORMAL skbs") on net-next. The call trace was: kernel BUG at include/linux/skbuff.h:2225! RIP: 0010:__skb_pull include/linux/skbuff.h:2225 [inline] RIP: 0010:skb_pull+0xea/0x110 net/core/skbuff.c:1902 Call Trace: sctp_inq_pop+0x2f1/0xd80 net/sctp/inqueue.c:202 sctp_endpoint_bh_rcv+0x184/0x8d0 net/sctp/endpointola.c:385 sctp_inq_push+0x1e4/0x280 net/sctp/inqueue.c:80 sctp_rcv+0x2807/0x3590 net/sctp/input.c:256 sctp6_rcv+0x17/0x30 net/sctp/ipv6.c:1049 ip6_protocol_deliver_rcu+0x2fe/0x1660 net/ipv6/ip6_input.c:397 ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:438 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip6_input+0xe4/0x3f0 net/ipv6/ip6_input.c:447 dst_input include/net/dst.h:442 [inline] ip6_sublist_rcv_finish+0x98/0x1e0 net/ipv6/ip6_input.c:84 ip6_list_rcv_finish net/ipv6/ip6_input.c:118 [inline] ip6_sublist_rcv+0x80c/0xcf0 net/ipv6/ip6_input.c:282 ipv6_list_rcv+0x373/0x4b0 net/ipv6/ip6_input.c:316 __netif_receive_skb_list_ptype net/core/dev.c:5049 [inline] __netif_receive_skb_list_core+0x5fc/0x9d0 net/core/dev.c:5097 __netif_receive_skb_list net/core/dev.c:5149 [inline] netif_receive_skb_list_internal+0x7eb/0xe60 net/core/dev.c:5244 gro_normal_list.part.0+0x1e/0xb0 net/core/dev.c:5757 gro_normal_list net/core/dev.c:5755 [inline] gro_normal_one net/core/dev.c:5769 [inline] napi_frags_finish net/core/dev.c:5782 [inline] napi_gro_frags+0xa6a/0xea0 net/core/dev.c:5855 tun_get_user+0x2e98/0x3fa0 drivers/net/tun.c:1974 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2020 Fixes: d8269e2cbf90 ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()") Fixes: 323ebb61e32b ("net: use listified RX for handling GRO_NORMAL skbs") Reported-by: syzbot+eb349eeee854e389c36d@syzkaller.appspotmail.com Reported-by: syzbot+4a0643a653ac375612d1@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-23batman-adv: Only read OGM2 tvlv_len after buffer len checkSven Eckelmann
Multiple batadv_ogm2_packet can be stored in an skbuff. The functions batadv_v_ogm_send_to_if() uses batadv_v_ogm_aggr_packet() to check if there is another additional batadv_ogm2_packet in the skb or not before they continue processing the packet. The length for such an OGM2 is BATADV_OGM2_HLEN + batadv_ogm2_packet->tvlv_len. The check must first check that at least BATADV_OGM2_HLEN bytes are available before it accesses tvlv_len (which is part of the header. Otherwise it might try read outside of the currently available skbuff to get the content of tvlv_len. Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-08-23batman-adv: Only read OGM tvlv_len after buffer len checkSven Eckelmann
Multiple batadv_ogm_packet can be stored in an skbuff. The functions batadv_iv_ogm_send_to_if()/batadv_iv_ogm_receive() use batadv_iv_ogm_aggr_packet() to check if there is another additional batadv_ogm_packet in the skb or not before they continue processing the packet. The length for such an OGM is BATADV_OGM_HLEN + batadv_ogm_packet->tvlv_len. The check must first check that at least BATADV_OGM_HLEN bytes are available before it accesses tvlv_len (which is part of the header. Otherwise it might try read outside of the currently available skbuff to get the content of tvlv_len. Fixes: ef26157747d4 ("batman-adv: tvlv - basic infrastructure") Reported-by: syzbot+355cab184197dbbfa384@syzkaller.appspotmail.com Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-08-23Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Three important fixes tagged for stable (an indefinite hang, a crash on an assert and a NULL pointer dereference) plus a small series from Luis fixing instances of vfree() under spinlock" * tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client: libceph: fix PG split vs OSD (re)connect race ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply ceph: clear page dirty before invalidate page ceph: fix buffer free while holding i_ceph_lock in fill_inode() ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob() ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
2019-08-22net/ncsi: update response packet length for GCPS/GNS/GNPTS commandsBen Wei
Update response packet length for the following commands per NC-SI spec - Get Controller Packet Statistics - Get NC-SI Statistics - Get NC-SI Pass-through Statistics command Signed-off-by: Ben Wei <benwei@fb.com> Reviewed-by: Justin Lee <justin.lee1@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-22net/ncsi: Fix the payload copying for the request coming from NetlinkJustin.Lee1@Dell.com
The request coming from Netlink should use the OEM generic handler. The standard command handler expects payload in bytes/words/dwords but the actual payload is stored in data if the request is coming from Netlink. Signed-off-by: Justin Lee <justin.lee1@dell.com> Reviewed-by: Vijay Khemka <vijaykhemka@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-22nl80211: add NL80211_CMD_UPDATE_FT_IES to supported commandsMatthew Wang
Add NL80211_CMD_UPDATE_FT_IES to supported commands. In mac80211 drivers, this can be implemented via existing NL80211_CMD_AUTHENTICATE and NL80211_ATTR_IE, but non-mac80211 drivers have a separate command for this. A driver supports FT if it either is mac80211 or supports this command. Signed-off-by: Matthew Wang <matthewmwang@chromium.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Link: https://lore.kernel.org/r/20190822174806.2954-1-matthewmwang@chromium.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-22mac80211: minstrel_ht: fix infinite loop because supported is not being shiftedColin Ian King
Currently the for-loop will spin forever if variable supported is non-zero because supported is never changed. Fix this by adding in the missing right shift of supported. Addresses-Coverity: ("Infinite loop") Fixes: 48cb39522a9d ("mac80211: minstrel_ht: improve rate probing for devices with static fallback") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20190822122034.28664-1-colin.king@canonical.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-22nexthops: remove redundant assignment to variable errColin Ian King
Variable err is initialized to a value that is never read and it is re-assigned later. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused Value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-22Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU and LKMM changes from Paul E. McKenney: - A few more RCU flavor consolidation cleanups. - Miscellaneous fixes. - Updates to RCU's list-traversal macros improving lockdep usability. - Torture-test updates. - Forward-progress improvements for no-CBs CPUs: Avoid ignoring incoming callbacks during grace-period waits. - Forward-progress improvements for no-CBs CPUs: Use ->cblist structure to take advantage of others' grace periods. - Also added a small commit that avoids needlessly inflicting scheduler-clock ticks on callback-offloaded CPUs. - Forward-progress improvements for no-CBs CPUs: Reduce contention on ->nocb_lock guarding ->cblist. - Forward-progress improvements for no-CBs CPUs: Add ->nocb_bypass list to further reduce contention on ->nocb_lock guarding ->cblist. - LKMM updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-22libceph: fix PG split vs OSD (re)connect raceIlya Dryomov
We can't rely on ->peer_features in calc_target() because it may be called both when the OSD session is established and open and when it's not. ->peer_features is not valid unless the OSD session is open. If this happens on a PG split (pg_num increase), that could mean we don't resend a request that should have been resent, hanging the client indefinitely. In userspace this was fixed by looking at require_osd_release and get_xinfo[osd].features fields of the osdmap. However these fields belong to the OSD section of the osdmap, which the kernel doesn't decode (only the client section is decoded). Instead, let's drop this feature check. It effectively checks for luminous, so only pre-luminous OSDs would be affected in that on a PG split the kernel might resend a request that should not have been resent. Duplicates can occur in other scenarios, so both sides should already be prepared for them: see dup/replay logic on the OSD side and retry_attempt check on the client side. Cc: stable@vger.kernel.org Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT") Link: https://tracker.ceph.com/issues/41162 Reported-by: Jerry Lee <leisurelysw24@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Jerry Lee <leisurelysw24@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2019-08-21net: fix icmp_socket_deliver argument 2 inputLi RongQing
it expects a unsigned int, but got a __be32 Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21ipv6/addrconf: allow adding multicast addr if IFA_F_MCAUTOJOIN is setHangbin Liu
In commit 93a714d6b53d ("multicast: Extend ip address command to enable multicast group join/leave on") we added a new flag IFA_F_MCAUTOJOIN to make user able to add multicast address on ethernet interface. This works for IPv4, but not for IPv6. See the inet6_addr_add code. static int inet6_addr_add() { ... if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) { ipv6_mc_config(net->ipv6.mc_autojoin_sk, true...) } ifp = ipv6_add_addr(idev, cfg, true, extack); <- always fail with maddr if (!IS_ERR(ifp)) { ... } else if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) { ipv6_mc_config(net->ipv6.mc_autojoin_sk, false...) } } But in ipv6_add_addr() it will check the address type and reject multicast address directly. So this feature is never worked for IPv6. We should not remove the multicast address check totally in ipv6_add_addr(), but could accept multicast address only when IFA_F_MCAUTOJOIN flag supplied. v2: update commit description Fixes: 93a714d6b53d ("multicast: Extend ip address command to enable multicast group join/leave on") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21Merge tag 'batadv-net-for-davem-20190821' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - fix uninit-value in batadv_netlink_get_ifindex(), by Eric Dumazet ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21xprtrdma: Optimize rpcrdma_post_recvs()Chuck Lever
Micro-optimization: In rpcrdma_post_recvs, since commit e340c2d6ef2a ("xprtrdma: Reduce the doorbell rate (Receive)"), the common case is to return without doing anything. Found with perf. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Inline XDR chunk encoder functionsChuck Lever
Micro-optimization: Save the cost of three function calls during transport header encoding. These were "noinline" before to generate more meaningful call stacks during debugging, but this code is now pretty stable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Fix bc_max_slots return valueChuck Lever
For the moment the returned value just happens to be correct because the current backchannel server implementation does not vary the number of credits it offers. The spec does permit this value to change during the lifetime of a connection, however. The actual maximum is fixed for all RPC/RDMA transports, because each transport instance has to pre-allocate the resources for processing BC requests. That's the value that should be returned. Fixes: 7402a4fedc2b ("SUNRPC: Fix up backchannel slot table ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21Merge branch 'odp_fixes' into rdma.git for-nextJason Gunthorpe
Jason Gunthorpe says: ==================== This is a collection of general cleanups for ODP to clarify some of the flows around umem creation and use of the interval tree. ==================== The branch is based on v5.3-rc5 due to dependencies * odp_fixes: RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address RDMA/core: Make invalidate_range a device operation RDMA/odp: Use kvcalloc for the dma_list and page_list RDMA/odp: Check for overflow when computing the umem_odp end RDMA/odp: Provide ib_umem_odp_release() to undo the allocs RDMA/odp: Split creating a umem_odp from ib_umem_get RDMA/odp: Make the three ways to create a umem_odp clear RMDA/odp: Consolidate umem_odp initialization RDMA/odp: Make it clearer when a umem is an implicit ODP umem RDMA/odp: Iterate over the whole rbtree directly RDMA/odp: Use the common interval tree library instead of generic RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-21xprtrdma: Clean up xprt_rdma_set_connect_timeout()Chuck Lever
Clean up: The function name should match the documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Use an llist to manage free rpcrdma_repsChuck Lever
rpcrdma_rep objects are removed from their free list by only a single thread: the Receive completion handler. Thus that free list can be converted to an llist, where a single-threaded consumer and a multi-threaded producer (rpcrdma_buffer_put) can both access the llist without the need for any serialization. This eliminates spin lock contention between the Receive completion handler and rpcrdma_buffer_get, and makes the rep consumer wait- free. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Remove rpcrdma_buffer::rb_mrlockChuck Lever
Clean up: Now that the free list is used sparingly, get rid of the separate spin lock protecting it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xprtrdma: Cache free MRs in each rpcrdma_reqChuck Lever
Instead of a globally-contended MR free list, cache MRs in each rpcrdma_req as they are released. This means acquiring and releasing an MR will be lock-free in the common case, even outside the transport send lock. The original idea of per-rpcrdma_req MR free lists was suggested by Shirley Ma <shirley.ma@oracle.com> several years ago. I just now figured out how to make that idea work with on-demand MR allocation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21xdp: xdp_umem: replace kmap on vmap for umem mapIvan Khoronzhuk
For 64-bit there is no reason to use vmap/vunmap, so use page_address as it was initially. For 32 bits, in some apps, like in samples xdpsock_user.c when number of pgs in use is quite big, the kmap memory can be not enough, despite on this, kmap looks like is deprecated in such cases as it can block and should be used rather for dynamic mm. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-21mac80211: minstrel_ht: improve rate probing for devices with static fallbackFelix Fietkau
On some devices that only support static rate fallback tables sending rate control probing packets can be really expensive. Probing lower rates can already hurt throughput quite a bit. What hurts even more is the fact that on mt76x0/mt76x2, single probing packets can only be forced by directing packets at a different internal hardware queue, which causes some heavy reordering and extra latency. The reordering issue is mainly problematic while pushing lots of packets to a particular station. If there is little activity, the overhead of probing is neglegible. The static fallback behavior is designed to pretty much only handle rate control algorithms that use only a very limited set of rates on which the algorithm switches up/down based on packet error rate. In order to better support that kind of hardware, this patch implements a different approach to rate probing where it switches to a slightly higher rate, waits for tx status feedback, then updates the stats and switches back to the new max throughput rate. This only triggers above a packet rate of 100 per stats interval (~50ms). For that kind of probing, the code has to reduce the set of probing rates a lot more compared to single packet probing, so it uses only one packet per MCS group which is either slightly faster, or as close as possible to the max throughput rate. This allows switching between similar rates with different numbers of streams. The algorithm assumes that the hardware will work its way lower within an MCS group in case of retransmissions, so that lower rates don't have to be probed by the high packets per second rate probing code. To further reduce the search space, it also does not probe rates with lower channel bandwidth than the max throughput rate. At the moment, these changes will only affect mt76x0/mt76x2. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-4-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: fix default max throughput rate indexesFelix Fietkau
Use the first supported rate instead of 0 (which can be invalid) Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-3-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: reduce unnecessary rate probing attemptsFelix Fietkau
On hardware with static fallback tables (e.g. mt76x2), rate probing attempts can be very expensive. On such devices, avoid sampling rates slower than the per-group max throughput rate, based on the assumption that the fallback table will take care of probing lower rates within that group if the higher rates fail. To further reduce unnecessary probing attempts, skip duplicate attempts on rates slower than the max throughput rate. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: minstrel_ht: fix per-group max throughput rate initializationFelix Fietkau
The group number needs to be multiplied by the number of rates per group to get the full rate index Fixes: 5935839ad735 ("mac80211: improve minstrel_ht rate sorting by throughput & probability") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190820095449.45255-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21nl80211: Add support for EDMG channelsAlexei Avshalom Lazar
802.11ay specification defines Enhanced Directional Multi-Gigabit (EDMG) STA and AP which allow channel bonding of 2 channels and more. Introduce new NL attributes that are needed for enabling and configuring EDMG support. Two new attributes are used by kernel to publish driver's EDMG capabilities to the userspace: NL80211_BAND_ATTR_EDMG_CHANNELS - bitmap field that indicates the 2.16 GHz channel(s) that are supported by the driver. When this attribute is not set it means driver does not support EDMG. NL80211_BAND_ATTR_EDMG_BW_CONFIG - represent the channel bandwidth configurations supported by the driver. Additional two new attributes are used by the userspace for connect command and for AP configuration: NL80211_ATTR_WIPHY_EDMG_CHANNELS NL80211_ATTR_WIPHY_EDMG_BW_CONFIG New rate info flag - RATE_INFO_FLAGS_EDMG, can be reported from driver and used for bitrate calculation that will take into account EDMG according to the 802.11ay specification. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Link: https://lore.kernel.org/r/1566138918-3823-2-git-send-email-ailizaro@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: fix possible NULL pointerderef in obss pd codeJohn Crispin
he_spr_ie_elem is dereferenced before the NULL check. fix this by moving the assignment after the check. fixes commit 697f6c507c74 ("mac80211: propagate HE operation info into bss_conf") This was reported by the static code checker. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190813070712.25509-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21mac80211: add assoc-at supportBen Greear
Report timestamp for when sta becomes associated. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20190809180001.26393-2-greearb@candelatech.com [fix ktime_get_boot_ns() to ktime_get_boottime_ns(), assoc_at type to u64] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: Support assoc-at timer in sta-infoBen Greear
Report timestamp of when sta became associated. This is the boottime clock, units are nano-seconds. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20190809180001.26393-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: apply same mandatory rate flags for 5GHz and 6GHzArend van Spriel
For the new 6GHz band the same rules apply for mandatory rates so add it to set_mandatory_flags_band() function. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-9-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: ibss: use 11a mandatory rates for 6GHz band operationArend van Spriel
The default mandatory rates, ie. when not specified by user-space, is determined by the band. Select 11a rateset for 6GHz band. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-8-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: use same IR permissive rules for 6GHz bandArend van Spriel
The function cfg80211_ir_permissive_chan() is applicable for 6GHz band as well so make sure it is handled. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-7-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: add 6GHz in code handling array with NUM_NL80211_BANDS entriesArend van Spriel
In nl80211.c there is a policy for all bands in NUM_NL80211_BANDS and in trace.h there is a callback trace for multicast rates which is per band in NUM_NL80211_BANDS. Both need to be extended for the new NL80211_BAND_6GHZ. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-6-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: extend ieee80211_operating_class_to_band() for 6GHzArend van Spriel
Add 6GHz operating class range as defined in 802.11ax D4.1 Annex E. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-5-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: util: add 6GHz channel to freq conversion and vice versaArend van Spriel
Extend the functions ieee80211_channel_to_frequency() and ieee80211_frequency_to_channel() to support 6GHz band according specification in 802.11ax D4.1 27.3.22.2. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-4-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21cfg80211: add 6GHz UNII band definitionsArend van Spriel
For the new 6GHz there are new UNII band definitions as listed in the FCC notice [1]. [1] https://docs.fcc.gov/public/attachments/FCC-18-147A1_Rcd.pdf Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-3-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-08-21nl80211: add 6GHz band definition to enum nl80211_bandArend van Spriel
In the 802.11ax specification a new band is introduced, which is also proposed by FCC for unlicensed use. This band is referred to as 6GHz spanning frequency range from 5925 to 7125 MHz. Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Leon Zegers <leon.zegers@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1564745465-21234-2-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>