summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-09-10sctp: Use skb_queue_is_first().David S. Miller
Instead of direct skb_queue_head pointer accesses. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10mac80211: Don't access sk_queue_head->next directly.David S. Miller
Use __skb_peek() instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10sch_netem: Move private queue handler to generic location.David S. Miller
By hand copies of SKB list handlers do not belong in individual packet schedulers. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10sch_htb: Remove local SKB queue handling code.David S. Miller
Instead, adjust __qdisc_enqueue_tail() such that HTB can use it instead. The only other caller of __qdisc_enqueue_tail() is qdisc_enqueue_tail() so we can move the backlog and return value handling (which HTB doesn't need/want) to the latter. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10net/ipv6: Remove rt6i_prefsrcDavid Ahern
After the conversion to fib6_info, rt6i_prefsrc has a single user that reads the value and otherwise it is only set. The one reader can be converted to use rt->from so rt6i_prefsrc can be removed, reducing rt6_info by another 20 bytes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10hidp: fix compat_ioctlAl Viro
1) no point putting it into fs/compat_ioctl.c when you handle it in your ->compat_ioctl() anyway. 2) HIDPCONNADD is *not* COMPATIBLE_IOCTL() stuff at all - it does layout massage (pointer-chasing there) 3) use compat_ptr() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-09-10hidp: constify hidp_connection_add()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-09-10cmtp: fix compat_ioctlAl Viro
Use compat_ptr(). And don't mess with fs/compat_ioctl.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-09-10bnep: fix compat_ioctlAl Viro
use compat_ptr() properly and don't bother with fs/compat_ioctl.c - it's all handled in ->compat_ioctl() anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-09-10mac80211: fix TX status reporting for ieee80211sYuan-Chi Pang
TX status reporting to ieee80211s is through ieee80211s_update_metric. There are two problems about ieee80211s_update_metric: 1. The purpose is to estimate the fail probability to a specific link. No need to restrict to data frame. 2. Current implementation does not work if wireless driver does not pass tx_status with skb. Fix this by removing ieee80211_is_data condition, passing ieee80211_tx_status directly to ieee80211s_update_metric, and putting it in both __ieee80211_tx_status and ieee80211_tx_status_ext. Signed-off-by: Yuan-Chi Pang <fu3mo6goo@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-10mac80211: TDLS: fix skb queue/priority assignmentJohannes Berg
If the TDLS setup happens over a connection to an AP that doesn't have QoS, we nevertheless assign a non-zero TID (skb->priority) and queue mapping, which may confuse us or drivers later. Fix it by just assigning the special skb->priority and then using ieee80211_select_queue() just like other data frames would go through. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-10cfg80211: Address some corner cases in scan result channel updatingJouni Malinen
cfg80211_get_bss_channel() is used to update the RX channel based on the available frame payload information (channel number from DSSS Parameter Set element or HT Operation element). This is needed on 2.4 GHz channels where frames may be received on neighboring channels due to overlapping frequency range. This might of some use on the 5 GHz band in some corner cases, but things are more complex there since there is no n:1 or 1:n mapping between channel numbers and frequencies due to multiple different starting frequencies in different operating classes. This could result in ieee80211_channel_to_frequency() returning incorrect frequency and ieee80211_get_channel() returning incorrect channel information (or indication of no match). In the previous implementation, this could result in some scan results being dropped completely, e.g., for the 4.9 GHz channels. That prevented connection to such BSSs. Fix this by using the driver-provided channel pointer if ieee80211_get_channel() does not find matching channel data for the channel number in the frame payload and if the scan is done with 5 MHz or 10 MHz channel bandwidth. While doing this, also add comments describing what the function is trying to achieve to make it easier to understand what happens here and why. Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-09ip: frags: fix crash in ip_do_fragment()Taehee Yoo
A kernel crash occurrs when defragmented packet is fragmented in ip_do_fragment(). In defragment routine, skb_orphan() is called and skb->ip_defrag_offset is set. but skb->sk and skb->ip_defrag_offset are same union member. so that frag->sk is not NULL. Hence crash occurrs in skb->sk check routine in ip_do_fragment() when defragmented packet is fragmented. test commands: %iptables -t nat -I POSTROUTING -j MASQUERADE %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 splat looks like: [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 [ 261.198158] Call Trace: [ 261.199018] ? dst_output+0x180/0x180 [ 261.205011] ? save_trace+0x300/0x300 [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 [ 261.213034] ? sched_clock_local+0xd4/0x140 [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 [ 261.227014] ? find_held_lock+0x39/0x1c0 [ 261.233008] ip_finish_output+0x51d/0xb50 [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] [ 261.250152] ? rcu_is_watching+0x77/0x120 [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] [ 261.261033] ? nf_hook_slow+0xb1/0x160 [ 261.265007] ip_output+0x1c7/0x710 [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 [ 261.282996] ? nf_hook_slow+0xb1/0x160 [ 261.287007] raw_sendmsg+0x21f9/0x4420 [ 261.291008] ? dst_output+0x180/0x180 [ 261.297003] ? sched_clock_cpu+0x126/0x170 [ 261.301003] ? find_held_lock+0x39/0x1c0 [ 261.306155] ? stop_critical_timings+0x420/0x420 [ 261.311004] ? check_flags.part.36+0x450/0x450 [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 [ 261.326142] ? cyc2ns_read_end+0x10/0x10 [ 261.330139] ? raw_bind+0x280/0x280 [ 261.334138] ? sched_clock_cpu+0x126/0x170 [ 261.338995] ? check_flags.part.36+0x450/0x450 [ 261.342991] ? __lock_acquire+0x4500/0x4500 [ 261.348994] ? inet_sendmsg+0x11c/0x500 [ 261.352989] ? dst_output+0x180/0x180 [ 261.357012] inet_sendmsg+0x11c/0x500 [ ... ] v2: - clear skb->sk at reassembly routine.(Eric Dumarzet) Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPCVakul Garg
tls_sw_sendmsg() allocates plaintext and encrypted SG entries using function sk_alloc_sg(). In case the number of SG entries hit MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for current SG index to '0'. This leads to calling of function tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes kernel crash. To fix this, set the number of SG elements to the number of elements in plaintext/encrypted SG arrays in case sk_alloc_sg() returns -ENOSPC. Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-08net: sched: act_nat: remove dependency on rtnl lockVlad Buslov
According to the new locking rule, we have to take tcf_lock for both ->init() and ->dump(), as RTNL will be removed. Use tcf spinlock to protect private nat action data from concurrent modification during dump. (nat init already uses tcf spinlock when changing action state) Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-08net: sched: act_skbedit: remove dependency on rtnl lockVlad Buslov
According to the new locking rule, we have to take tcf_lock for both ->init() and ->dump(), as RTNL will be removed. Use tcf lock to protect skbedit action struct private data from concurrent modification in init and dump. Use rcu swap operation to reassign params pointer under protection of tcf lock. (old params value is not used by init, so there is no need of standalone rcu dereference step) Remove rtnl lock assertion that is no longer required. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07tcp: really ignore MSG_ZEROCOPY if no SO_ZEROCOPYVincent Whitchurch
According to the documentation in msg_zerocopy.rst, the SO_ZEROCOPY flag was introduced because send(2) ignores unknown message flags and any legacy application which was accidentally passing the equivalent of MSG_ZEROCOPY earlier should not see any new behaviour. Before commit f214f915e7db ("tcp: enable MSG_ZEROCOPY"), a send(2) call which passed the equivalent of MSG_ZEROCOPY without setting SO_ZEROCOPY would succeed. However, after that commit, it fails with -ENOBUFS. So it appears that the SO_ZEROCOPY flag fails to fulfill its intended purpose. Fix it. Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07net_sched: properly cancel netlink dump on failureCong Wang
When nla_put*() fails after nla_nest_start(), we need to call nla_nest_cancel() to cancel the message, otherwise we end up calling nla_nest_end() like a success. Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key") Cc: Davide Caratti <dcaratti@redhat.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07net: dsa: Expose tagging protocol to user-spaceFlorian Fainelli
There is no way for user-space to know what a given DSA network device's tagging protocol is. Expose this information through a dsa/tagging attribute which reflects the tagging protocol currently in use. This is helpful for configuration (e.g: none behaves dramatically different wrt. bridges) as well as for packet capture tools when there is not a proper Ethernet type available. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07batman-adv: fix hardif_neigh refcount on queue_work() failureMarek Lindner
The hardif_neigh refcounter is to be decreased by the queued work and currently is never decreased if the queue_work() call fails. Fix by checking the queue_work() return value and decrease refcount if necessary. Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-07batman-adv: fix backbone_gw refcount on queue_work() failureMarek Lindner
The backbone_gw refcounter is to be decreased by the queued work and currently is never decreased if the queue_work() call fails. Fix by checking the queue_work() return value and decrease refcount if necessary. Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06xdp: split code for map vs non-map redirectJesper Dangaard Brouer
The compiler does an efficient job of inlining static C functions. Perf top clearly shows that almost everything gets inlined into the function call xdp_do_redirect. The function xdp_do_redirect end-up containing and interleaving the map and non-map redirect code. This is sub-optimal, as it would be strange for an XDP program to use both types of redirect in the same program. The two use-cases are separate, and interleaving the code just cause more instruction-cache pressure. I would like to stress (again) that the non-map variant bpf_redirect is very slow compared to the bpf_redirect_map variant, approx half the speed. Measured with driver i40e the difference is: - map redirect: 13,250,350 pps - non-map redirect: 7,491,425 pps For this reason, the function name of the non-map variant of redirect have been called xdp_do_redirect_slow. This hopefully gives a hint when using perf, that this is not the optimal XDP redirect operating mode. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-06xdp: explicit inline __xdp_map_lookup_elemJesper Dangaard Brouer
The compiler chooses to not-inline the function __xdp_map_lookup_elem, because it can see that it is used by both Generic-XDP and native-XDP do redirect calls (xdp_do_generic_redirect_map and xdp_do_redirect_map). The compiler cannot know that this is a bad choice, as it cannot know that a net device cannot run both XDP modes (Generic or Native) at the same time. Thus, mark this function inline, even-though we normally leave this up-to the compiler. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-06xdp: unlikely instrumentation for xdp map redirectJesper Dangaard Brouer
Notice the compiler generated ASM code layout was suboptimal. It assumed map enqueue errors as the likely case, which is shouldn't. It assumed that xdp_do_flush_map() was a likely case, due to maps changing between packets, which should be very unlikely. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-09-06tipc: call start and done ops directly in __tipc_nl_compat_dumpit()Cong Wang
__tipc_nl_compat_dumpit() uses a netlink_callback on stack, so the only way to align it with other ->dumpit() call path is calling tipc_dump_start() and tipc_dump_done() directly inside it. Otherwise ->dumpit() would always get NULL from cb->args[]. But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve net pointer, the cb->skb here doesn't set skb->sk, the net pointer is saved in msg->net instead, so introduce a helper function __tipc_dump_start() to pass in msg->net. Ying pointed out cb->args[0...3] are already used by other callbacks on this call path, so we can't use cb->args[0] any more, use cb->args[4] instead. Fixes: 9a07efa9aea2 ("tipc: switch to rhashtable iterator") Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-06openvswitch: Derive IP protocol number for IPv6 later fragsYi-Hung Wei
Currently, OVS only parses the IP protocol number for the first IPv6 fragment, but sets the IP protocol number for the later fragments to be NEXTHDF_FRAGMENT. This patch tries to derive the IP protocol number for the IPV6 later frags so that we can match that. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-06batman-adv: Prevent duplicated tvlv handlerSven Eckelmann
The function batadv_tvlv_handler_register is responsible for adding new tvlv_handler to the handler_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: ef26157747d4 ("batman-adv: tvlv - basic infrastructure") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Prevent duplicated global TT entrySven Eckelmann
The function batadv_tt_global_orig_entry_add is responsible for adding new tt_orig_list_entry to the orig_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: d657e621a0f5 ("batman-adv: add reference counting for type batadv_tt_orig_list_entry") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Prevent duplicated softif_vlan entrySven Eckelmann
The function batadv_softif_vlan_get is responsible for adding new softif_vlan to the softif_vlan_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: 5d2c05b21337 ("batman-adv: add per VLAN interface attribute framework") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Prevent duplicated nc_node entrySven Eckelmann
The function batadv_nc_get_nc_node is responsible for adding new nc_nodes to the in_coding_list and out_coding_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: d56b1705e28c ("batman-adv: network coding - detect coding nodes and remove these after timeout") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Prevent duplicated gateway_node entrySven Eckelmann
The function batadv_gw_node_add is responsible for adding new gw_node to the gateway_list. It is expecting that the caller already checked that there is not already an entry with the same key or not. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Fix segfault when writing to sysfs elp_intervalSven Eckelmann
The per hardif sysfs file "batman_adv/elp_interval" is using the generic functions to store/show uint values. The helper __batadv_store_uint_attr requires the softif net_device as parameter to print the resulting change as info text when the users writes to this file. It uses the helper function batadv_info to add it at the same time to the kernel ring buffer and to the batman-adv debug log (when CONFIG_BATMAN_ADV_DEBUG is enabled). The function batadv_info requires as first parameter the batman-adv softif net_device. This parameter is then used to find the private buffer which contains the debug log for this batman-adv interface. But batadv_store_throughput_override used as first argument the slave net_device. This slave device doesn't have the batadv_priv private data which is access by batadv_info. Writing to this file with CONFIG_BATMAN_ADV_DEBUG enabled can either lead to a segfault or to memory corruption. Fixes: 0744ff8fa8fa ("batman-adv: Add hard_iface specific sysfs wrapper macros for UINT") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Fix segfault when writing to throughput_overrideSven Eckelmann
The per hardif sysfs file "batman_adv/throughput_override" prints the resulting change as info text when the users writes to this file. It uses the helper function batadv_info to add it at the same time to the kernel ring buffer and to the batman-adv debug log (when CONFIG_BATMAN_ADV_DEBUG is enabled). The function batadv_info requires as first parameter the batman-adv softif net_device. This parameter is then used to find the private buffer which contains the debug log for this batman-adv interface. But batadv_store_throughput_override used as first argument the slave net_device. This slave device doesn't have the batadv_priv private data which is access by batadv_info. Writing to this file with CONFIG_BATMAN_ADV_DEBUG enabled can either lead to a segfault or to memory corruption. Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-06batman-adv: Avoid probe ELP information leakSven Eckelmann
The probe ELPs for WiFi interfaces are expanded to contain at least BATADV_ELP_MIN_PROBE_SIZE bytes. This is usually a lot more than the number of bytes which the template ELP packet requires. These extra padding bytes were not initialized and thus could contain data which were previously stored at the same location. It is therefore required to set it to some predefined or random values to avoid leaking private information from the system transmitting these kind of packets. Fixes: e4623c913508 ("batman-adv: Avoid probe ELP information leak") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-09-05net/iucv: declare iucv_path_table_empty() as staticJulian Wiedmann
Fixes a compile warning. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05net/af_iucv: fix skb handling on HiperTransport xmit errorJulian Wiedmann
When sending an skb, afiucv_hs_send() bails out on various error conditions. But currently the caller has no way of telling whether the skb was freed or not - resulting in potentially either a) leaked skbs from iucv_send_ctrl(), or b) double-free's from iucv_sock_sendmsg(). As dev_queue_xmit() will always consume the skb (even on error), be consistent and also free the skb from all other error paths. This way callers no longer need to care about managing the skb. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05net/af_iucv: drop inbound packets with invalid flagsJulian Wiedmann
Inbound packets may have any combination of flag bits set in their iucv header. If we don't know how to handle a specific combination, drop the skb instead of leaking it. To clarify what error is returned in this case, replace the hard-coded 0 with the corresponding macro. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05ipv6: add inet6_fill_argsChristian Brauner
inet6_fill_if{addr,mcaddr, acaddr}() already took 6 arguments which meant the 7th argument would need to be pushed onto the stack on x86. Add a new struct inet6_fill_args which holds common information passed to inet6_fill_if{addr,mcaddr, acaddr}() and shortens the functions to three pointer arguments. Signed-off-by: Christian Brauner <christian@brauner.io> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05ipv4: add inet_fill_argsChristian Brauner
inet_fill_ifaddr() already took 6 arguments which meant the 7th argument would need to be pushed onto the stack on x86. Add a new struct inet_fill_args which holds common information passed to inet_fill_ifaddr() and shortens the function to three pointer arguments. Signed-off-by: Christian Brauner <christian@brauner.io> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05rtnetlink: s/IFLA_IF_NETNSID/IFLA_TARGET_NETNSID/gChristian Brauner
IFLA_TARGET_NETNSID is the new alias for IFLA_IF_NETNSID. This commit replaces all occurrences of IFLA_IF_NETNSID with the new alias to indicate that this identifier is the preferred one. Signed-off-by: Christian Brauner <christian@brauner.io> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05rtnetlink: move type calculation out of loopChristian Brauner
I don't see how the type - which is one of RTM_{GETADDR,GETROUTE,GETNETCONF} - can change. So do the message type calculation once before entering the for loop. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDRChristian Brauner
- Backwards Compatibility: If userspace wants to determine whether ipv6 RTM_GETADDR requests support the new IFA_TARGET_NETNSID property it should verify that the reply includes the IFA_TARGET_NETNSID property. If it does not userspace should assume that IFA_TARGET_NETNSID is not supported for ipv6 RTM_GETADDR requests on this kernel. - From what I gather from current userspace tools that make use of RTM_GETADDR requests some of them pass down struct ifinfomsg when they should actually pass down struct ifaddrmsg. To not break existing tools that pass down the wrong struct we will do the same as for RTM_GETLINK | NLM_F_DUMP requests and not error out when the nlmsg_parse() fails. - Security: Callers must have CAP_NET_ADMIN in the owning user namespace of the target network namespace. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05ipv4: enable IFA_TARGET_NETNSID for RTM_GETADDRChristian Brauner
- Backwards Compatibility: If userspace wants to determine whether ipv4 RTM_GETADDR requests support the new IFA_TARGET_NETNSID property it should verify that the reply includes the IFA_TARGET_NETNSID property. If it does not userspace should assume that IFA_TARGET_NETNSID is not supported for ipv4 RTM_GETADDR requests on this kernel. - From what I gather from current userspace tools that make use of RTM_GETADDR requests some of them pass down struct ifinfomsg when they should actually pass down struct ifaddrmsg. To not break existing tools that pass down the wrong struct we will do the same as for RTM_GETLINK | NLM_F_DUMP requests and not error out when the nlmsg_parse() fails. - Security: Callers must have CAP_NET_ADMIN in the owning user namespace of the target network namespace. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05rtnetlink: add rtnl_get_net_ns_capable()Christian Brauner
get_target_net() will be used in follow-up patches in ipv{4,6} codepaths to retrieve network namespaces based on network namespace identifiers. So remove the static declaration and export in the rtnetlink header. Also, rename it to rtnl_get_net_ns_capable() to make it obvious what this function is doing. Export rtnl_get_net_ns_capable() so it can be used when ipv6 is built as a module. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05net/sched: fix memory leak in act_tunnel_key_init()Davide Caratti
If users try to install act_tunnel_key 'set' rules with duplicate values of 'index', the tunnel metadata are allocated, but never released. Then, kmemleak complains as follows: # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111 # echo clear > /sys/kernel/debug/kmemleak # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111 Error: TC IDR already exists. We have an error talking to the kernel # echo scan > /sys/kernel/debug/kmemleak # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8800574e6c80 (size 256): comm "tc", pid 5617, jiffies 4298118009 (age 57.990s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff ................ 81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00 .$.............. backtrace: [<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key] [<000000007d98fccd>] tcf_action_init_1+0x698/0xac0 [<0000000099b8f7cc>] tcf_action_init+0x15c/0x590 [<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2 [<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0 [<000000000bfe7575>] netlink_rcv_skb+0x124/0x350 [<00000000edab656f>] netlink_unicast+0x40f/0x5d0 [<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0 [<0000000063d9d490>] sock_sendmsg+0xb3/0xf0 [<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960 [<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170 [<00000000ce72e4b0>] do_syscall_64+0xa5/0x470 [<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<00000000fac1b476>] 0xffffffffffffffff This problem theoretically happens also in case users attempt to setup a geneve rule having wrong configuration data, or when the kernel fails to allocate 'params_new'. Ensure that tunnel_key_init() releases the tunnel metadata also in the above conditions. Addresses-Coverity-ID: 1373974 ("Resource leak") Fixes: d0f6dd8a914f4 ("net/sched: Introduce act_tunnel_key") Fixes: 0ed5269f9e41f ("net/sched: add tunnel option support to act_tunnel_key") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05tipc: orphan sock in tipc_release()Cong Wang
Before we unlock the sock in tipc_release(), we have to detach sk->sk_socket from sk, otherwise a parallel tipc_sk_fill_sock_diag() could stil read it after we free this socket. Fixes: c30b70deb5f4 ("tipc: implement socket diagnostics for AF_TIPC") Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05netlink: Make groups check less stupid in netlink_bind()Dmitry Safonov
As Linus noted, the test for 0 is needless, groups type can follow the usual kernel style and 8*sizeof(unsigned long) is BITS_PER_LONG: > The code [..] isn't technically incorrect... > But it is stupid. > Why stupid? Because the test for 0 is pointless. > > Just doing > if (nlk->ngroups < 8*sizeof(groups)) > groups &= (1UL << nlk->ngroups) - 1; > > would have been fine and more understandable, since the "mask by shift > count" already does the right thing for a ngroups value of 0. Now that > test for zero makes me go "what's special about zero?". It turns out > that the answer to that is "nothing". [..] > The type of "groups" is kind of silly too. > > Yeah, "long unsigned int" isn't _technically_ wrong. But we normally > call that type "unsigned long". Cleanup my piece of pointlessness. Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Fairly-blamed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05packet: add sockopt to ignore outgoing packetsVincent Whitchurch
Currently, the only way to ignore outgoing packets on a packet socket is via the BPF filter. With MSG_ZEROCOPY, packets that are looped into AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even if the filter run from packet_rcv() would reject them. So the presence of a packet socket on the interface takes away the benefits of MSG_ZEROCOPY, even if the packet socket is not interested in outgoing packets. (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily cloned, but the cost for that is much lower.) Add a socket option to allow AF_PACKET sockets to ignore outgoing packets to solve this. Note that the *BSDs already have something similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT. The first intended user is lldpd. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05mac80211: fix pending queue hang due to TX_DROPBob Copeland
In our environment running lots of mesh nodes, we are seeing the pending queue hang periodically, with the debugfs queues file showing lines such as: 00: 0x00000000/348 i.e. there are a large number of frames but no stop reason set. One way this could happen is if queue processing from the pending tasklet exited early without processing all frames, and without having some future event (incoming frame, stop reason flag, ...) to reschedule it. Exactly this can occur today if ieee80211_tx() returns false due to packet drops or power-save buffering in the tx handlers. In the past, this function would return true in such cases, and the change to false doesn't seem to be intentional. Fix this case by reverting to the previous behavior. Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue") Signed-off-by: Bob Copeland <bobcopeland@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-05cfg80211: validate wmm rule when settingStanislaw Gruszka
Add validation check for wmm rule when copy rules from fwdb and print error when rule is invalid. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>