summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-03-12net: rds: drop VLA in rds_walk_conn_path_info()Salvatore Mesoraca
Avoid VLA[1] by using an already allocated buffer passed by the caller. [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12net: rds: drop VLA in rds_for_each_conn_info()Salvatore Mesoraca
Avoid VLA[1] by using an already allocated buffer passed by the caller. [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fixed hashtable representation doesn't support timeout flag, skip it otherwise rules to add elements from the packet fail bogusly fail with EOPNOTSUPP. 2) Fix bogus error with 32-bits ebtables userspace and 64-bits kernel, patch from Florian Westphal. 3) Sanitize proc names in several x_tables extensions, also from Florian. 4) Add sanitization to ebt_among wormhash logic, from Florian. 5) Missing release of hook array in flowtable. ====================
2018-03-12net: Make RX-FCS and HW GRO mutually exclusiveGal Pressman
Same as LRO, hardware GRO cannot be enabled with RX-FCS. When both are requested, hardware GRO will be dropped. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12net: llc: drop VLA in llc_sap_mcast()Salvatore Mesoraca
Avoid a VLA[1] by using a real constant expression instead of a variable. The compiler should be able to optimize the original code and avoid using an actual VLA. Anyway this change is useful because it will avoid a false positive with -Wvla, it might also help the compiler generating better code. [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12rds: remove redundant variable 'sg_off'Colin Ian King
Variable sg_off is assigned a value but it is never read, hence it is redundant and can be removed. Cleans up clang warning: net/rds/message.c:373:2: warning: Value stored to 'sg_off' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12ipv6: Use ip6_multipath_hash_policy() in rt6_multipath_hash().David S. Miller
Make use of the new helper. Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12sock_diag: request _diag module only when the family or proto has been ↵Xin Long
registered Now when using 'ss' in iproute, kernel would try to load all _diag modules, which also causes corresponding family and proto modules to be loaded as well due to module dependencies. Like after running 'ss', sctp, dccp, af_packet (if it works as a module) would be loaded. For example: $ lsmod|grep sctp $ ss $ lsmod|grep sctp sctp_diag 16384 0 sctp 323584 5 sctp_diag inet_diag 24576 4 raw_diag,tcp_diag,sctp_diag,udp_diag libcrc32c 16384 3 nf_conntrack,nf_nat,sctp As these family and proto modules are loaded unintentionally, it could cause some problems, like: - Some debug tools use 'ss' to collect the socket info, which loads all those diag and family and protocol modules. It's noisy for identifying issues. - Users usually expect to drop sctp init packet silently when they have no sense of sctp protocol instead of sending abort back. - It wastes resources (especially with multiple netns), and SCTP module can't be unloaded once it's loaded. ... In short, it's really inappropriate to have these family and proto modules loaded unexpectedly when just doing debugging with inet_diag. This patch is to introduce sock_load_diag_module() where it loads the _diag module only when it's corresponding family or proto has been already registered. Note that we can't just load _diag module without the family or proto loaded, as some symbols used in _diag module are from the family or proto module. v1->v2: - move inet proto check to inet_diag to avoid a compiling err. v2->v3: - define sock_load_diag_module in sock.c and export one symbol only. - improve the changelog. Reported-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11openvswitch: meter: fix the incorrect calculation of max delta_tzhangliping
Max delat_t should be the full_bucket/rate instead of the full_bucket. Also report EINVAL if the rate is zero. Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: zhangliping <zhangliping02@baidu.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-11netfilter: nf_tables: release flowtable hooksPablo Neira Ayuso
Otherwise we leak this array. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-11netfilter: bridge: ebt_among: add more missing match size checksFlorian Westphal
ebt_among is special, it has a dynamic match size and is exempt from the central size checks. commit c4585a2823edf ("bridge: ebt_among: add missing match size checks") added validation for pool size, but missed fact that the macros ebt_among_wh_src/dst can already return out-of-bound result because they do not check value of wh_src/dst_ofs (an offset) vs. the size of the match that userspace gave to us. v2: check that offset has correct alignment. Paolo Abeni points out that we should also check that src/dst wormhash arrays do not overlap, and src + length lines up with start of dst (or vice versa). v3: compact wormhash_sizes_valid() part NB: Fixes tag is intentionally wrong, this bug exists from day one when match was added for 2.6 kernel. Tag is there so stable maintainers will notice this one too. Tested with same rules from the earlier patch. Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks") Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-11netfilter: x_tables: add and use xt_check_proc_nameFlorian Westphal
recent and hashlimit both create /proc files, but only check that name is 0 terminated. This can trigger WARN() from procfs when name is "" or "/". Add helper for this and then use it for both. Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-11netfilter: ebtables: fix erroneous reject of last ruleFlorian Westphal
The last rule in the blob has next_entry offset that is same as total size. This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on 64 bit kernel. Fixes: b71812168571fa ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-03-09ip6erspan: make sure enough headroom at xmit.William Tu
The patch adds skb_cow_header() to ensure enough headroom at ip6erspan_tunnel_xmit before pushing the erspan header to the skb. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09ip6erspan: improve error handling for erspan version number.William Tu
When users fill in incorrect erspan version number through the struct erspan_metadata uapi, current code skips pushing the erspan header but continue pushing the gre header, which is incorrect. The patch fixes it by returning error. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09ip6gre: add erspan v2 to tunnel lookupWilliam Tu
The patch adds the erspan v2 proto in ip6gre_tunnel_lookup so the erspan v2 tunnel can be found correctly. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: introduce IFF_NO_RX_HANDLERPaolo Abeni
Some network devices - notably ipvlan slave - are not compatible with any kind of rx_handler. Currently the hook can be installed but any configuration (bridge, bond, macsec, ...) is nonfunctional. This change allocates a priv_flag bit to mark such devices and explicitly forbid installing a rx_handler if such bit is set. The new bit is used by ipvlan slave device. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09pktgen: Remove VLA usageGustavo A. R. Silva
In preparation to enabling -Wvla, remove VLA usage and replace it with a fixed-length array instead. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: use skb_is_gso_sctp() instead of open-codingDaniel Axtens
As well as the basic conversion, I noticed that a lot of the SCTP code checks gso_type without first checking skb_is_gso() so I have added that where appropriate. Also, document the helper. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: implement get_fill_size routine in act_gactRoman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: calculate add/delete event message sizeRoman Mashak
Introduce routines to calculate size of the shared tc netlink attributes and the full message size including netlink header and tc service header. Update add/delete action logic to have the size for event messages, the size is passed to tcf_add_notify() and tcf_del_notify() where the notification message is being allocated and constructed. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: update Add/Delete action API with new argumentRoman Mashak
Introduce a new function argument to carry total attributes size for correct allocation of skb in event messages. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net: do not create fallback tunnels for non-default namespacesEric Dumazet
fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0, ip6tnl0, ip6gre0) are automatically created when the corresponding module is loaded. These tunnels are also automatically created when a new network namespace is created, at a great cost. In many cases, netns are used for isolation purposes, and these extra network devices are a waste of resources. We are using thousands of netns per host, and hit the netns creation/delete bottleneck a lot. (Many thanks to Kirill for recent work on this) Add a new sysctl so that we can opt-out from this automatic creation. Note that these tunnels are still created for the initial namespace, to be the least intrusive for typical setups. Tested: lpk43:~# cat add_del_unshare.sh for i in `seq 1 40` do (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) & done wait lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m37.521s user 0m0.886s sys 7m7.084s lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net lpk43:~# time ./add_del_unshare.sh real 0m4.761s user 0m0.851s sys 1m8.343s lpk43:~# Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09ieee802154: 6lowpan: fix possible NULL deref in lowpan_device_event()Eric Dumazet
A tun device type can trivially be set to arbitrary value using TUNSETLINK ioctl(). Therefore, lowpan_device_event() must really check that ieee802154_ptr is not NULL. Fixes: 2c88b5283f60d ("ieee802154: 6lowpan: remove check on null") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Aring <alex.aring@gmail.com> Cc: Stefan Schmidt <stefan@osg.samsung.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()Lorenzo Bianconi
Fix the following slab-out-of-bounds kasan report in ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not linear and the accessed data are not in the linear data region of orig_skb. [ 1503.122508] ================================================================== [ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990 [ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932 [ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124 [ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014 [ 1503.123527] Call Trace: [ 1503.123579] <IRQ> [ 1503.123638] print_address_description+0x6e/0x280 [ 1503.123849] kasan_report+0x233/0x350 [ 1503.123946] memcpy+0x1f/0x50 [ 1503.124037] ndisc_send_redirect+0x94e/0x990 [ 1503.125150] ip6_forward+0x1242/0x13b0 [...] [ 1503.153890] Allocated by task 1932: [ 1503.153982] kasan_kmalloc+0x9f/0xd0 [ 1503.154074] __kmalloc_track_caller+0xb5/0x160 [ 1503.154198] __kmalloc_reserve.isra.41+0x24/0x70 [ 1503.154324] __alloc_skb+0x130/0x3e0 [ 1503.154415] sctp_packet_transmit+0x21a/0x1810 [ 1503.154533] sctp_outq_flush+0xc14/0x1db0 [ 1503.154624] sctp_do_sm+0x34e/0x2740 [ 1503.154715] sctp_primitive_SEND+0x57/0x70 [ 1503.154807] sctp_sendmsg+0xaa6/0x1b10 [ 1503.154897] sock_sendmsg+0x68/0x80 [ 1503.154987] ___sys_sendmsg+0x431/0x4b0 [ 1503.155078] __sys_sendmsg+0xa4/0x130 [ 1503.155168] do_syscall_64+0x171/0x3f0 [ 1503.155259] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 1503.155436] Freed by task 1932: [ 1503.155527] __kasan_slab_free+0x134/0x180 [ 1503.155618] kfree+0xbc/0x180 [ 1503.155709] skb_release_data+0x27f/0x2c0 [ 1503.155800] consume_skb+0x94/0xe0 [ 1503.155889] sctp_chunk_put+0x1aa/0x1f0 [ 1503.155979] sctp_inq_pop+0x2f8/0x6e0 [ 1503.156070] sctp_assoc_bh_rcv+0x6a/0x230 [ 1503.156164] sctp_inq_push+0x117/0x150 [ 1503.156255] sctp_backlog_rcv+0xdf/0x4a0 [ 1503.156346] __release_sock+0x142/0x250 [ 1503.156436] release_sock+0x80/0x180 [ 1503.156526] sctp_sendmsg+0xbb0/0x1b10 [ 1503.156617] sock_sendmsg+0x68/0x80 [ 1503.156708] ___sys_sendmsg+0x431/0x4b0 [ 1503.156799] __sys_sendmsg+0xa4/0x130 [ 1503.156889] do_syscall_64+0x171/0x3f0 [ 1503.156980] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 1503.157158] The buggy address belongs to the object at ffff8800298ab600 which belongs to the cache kmalloc-1024 of size 1024 [ 1503.157444] The buggy address is located 176 bytes inside of 1024-byte region [ffff8800298ab600, ffff8800298aba00) [ 1503.157702] The buggy address belongs to the page: [ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0 [ 1503.158053] flags: 0x4000000000008100(slab|head) [ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e [ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000 [ 1503.158523] page dumped because: kasan: bad access detected [ 1503.158698] Memory state around the buggy address: [ 1503.158816] ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 1503.158988] ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1503.159338] ^ [ 1503.159436] ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1503.159610] ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 1503.159785] ================================================================== [ 1503.159964] Disabling lock debugging due to kernel taint The test scenario to trigger the issue consists of 4 devices: - H0: data sender, connected to LAN0 - H1: data receiver, connected to LAN1 - GW0 and GW1: routers between LAN0 and LAN1. Both of them have an ethernet connection on LAN0 and LAN1 On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for data from LAN0 to LAN1. Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send buffer size is set to 16K). While data streams are active flush the route cache on HA multiple times. I have not been able to identify a given commit that introduced the issue since, using the reproducer described above, the kasan report has been triggered from 4.14 and I have not gone back further. Reported-by: Jianlin Shi <jishi@redhat.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: ethtool: extend RXNFC API to support RSS spreading of filter matchesEdward Cree
We use a two-step process to configure a filter with RSS spreading. First, the RSS context is allocated and configured using ETHTOOL_SRSSH; this returns an identifier (rss_context) which can then be passed to subsequent invocations of ETHTOOL_SRXCLSRLINS to specify that the offset from the RSS indirection table lookup should be added to the queue number (ring_cookie) when delivering the packet. Drivers for devices which can only use the indirection table entry directly (not add it to a base queue number) should reject rule insertions combining RSS with a nonzero ring_cookie. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08rds: rds_info_from_znotifier() can be statickbuild test robot
Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08rds: rds_message_zcopy_from_user() can be statickbuild test robot
Fixes: d40a126b16ea ("rds: refactor zcopy code into rds_message_zcopy_from_user") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net/ncsi: unlock on error in ncsi_set_interface_nl()Dan Carpenter
There are two error paths which are missing unlocks in this function. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net/ncsi: use kfree_skb() instead of kfree()Dan Carpenter
We're supposed to use kfree_skb() to free these sk_buffs. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08openvswitch: fix vport packet length check.William Tu
When sending a packet to a tunnel device, the dev's hard_header_len could be larger than the skb->len in function packet_length(). In the case of ip6gretap/erspan, hard_header_len = LL_MAX_HEADER + t_hlen, which is around 180, and an ARP packet sent to this tunnel has skb->len = 42. This causes the 'unsign int length' to become super large because it is negative value, causing the later ovs_vport_send to drop it due to over-mtu size. The patch fixes it by setting it to 0. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convet ipv6_net_opsKirill Tkhai
These pernet_operations are similar to ipv4_net_ops. They are safe to be async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert ipv4_net_opsKirill Tkhai
These pernet_operations register and unregister bunch of nf_conntrack_l4proto. Exit method unregisters related sysctl, init method calls init_net and get_net_proto. The whole builtin_l4proto4 array has pretty simple init_net and get_net_proto methods. The first one register sysctl table, the second one is just RO memory dereference. So, these pernet_operations are safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_security_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_security table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_raw_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_raw table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_nat_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::nat_table table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert iptable_mangle_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::iptable_mangle table. Another net/pernet_operations do not send ipv4 packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert arptable_filter_net_opsKirill Tkhai
These pernet_operations unregister net::ipv4::arptable_filter. Another net/pernet_operations do not send arp packets to foreign net namespaces. So, we mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert pg_net_opsKirill Tkhai
These pernet_operations create per-net pktgen threads and /proc entries. These pernet subsys looks closed in itself, and there are no pernet_operations outside this file, which are interested in the threads. Init and/or exit methods look safe to be executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_queue_net_opsKirill Tkhai
These pernet_operations register and unregister net::nf::queue_handler and /proc entry. The handler is accessed only under RCU, so this looks safe to convert them. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_log_net_opsKirill Tkhai
These pernet_operations create and destroy /proc entries. Also, exit method unsets nfulnl_logger. The logger is not set by default, and it becomes bound via userspace request. So, they look safe to be made async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert cttimeout_opsKirill Tkhai
These pernet_operations also look closed in themself. Exit method touch only per-net structures, so it's safe to execute them for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnl_acct_opsKirill Tkhai
These pernet_operations look closed in themself, and there are no other users of net::nfnl_acct_list outside. They are safe to be executed for several net namespaces in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nfnetlink_net_opsKirill Tkhai
These pernet_operations create and destroy net::nfnl socket of NETLINK_NETFILTER code. There are no other places, where such type the socket is created, except these pernet_operations. It seem other pernet_operations depending on CONFIG_NETFILTER_NETLINK send messages to this socket. So, we mark it async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert nf_tables_net_opsKirill Tkhai
These pernet_operations looks nicely separated per-net. Exit method unregisters net's nf tables objects. We allow them be executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert xfrm_user_net_opsKirill Tkhai
These pernet_operations create and destroy net::xfrm::nlsk socket of NETLINK_XFRM. There is only entry point, where it's dereferenced, it's xfrm_user_rcv_msg(). There is no in-kernel senders to this socket. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net: Convert ip6 tables pernet_operationsKirill Tkhai
The pernet_operations: ip6table_filter_net_ops ip6table_mangle_net_ops ip6table_nat_net_ops ip6table_raw_net_ops ip6table_security_net_ops have exit methods, which call ip6t_unregister_table(). ip6table_filter_net_ops has init method registering filter table. Since there must not be in-flight ipv6 packets at the time of pernet_operations execution and since pernet_operations don't send ipv6 packets each other, these pernet_operations are safe to be async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08net/sched: cls_flower: Add support to handle first frag as match fieldPieter Jansen van Vuuren
Allow setting firstfrag as matching option in tc flower classifier. # tc filter add dev eth0 protocol ip parent ffff: \ flower indev eth0 \ ip_flags firstfrag action mirred egress redirect dev eth1 Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-08devlink: Change dpipe/resource get privilegesArkadi Sharshevsky
Let dpipe/resource be retrieved by unprivileged users. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-03-08 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix various BPF helpers which adjust the skb and its GSO information with regards to SCTP GSO. The latter is a special case where gso_size is of value GSO_BY_FRAGS, so mangling that will end up corrupting the skb, thus bail out when seeing SCTP GSO packets, from Daniel(s). 2) Fix a compilation error in bpftool where BPF_FS_MAGIC is not defined due to too old kernel headers in the system, from Jiri. 3) Increase the number of x64 JIT passes in order to allow larger images to converge instead of punting them to interpreter or having them rejected when the interpreter is not built into the kernel, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>