summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-04-24netfilter: merge meta_bridge into nft_metaFlorian Westphal
It overcomplicates things for no reason. nft_meta_bridge only offers retrieval of bridge port interface name. Because of this being its own module, we had to export all nft_meta functions, which we can then make static again (which even reduces the size of nft_meta -- including bridge port retrieval...): before: text data bss dec hex filename 1838 832 0 2670 a6e net/bridge/netfilter/nft_meta_bridge.ko 6147 936 1 7084 1bac net/netfilter/nft_meta.ko after: 5826 936 1 6763 1a6b net/netfilter/nft_meta.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_tables: always use an upper set size for dynsetsFlorian Westphal
nft rejects rules that lack a timeout and a size limit when they're used to add elements from packet path. Pick a sane upperlimit instead of rejecting outright. The upperlimit is visible to userspace, just as if it would have been given during set declaration. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_tables: support timeouts larger than 23 daysFlorian Westphal
Marco De Benedetto says: I would like to use a timeout of 30 days for elements in a set but it seems there is a some kind of problem above 24d20h31m23s. Fix this by using 'jiffies64' for timeout handling to get same behaviour on 32 and 64bit systems. nftables passes timeouts as u64 in milliseconds to the kernel, but on kernel side we used a mixture of 'long' and jiffies conversions rather than u64 and jiffies64. Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1237 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: xtables: use ipt_get_target_c instead of ipt_get_targetTaehee Yoo
ipt_get_target is used to get struct xt_entry_target and ipt_get_target_c is used to get const struct xt_entry_target. However in the ipt_do_table, ipt_get_target is used to get const struct xt_entry_target. it should be replaced by ipt_get_target_c. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: ebtables: add ebt_get_target and ebt_get_target_cTaehee Yoo
ebt_get_target similar to {ip/ip6/arp}t_get_target. and ebt_get_target_c similar to {ip/ip6/arp}t_get_target_c. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: x_tables: remove duplicate ip6t_get_target function callTaehee Yoo
In the check_target, ip6t_get_target is called twice. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: ebtables: remove EBT_MATCH and EBT_NOMATCHTaehee Yoo
EBT_MATCH and EBT_NOMATCH are used to change return value. match functions(ebt_xxx.c) return false when received frame is not matched and returns true when received frame is matched. but, EBT_MATCH_ITERATE understands oppositely. so, to change return value, EBT_MATCH and EBT_NOMATCH are used. but, we can use operation '!' simply. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: ebtables: add ebt_free_table_info functionTaehee Yoo
A ebt_free_table_info frees all of chainstacks. It similar to xt_free_table_info. this inline function reduces code line. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: add __exit mark to helper modulesTaehee Yoo
There are no __exit mark in the helper modules. because these exit functions used to be called by init function but now that is not. so we can add __exit mark. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: add NAT support for shifted portmap rangesThierry Du Tre
This is a patch proposal to support shifted ranges in portmaps. (i.e. tcp/udp incoming port 5000-5100 on WAN redirected to LAN 192.168.1.5:2000-2100) Currently DNAT only works for single port or identical port ranges. (i.e. ports 5000-5100 on WAN interface redirected to a LAN host while original destination port is not altered) When different port ranges are configured, either 'random' mode should be used, or else all incoming connections are mapped onto the first port in the redirect range. (in described example WAN:5000-5100 will all be mapped to 192.168.1.5:2000) This patch introduces a new mode indicated by flag NF_NAT_RANGE_PROTO_OFFSET which uses a base port value to calculate an offset with the destination port present in the incoming stream. That offset is then applied as index within the redirect port range (index modulo rangewidth to handle range overflow). In described example the base port would be 5000. An incoming stream with destination port 5004 would result in an offset value 4 which means that the NAT'ed stream will be using destination port 2004. Other possibilities include deterministic mapping of larger or multiple ranges to a smaller range : WAN:5000-5999 -> LAN:5000-5099 (maps WAN port 5*xx to port 51xx) This patch does not change any current behavior. It just adds new NAT proto range functionality which must be selected via the specific flag when intended to use. A patch for iptables (libipt_DNAT.c + libip6t_DNAT.c) will also be proposed which makes this functionality immediately available. Signed-off-by: Thierry Du Tre <thierry@dtsystems.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_tables: Simplify set backend selectionPhil Sutter
Drop nft_set_type's ability to act as a container of multiple backend implementations it chooses from. Instead consolidate the whole selection logic in nft_select_set_ops() and the actual backend provided estimate() callback. This turns nf_tables_set_types into a list containing all available backends which is traversed when selecting one matching userspace requested criteria. Also, this change allows to embed nft_set_ops structure into nft_set_type and pull flags field into the latter as it's only used during selection phase. A crucial part of this change is to make sure the new layout respects hash backend constraints formerly enforced by nft_hash_select_ops() function: This is achieved by introduction of a specific estimate() callback for nft_hash_fast_ops which returns false for key lengths != 4. In turn, nft_hash_estimate() is changed to return false for key lengths == 4 so it won't be chosen by accident. Also, both callbacks must return false for unbounded sets as their size estimate depends on a known maximum element count. Note that this patch partially reverts commit 4f2921ca21b71 ("netfilter: nf_tables: meter: pick a set backend that supports updates") by making nft_set_ops_candidate() not explicitly look for an update callback but make NFT_SET_EVAL a regular backend feature flag which is checked along with the others. This way all feature requirements are checked in one go. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_tables: initial support for extended ACK reportingPablo Neira Ayuso
Keep it simple to start with, just report attribute offsets that can be useful to userspace when representating errors to users. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_tables: simplify lookup functionsPablo Neira Ayuso
Replace the nf_tables_ prefix by nft_ and merge code into single lookup function whenever possible. In many cases we go over the 80-chars boundary function names, this save us ~50 LoC. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: fix offloading connections with SNAT+DNATFelix Fietkau
Pass all NAT types to the flow offload struct, otherwise parts of the address/port pair do not get translated properly, causing connection stalls Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: add missing condition for TCP state checkFelix Fietkau
Avoid looking at unrelated fields in UDP packets Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: tear down TCP flows if RST or FIN was seenFelix Fietkau
Allow the slow path to handle the shutdown of the connection with proper timeouts. The packet containing RST/FIN is also sent to the slow path and the TCP conntrack module will update its state. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: add support for sending flows back to the slow pathFelix Fietkau
Since conntrack hasn't seen any packets from the offloaded flow in a while, and the timeout for offloaded flows is set to an extremely long value, we need to fix up the state before we can send a flow back to the slow path. For TCP, reset td_maxwin in both directions, which makes it resync its state on the next packets. Use the regular timeout for TCP and UDP established connections. This allows the slow path to take over again once the offload state has been torn down Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: in flow_offload_lookup, skip entries being deletedFelix Fietkau
Preparation for sending flows back to the slow path Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: add a new flow state for tearing down offloadingFelix Fietkau
On cleanup, this will be treated differently from FLOW_OFFLOAD_DYING: If FLOW_OFFLOAD_DYING is set, the connection is going away, so both the offload state and the connection tracking entry will be deleted. If FLOW_OFFLOAD_TEARDOWN is set, the connection remains alive, but the offload state is torn down. This is useful for cases that require more complex state tracking / timeout handling on TCP, or if the connection has been idle for too long. Support for sending flows back to the slow path will be implemented in a following patch Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: make flow_offload_dead inlineFelix Fietkau
It is too trivial to keep as a separate exported function Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: track flow tables in nf_flow_table directlyFelix Fietkau
Avoids having nf_flow_table depend on nftables (useful for future iptables backport work) Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: fix priv pointer for netdev hookFelix Fietkau
The offload ip hook expects a pointer to the flowtable, not to the rhashtable. Since the rhashtable is the first member, this is safe for the moment, but breaks as soon as the structure layout changes Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: move init code to nf_flow_table_core.cFelix Fietkau
Reduces duplication of .gc and .params in flowtable type definitions and makes the API clearer Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: relax mixed ipv4/ipv6 flowtable dependenciesFelix Fietkau
Since the offload hook code was moved, this table no longer depends on the IPv4 and IPv6 flowtable modules Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: move ipv6 offload hook code to nf_flow_tableFelix Fietkau
Useful as preparation for adding iptables support for offload. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: move ip header check out of nf_flow_exceeds_mtuFelix Fietkau
Allows the function to be shared with the IPv6 hook code Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24netfilter: nf_flow_table: move ipv4 offload hook code to nf_flow_tableFelix Fietkau
Allows some minor code sharing with the ipv6 hook code and is also useful as preparation for adding iptables support for offload Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-23net: fib_rules: fix l3mdev netlink attr processingRoopa Prabhu
Fixes: b16fb418b1bf ("net: fib_rules: add extack support") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23l2tp: check sockaddr length in pppol2tp_connect()Guillaume Nault
Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that it actually points to valid data. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Reported-by: syzbot+a70ac890b23b1bf29f5c@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Fix SIP conntrack with phones sending session descriptions for different media types but same port numbers, from Florian Westphal. 2) Fix incorrect rtnl_lock mutex logic from IPVS sync thread, from Julian Anastasov. 3) Skip compat array allocation in ebtables if there is no entries, also from Florian. 4) Do not lose left/right bits when shifting marks from xt_connmark, from Jack Ma. 5) Silence false positive memleak in conntrack extensions, from Cong Wang. 6) Fix CONFIG_NF_REJECT_IPV6=m link problems, from Arnd Bergmann. 7) Cannot kfree rule that is already in list in nf_tables, switch order so this error handling is not required, from Florian Westphal. 8) Release set name in error path, from Florian. 9) include kmemleak.h in nf_conntrack_extend.c, from Stepheh Rothwell. 10) NAT chain and extensions depend on NF_TABLES. 11) Out of bound access when renaming chains, from Taehee Yoo. 12) Incorrect casting in xt_connmark leads to wrong bitshifting. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net/ipv6: Fix missing rcu dereferences on fromDavid Ahern
kbuild test robot reported 2 uses of rt->from not properly accessed using rcu_dereference: 1. add rcu_dereference_protected to rt6_remove_exception_rt and make sure it is always called with rcu lock held. 2. change rt6_do_redirect to take a reference on 'from' when accessed the first time so it can be used the sceond time outside of the lock Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net/ipv6: add rcu locking to ip6_negative_adviceDavid Ahern
syzbot reported a suspicious rcu_dereference_check: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x14a/0x153 kernel/locking/lockdep.c:4592 rt6_check_expired+0x38b/0x3e0 net/ipv6/route.c:410 ip6_negative_advice+0x67/0xc0 net/ipv6/route.c:2204 dst_negative_advice include/net/sock.h:1786 [inline] sock_setsockopt+0x138f/0x1fe0 net/core/sock.c:1051 __sys_setsockopt+0x2df/0x390 net/socket.c:1899 SYSC_setsockopt net/socket.c:1914 [inline] SyS_setsockopt+0x34/0x50 net/socket.c:1911 Add rcu locking around call to rt6_check_expired in ip6_negative_advice. Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Reported-by: syzbot+2422c9e35796659d2273@syzkaller.appspotmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net: ieee802154: 6lowpan: fix frag reassemblyAlexander Aring
This patch initialize stack variables which are used in frag_lowpan_compare_key to zero. In my case there are padding bytes in the structures ieee802154_addr as well in frag_lowpan_compare_key. Otherwise the key variable contains random bytes. The result is that a compare of two keys by memcmp works incorrect. Fixes: 648700f76b03 ("inet: frags: use rhashtables for reassembly units") Signed-off-by: Alexander Aring <aring@mojatatu.com> Reported-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
2018-04-23ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policyEric Dumazet
KMSAN reported use of uninit-value that I tracked to lack of proper size check on RTA_TABLE attribute. I also believe RTA_PREFSRC lacks a similar check. Fixes: 86872cb57925 ("[IPv6] route: FIB6 configuration using struct fib6_config") Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net: init sk_cookie for inet socketYafang Shao
With sk_cookie we can identify a socket, that is very helpful for traceing and statistic, i.e. tcp tracepiont and ebpf. So we'd better init it by default for inet socket. When using it, we just need call atomic64_read(&sk->sk_cookie). Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net: fib_rules: add extack supportRoopa Prabhu
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23fib_rules: move common handling of newrule delrule msgs into fib_nl2ruleRoopa Prabhu
This reduces code duplication in the fib rule add and del paths. Get rid of validate_rulemsg. This became obvious when adding duplicate extack support in fib newrule/delrule error paths. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23net: introduce a new tracepoint for tcp_rcv_space_adjustYafang Shao
tcp_rcv_space_adjust is called every time data is copied to user space, introducing a tcp tracepoint for which could show us when the packet is copied to user. When a tcp packet arrives, tcp_rcv_established() will be called and with the existed tracepoint tcp_probe we could get the time when this packet arrives. Then this packet will be copied to user, and tcp_rcv_space_adjust will be called and with this new introduced tracepoint we could get the time when this packet is copied to user. With these two tracepoints, we could figure out whether the user program processes this packet immediately or there's latency. Hence in the printk message, sk_cookie is printed as a key to relate tcp_rcv_space_adjust with tcp_probe. Maybe we could export sockfd in this new tracepoint as well, then we could relate this new tracepoint with epoll/read/recv* tracepoints, and finally that could show us the whole lifespan of this packet. But we could also implement that with pid as these functions are executed in process context. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23tcp: don't read out-of-bounds opsizeJann Horn
The old code reads the "opsize" variable from out-of-bounds memory (first byte behind the segment) if a broken TCP segment ends directly after an opcode that is neither EOL nor NOP. The result of the read isn't used for anything, so the worst thing that could theoretically happen is a pagefault; and since the physmap is usually mostly contiguous, even that seems pretty unlikely. The following C reproducer triggers the uninitialized read - however, you can't actually see anything happen unless you put something like a pr_warn() in tcp_parse_md5sig_option() to print the opsize. ==================================== #define _GNU_SOURCE #include <arpa/inet.h> #include <stdlib.h> #include <errno.h> #include <stdarg.h> #include <net/if.h> #include <linux/if.h> #include <linux/ip.h> #include <linux/tcp.h> #include <linux/in.h> #include <linux/if_tun.h> #include <err.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> #include <stdio.h> #include <unistd.h> #include <sys/ioctl.h> #include <assert.h> void systemf(const char *command, ...) { char *full_command; va_list ap; va_start(ap, command); if (vasprintf(&full_command, command, ap) == -1) err(1, "vasprintf"); va_end(ap); printf("systemf: <<<%s>>>\n", full_command); system(full_command); } char *devname; int tun_alloc(char *name) { int fd = open("/dev/net/tun", O_RDWR); if (fd == -1) err(1, "open tun dev"); static struct ifreq req = { .ifr_flags = IFF_TUN|IFF_NO_PI }; strcpy(req.ifr_name, name); if (ioctl(fd, TUNSETIFF, &req)) err(1, "TUNSETIFF"); devname = req.ifr_name; printf("device name: %s\n", devname); return fd; } #define IPADDR(a,b,c,d) (((a)<<0)+((b)<<8)+((c)<<16)+((d)<<24)) void sum_accumulate(unsigned int *sum, void *data, int len) { assert((len&2)==0); for (int i=0; i<len/2; i++) { *sum += ntohs(((unsigned short *)data)[i]); } } unsigned short sum_final(unsigned int sum) { sum = (sum >> 16) + (sum & 0xffff); sum = (sum >> 16) + (sum & 0xffff); return htons(~sum); } void fix_ip_sum(struct iphdr *ip) { unsigned int sum = 0; sum_accumulate(&sum, ip, sizeof(*ip)); ip->check = sum_final(sum); } void fix_tcp_sum(struct iphdr *ip, struct tcphdr *tcp) { unsigned int sum = 0; struct { unsigned int saddr; unsigned int daddr; unsigned char pad; unsigned char proto_num; unsigned short tcp_len; } fakehdr = { .saddr = ip->saddr, .daddr = ip->daddr, .proto_num = ip->protocol, .tcp_len = htons(ntohs(ip->tot_len) - ip->ihl*4) }; sum_accumulate(&sum, &fakehdr, sizeof(fakehdr)); sum_accumulate(&sum, tcp, tcp->doff*4); tcp->check = sum_final(sum); } int main(void) { int tun_fd = tun_alloc("inject_dev%d"); systemf("ip link set %s up", devname); systemf("ip addr add 192.168.42.1/24 dev %s", devname); struct { struct iphdr ip; struct tcphdr tcp; unsigned char tcp_opts[20]; } __attribute__((packed)) syn_packet = { .ip = { .ihl = sizeof(struct iphdr)/4, .version = 4, .tot_len = htons(sizeof(syn_packet)), .ttl = 30, .protocol = IPPROTO_TCP, /* FIXUP check */ .saddr = IPADDR(192,168,42,2), .daddr = IPADDR(192,168,42,1) }, .tcp = { .source = htons(1), .dest = htons(1337), .seq = 0x12345678, .doff = (sizeof(syn_packet.tcp)+sizeof(syn_packet.tcp_opts))/4, .syn = 1, .window = htons(64), .check = 0 /*FIXUP*/ }, .tcp_opts = { /* INVALID: trailing MD5SIG opcode after NOPs */ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 19 } }; fix_ip_sum(&syn_packet.ip); fix_tcp_sum(&syn_packet.ip, &syn_packet.tcp); while (1) { int write_res = write(tun_fd, &syn_packet, sizeof(syn_packet)); if (write_res != sizeof(syn_packet)) err(1, "packet write failed"); } } ==================================== Fixes: cfb6eeb4c860 ("[TCP]: MD5 Signature Option (RFC2385) support.") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22net: sched: ife: check on metadata lengthAlexander Aring
This patch checks if sk buffer is available to dererence ife header. If not then NULL will returned to signal an malformed ife packet. This avoids to crashing the kernel from outside. Signed-off-by: Alexander Aring <aring@mojatatu.com> Reviewed-by: Yotam Gigi <yotam.gi@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22net: sched: ife: handle malformed tlv lengthAlexander Aring
There is currently no handling to check on a invalid tlv length. This patch adds such handling to avoid killing the kernel with a malformed ife packet. Signed-off-by: Alexander Aring <aring@mojatatu.com> Reviewed-by: Yotam Gigi <yotam.gi@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22net: sched: ife: signal not finding metaidAlexander Aring
We need to record stats for received metadata that we dont know how to process. Have find_decode_metaid() return -ENOENT to capture this. Signed-off-by: Alexander Aring <aring@mojatatu.com> Reviewed-by: Yotam Gigi <yotam.gi@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22strparser: Do not call mod_delayed_work with a timeout of LONG_MAXDoron Roberts-Kedes
struct sock's sk_rcvtimeo is initialized to LONG_MAX/MAX_SCHEDULE_TIMEOUT in sock_init_data. Calling mod_delayed_work with a timeout of LONG_MAX causes spurious execution of the work function. timer->expires is set equal to jiffies + LONG_MAX. When timer_base->clk falls behind the current value of jiffies, the delta between timer_base->clk and jiffies + LONG_MAX causes the expiration to be in the past. Returning early from strp_start_timer if timeo == LONG_MAX solves this problem. Found while testing net/tls_sw recv path. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22ipv6: sr: fix NULL pointer dereference in seg6_do_srh_encap()- v4 pktsAhmed Abdelsalam
In case of seg6 in encap mode, seg6_do_srh_encap() calls set_tun_src() in order to set the src addr of outer IPv6 header. The net_device is required for set_tun_src(). However calling ip6_dst_idev() on dst_entry in case of IPv4 traffic results on the following bug. Using just dst->dev should fix this BUG. [ 196.242461] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 196.242975] PGD 800000010f076067 P4D 800000010f076067 PUD 10f060067 PMD 0 [ 196.243329] Oops: 0000 [#1] SMP PTI [ 196.243468] Modules linked in: nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd input_leds glue_helper led_class pcspkr serio_raw mac_hid video autofs4 hid_generic usbhid hid e1000 i2c_piix4 ahci pata_acpi libahci [ 196.244362] CPU: 2 PID: 1089 Comm: ping Not tainted 4.16.0+ #1 [ 196.244606] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 196.244968] RIP: 0010:seg6_do_srh_encap+0x1ac/0x300 [ 196.245236] RSP: 0018:ffffb2ce00b23a60 EFLAGS: 00010202 [ 196.245464] RAX: 0000000000000000 RBX: ffff8c7f53eea300 RCX: 0000000000000000 [ 196.245742] RDX: 0000f10000000000 RSI: ffff8c7f52085a6c RDI: ffff8c7f41166850 [ 196.246018] RBP: ffffb2ce00b23aa8 R08: 00000000000261e0 R09: ffff8c7f41166800 [ 196.246294] R10: ffffdce5040ac780 R11: ffff8c7f41166828 R12: ffff8c7f41166808 [ 196.246570] R13: ffff8c7f52085a44 R14: ffffffffb73211c0 R15: ffff8c7e69e44200 [ 196.246846] FS: 00007fc448789700(0000) GS:ffff8c7f59d00000(0000) knlGS:0000000000000000 [ 196.247286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 196.247526] CR2: 0000000000000000 CR3: 000000010f05a000 CR4: 00000000000406e0 [ 196.247804] Call Trace: [ 196.247972] seg6_do_srh+0x15b/0x1c0 [ 196.248156] seg6_output+0x3c/0x220 [ 196.248341] ? prandom_u32+0x14/0x20 [ 196.248526] ? ip_idents_reserve+0x6c/0x80 [ 196.248723] ? __ip_select_ident+0x90/0x100 [ 196.248923] ? ip_append_data.part.50+0x6c/0xd0 [ 196.249133] lwtunnel_output+0x44/0x70 [ 196.249328] ip_send_skb+0x15/0x40 [ 196.249515] raw_sendmsg+0x8c3/0xac0 [ 196.249701] ? _copy_from_user+0x2e/0x60 [ 196.249897] ? rw_copy_check_uvector+0x53/0x110 [ 196.250106] ? _copy_from_user+0x2e/0x60 [ 196.250299] ? copy_msghdr_from_user+0xce/0x140 [ 196.250508] sock_sendmsg+0x36/0x40 [ 196.250690] ___sys_sendmsg+0x292/0x2a0 [ 196.250881] ? _cond_resched+0x15/0x30 [ 196.251074] ? copy_termios+0x1e/0x70 [ 196.251261] ? _copy_to_user+0x22/0x30 [ 196.251575] ? tty_mode_ioctl+0x1c3/0x4e0 [ 196.251782] ? _cond_resched+0x15/0x30 [ 196.251972] ? mutex_lock+0xe/0x30 [ 196.252152] ? vvar_fault+0xd2/0x110 [ 196.252337] ? __do_fault+0x1f/0xc0 [ 196.252521] ? __handle_mm_fault+0xc1f/0x12d0 [ 196.252727] ? __sys_sendmsg+0x63/0xa0 [ 196.252919] __sys_sendmsg+0x63/0xa0 [ 196.253107] do_syscall_64+0x72/0x200 [ 196.253305] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 196.253530] RIP: 0033:0x7fc4480b0690 [ 196.253715] RSP: 002b:00007ffde9f252f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 196.254053] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007fc4480b0690 [ 196.254331] RDX: 0000000000000000 RSI: 000000000060a360 RDI: 0000000000000003 [ 196.254608] RBP: 00007ffde9f253f0 R08: 00000000002d1e81 R09: 0000000000000002 [ 196.254884] R10: 00007ffde9f250c0 R11: 0000000000000246 R12: 0000000000b22070 [ 196.255205] R13: 20c49ba5e353f7cf R14: 431bde82d7b634db R15: 00007ffde9f278fe [ 196.255484] Code: a5 0f b6 45 c0 41 88 41 28 41 0f b6 41 2c 48 c1 e0 04 49 8b 54 01 38 49 8b 44 01 30 49 89 51 20 49 89 41 18 48 8b 83 b0 00 00 00 <48> 8b 30 49 8b 86 08 0b 00 00 48 8b 40 20 48 8b 50 08 48 0b 10 [ 196.256190] RIP: seg6_do_srh_encap+0x1ac/0x300 RSP: ffffb2ce00b23a60 [ 196.256445] CR2: 0000000000000000 [ 196.256676] ---[ end trace 71af7d093603885c ]--- Fixes: 8936ef7604c11 ("ipv6: sr: fix NULL pointer dereference when setting encap source address") Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com> Acked-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22llc: fix NULL pointer deref for SOCK_ZAPPEDCong Wang
For SOCK_ZAPPED socket, we don't need to care about llc->sap, so we should just skip these refcount functions in this case. Fixes: f7e43672683b ("llc: hold llc_sap before release_sock()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22llc: delete timers synchronously in llc_sk_free()Cong Wang
The connection timers of an llc sock could be still flying after we delete them in llc_sk_free(), and even possibly after we free the sock. We could just wait synchronously here in case of troubles. Note, I leave other call paths as they are, since they may not have to wait, at least we can change them to synchronously when needed. Also, move the code to net/llc/llc_conn.c, which is apparently a better place. Reported-by: <syzbot+f922284c18ea23a8e457@syzkaller.appspotmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22l2tp: fix {pppol2tp, l2tp_dfs}_seq_stop() in case of seq_file overflowGuillaume Nault
Commit 0e0c3fee3a59 ("l2tp: hold reference on tunnels printed in pppol2tp proc file") assumed that if pppol2tp_seq_stop() was called with non-NULL private data (the 'v' pointer), then pppol2tp_seq_start() would not be called again. It turns out that this isn't guaranteed, and overflowing the seq_file's buffer in pppol2tp_seq_show() is a way to get into this situation. Therefore, pppol2tp_seq_stop() needs to reset pd->tunnel, so that pppol2tp_seq_start() won't drop a reference again if it gets called. We also have to clear pd->session, because the rest of the code expects a non-NULL tunnel when pd->session is set. The l2tp_debugfs module has the same issue. Fix it in the same way. Fixes: 0e0c3fee3a59 ("l2tp: hold reference on tunnels printed in pppol2tp proc file") Fixes: f726214d9b23 ("l2tp: hold reference on tunnels printed in l2tp/tunnels debugfs file") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22batman-adv: Remove unused dentry without DEBUGFSSven Eckelmann
The debug_dir variable in the main structures is only accessed when CONFIG_BATMAN_ADV_DEBUGFS is enabled. It can be dropped to potentially save some bytes of memory. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-04-22batman-adv: Avoid bool in structuresSven Eckelmann
Using the bool type for structure member is considered inappropriate [1] for the kernel. Its size is not well defined (but usually 1 byte but maybe also 4 byte). [1] https://lkml.org/lkml/2017/11/21/384 Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-04-22batman-adv: Avoid old nodes disabling multicast optimizations completelyLinus Lüssing
Instead of disabling multicast optimizations mesh-wide once a node with no multicast optimizations capabilities joins the mesh, do the following: Just insert such nodes into the WANT_ALL_IPV4/IPV6 lists. This is sufficient to avoid multicast packet loss to such unsupportive nodes. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>