From 8993cf8edf42527119186b558766539243b791a5 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Mon, 11 Aug 2014 18:21:49 +0200 Subject: netfilter: move NAT Kconfig switches out of the iptables scope Currently, the NAT configs depend on iptables and ip6tables. However, users should be capable of enabling NAT for nft without having to switch on iptables. Fix this by adding new specific IP_NF_NAT and IP6_NF_NAT config switches for iptables and ip6tables NAT support. I have also moved the original NF_NAT_IPV4 and NF_NAT_IPV6 configs out of the scope of iptables to make them independent of it. This patch also adds NETFILTER_XT_NAT which selects the xt_nat combo that provides snat/dnat for iptables. We cannot use NF_NAT anymore since nf_tables can select this. Reported-by: Matteo Croce Signed-off-by: Pablo Neira Ayuso --- net/ipv6/netfilter/Kconfig | 26 +++++++++++++++++++------- net/ipv6/netfilter/Makefile | 2 +- 2 files changed, 20 insertions(+), 8 deletions(-) (limited to 'net/ipv6') diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig index ac93df16f5af..cf0b88f30f6f 100644 --- a/net/ipv6/netfilter/Kconfig +++ b/net/ipv6/netfilter/Kconfig @@ -60,6 +60,16 @@ config NF_LOG_IPV6 depends on NETFILTER_ADVANCED select NF_LOG_COMMON +config NF_NAT_IPV6 + tristate "IPv6 NAT" + depends on NF_CONNTRACK_IPV6 + depends on NETFILTER_ADVANCED + select NF_NAT + help + The IPv6 NAT option allows masquerading, port forwarding and other + forms of full Network Address Port Translation. This can be + controlled by iptables or nft. + config IP6_NF_IPTABLES tristate "IP6 tables support (required for filtering)" depends on INET && IPV6 @@ -232,19 +242,21 @@ config IP6_NF_SECURITY If unsure, say N. -config NF_NAT_IPV6 - tristate "IPv6 NAT" +config IP6_NF_NAT + tristate "ip6tables NAT support" depends on NF_CONNTRACK_IPV6 depends on NETFILTER_ADVANCED select NF_NAT + select NF_NAT_IPV6 + select NETFILTER_XT_NAT help - The IPv6 NAT option allows masquerading, port forwarding and other - forms of full Network Address Port Translation. It is controlled by - the `nat' table in ip6tables, see the man page for ip6tables(8). + This enables the `nat' table in ip6tables. This allows masquerading, + port forwarding and other forms of full Network Address Port + Translation. To compile it as a module, choose M here. If unsure, say N. -if NF_NAT_IPV6 +if IP6_NF_NAT config IP6_NF_TARGET_MASQUERADE tristate "MASQUERADE target support" @@ -265,7 +277,7 @@ config IP6_NF_TARGET_NPT To compile it as a module, choose M here. If unsure, say N. -endif # NF_NAT_IPV6 +endif # IP6_NF_NAT endif # IP6_NF_IPTABLES diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile index c0b263104ed2..c3d3286db4bb 100644 --- a/net/ipv6/netfilter/Makefile +++ b/net/ipv6/netfilter/Makefile @@ -8,7 +8,7 @@ obj-$(CONFIG_IP6_NF_FILTER) += ip6table_filter.o obj-$(CONFIG_IP6_NF_MANGLE) += ip6table_mangle.o obj-$(CONFIG_IP6_NF_RAW) += ip6table_raw.o obj-$(CONFIG_IP6_NF_SECURITY) += ip6table_security.o -obj-$(CONFIG_NF_NAT_IPV6) += ip6table_nat.o +obj-$(CONFIG_IP6_NF_NAT) += ip6table_nat.o # objects for l3 independent conntrack nf_conntrack_ipv6-y := nf_conntrack_l3proto_ipv6.o nf_conntrack_proto_icmpv6.o -- cgit From 793c3b4000a1ef611ae7e5c89bd2a9c6b776cb5e Mon Sep 17 00:00:00 2001 From: Benjamin Block Date: Thu, 21 Aug 2014 19:37:48 +0200 Subject: net: ipv6: fib: don't sleep inside atomic lock The function fib6_commit_metrics() allocates a piece of memory in mode GFP_KERNEL while holding an atomic lock from higher up in the stack, in the function __ip6_ins_rt(). This produces the following BUG: > BUG: sleeping function called from invalid context at mm/slub.c:1250 > in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: dhcpcd > 2 locks held by dhcpcd/2909: > #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20 > #1: (&tb->tb6_lock){++--+.}, at: [] ip6_route_add+0x65a/0x800 > CPU: 1 PID: 2909 Comm: dhcpcd Not tainted 3.17.0-rc1 #1 > Hardware name: ASUS All Series/Q87T, BIOS 0216 10/16/2013 > 0000000000000008 ffff8800c8f13858 ffffffff81af135a 0000000000000000 > ffff880212202430 ffff8800c8f13878 ffffffff810f8d3a ffff880212202c98 > 0000000000000010 ffff8800c8f138c8 ffffffff8121ad0e 0000000000000001 > Call Trace: > [] dump_stack+0x4e/0x68 > [] __might_sleep+0x10a/0x120 > [] kmem_cache_alloc_trace+0x4e/0x190 > [] ? fib6_commit_metrics+0x66/0x110 > [] fib6_commit_metrics+0x66/0x110 > [] fib6_add+0x883/0xa80 > [] ? ip6_route_add+0x65a/0x800 > [] ip6_route_add+0x675/0x800 > [] ? ip6_route_add+0x6a/0x800 > [] inet6_rtm_newroute+0x5c/0x80 > [] rtnetlink_rcv_msg+0x211/0x260 > [] ? rtnl_lock+0x17/0x20 > [] ? lock_release_holdtime+0x28/0x180 > [] ? rtnl_lock+0x17/0x20 > [] ? __rtnl_unlock+0x20/0x20 > [] netlink_rcv_skb+0x6e/0xd0 > [] rtnetlink_rcv+0x25/0x40 > [] netlink_unicast+0xd9/0x180 > [] netlink_sendmsg+0x700/0x770 > [] ? local_clock+0x25/0x30 > [] sock_sendmsg+0x6c/0x90 > [] ? might_fault+0xa3/0xb0 > [] ? verify_iovec+0x7d/0xf0 > [] ___sys_sendmsg+0x37e/0x3b0 > [] ? trace_hardirqs_on_caller+0x185/0x220 > [] ? mutex_unlock+0xe/0x10 > [] ? netlink_insert+0xbc/0xe0 > [] ? netlink_autobind.isra.30+0x125/0x150 > [] ? netlink_autobind.isra.30+0x60/0x150 > [] ? netlink_bind+0x159/0x230 > [] ? might_fault+0x5a/0xb0 > [] ? SYSC_bind+0x7e/0xd0 > [] __sys_sendmsg+0x4d/0x80 > [] SyS_sendmsg+0x12/0x20 > [] system_call_fastpath+0x16/0x1b Fixing this by replacing the mode GFP_KERNEL with GFP_ATOMIC. Signed-off-by: Benjamin Block Acked-by: David Rientjes Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller --- net/ipv6/ip6_fib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'net/ipv6') diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index cb4459bd1d29..76b7f5ee8f4c 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -643,7 +643,7 @@ static int fib6_commit_metrics(struct dst_entry *dst, if (dst->flags & DST_HOST) { mp = dst_metrics_write_ptr(dst); } else { - mp = kzalloc(sizeof(u32) * RTAX_MAX, GFP_KERNEL); + mp = kzalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC); if (!mp) return -ENOMEM; dst_init_metrics(dst, mp, 0); -- cgit From 41ad82f7f8f59a472c8e8ccca56a9c6bdf3c5092 Mon Sep 17 00:00:00 2001 From: Pablo Neira Date: Tue, 2 Sep 2014 14:26:17 +0200 Subject: netfilter: fix missing dependencies in NETFILTER_XT_TARGET_LOG make defconfig reports: warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NETFILTER_ADVANCED) Fixes: d79a61d netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_* Reported-by: kbuild test robot Signed-off-by: Pablo Neira Ayuso Signed-off-by: David S. Miller --- net/ipv6/netfilter/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'net/ipv6') diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig index cf0b88f30f6f..2812816aabdc 100644 --- a/net/ipv6/netfilter/Kconfig +++ b/net/ipv6/netfilter/Kconfig @@ -57,7 +57,7 @@ config NFT_REJECT_IPV6 config NF_LOG_IPV6 tristate "IPv6 packet logging" - depends on NETFILTER_ADVANCED + default m if NETFILTER_ADVANCED=n select NF_LOG_COMMON config NF_NAT_IPV6 -- cgit From a9ed4a2986e13011fcf4ed2d1a1647c53112f55b Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Tue, 2 Sep 2014 10:29:29 +0200 Subject: ipv6: fix rtnl locking in setsockopt for anycast and multicast Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST triggers the assertion in addrconf_join_solict()/addrconf_leave_solict() ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before calling ipv6_dev_mc_inc/dec. This patch moves ASSERT_RTNL() up a level in the call stack. Signed-off-by: Cong Wang Signed-off-by: Sabrina Dubroca Reported-by: Tommi Rantala Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller --- net/ipv6/addrconf.c | 15 +++++---------- net/ipv6/anycast.c | 12 ++++++++++++ net/ipv6/mcast.c | 14 ++++++++++++++ 3 files changed, 31 insertions(+), 10 deletions(-) (limited to 'net/ipv6') diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 0b239fc1816e..aa0e135b808c 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1690,14 +1690,12 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp) addrconf_mod_dad_work(ifp, 0); } -/* Join to solicited addr multicast group. */ - +/* Join to solicited addr multicast group. + * caller must hold RTNL */ void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr) { struct in6_addr maddr; - ASSERT_RTNL(); - if (dev->flags&(IFF_LOOPBACK|IFF_NOARP)) return; @@ -1705,12 +1703,11 @@ void addrconf_join_solict(struct net_device *dev, const struct in6_addr *addr) ipv6_dev_mc_inc(dev, &maddr); } +/* caller must hold RTNL */ void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr) { struct in6_addr maddr; - ASSERT_RTNL(); - if (idev->dev->flags&(IFF_LOOPBACK|IFF_NOARP)) return; @@ -1718,12 +1715,11 @@ void addrconf_leave_solict(struct inet6_dev *idev, const struct in6_addr *addr) __ipv6_dev_mc_dec(idev, &maddr); } +/* caller must hold RTNL */ static void addrconf_join_anycast(struct inet6_ifaddr *ifp) { struct in6_addr addr; - ASSERT_RTNL(); - if (ifp->prefix_len >= 127) /* RFC 6164 */ return; ipv6_addr_prefix(&addr, &ifp->addr, ifp->prefix_len); @@ -1732,12 +1728,11 @@ static void addrconf_join_anycast(struct inet6_ifaddr *ifp) ipv6_dev_ac_inc(ifp->idev->dev, &addr); } +/* caller must hold RTNL */ static void addrconf_leave_anycast(struct inet6_ifaddr *ifp) { struct in6_addr addr; - ASSERT_RTNL(); - if (ifp->prefix_len >= 127) /* RFC 6164 */ return; ipv6_addr_prefix(&addr, &ifp->addr, ifp->prefix_len); diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 210183244689..45b9d81d91e8 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -77,6 +77,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) pac->acl_next = NULL; pac->acl_addr = *addr; + rtnl_lock(); rcu_read_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -137,6 +138,7 @@ int ipv6_sock_ac_join(struct sock *sk, int ifindex, const struct in6_addr *addr) error: rcu_read_unlock(); + rtnl_unlock(); if (pac) sock_kfree_s(sk, pac, sizeof(*pac)); return err; @@ -171,13 +173,17 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) spin_unlock_bh(&ipv6_sk_ac_lock); + rtnl_lock(); rcu_read_lock(); dev = dev_get_by_index_rcu(net, pac->acl_ifindex); if (dev) ipv6_dev_ac_dec(dev, &pac->acl_addr); rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, pac, sizeof(*pac)); + if (!dev) + return -ENODEV; return 0; } @@ -198,6 +204,7 @@ void ipv6_sock_ac_close(struct sock *sk) spin_unlock_bh(&ipv6_sk_ac_lock); prev_index = 0; + rtnl_lock(); rcu_read_lock(); while (pac) { struct ipv6_ac_socklist *next = pac->acl_next; @@ -212,6 +219,7 @@ void ipv6_sock_ac_close(struct sock *sk) pac = next; } rcu_read_unlock(); + rtnl_unlock(); } static void aca_put(struct ifacaddr6 *ac) @@ -233,6 +241,8 @@ int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr) struct rt6_info *rt; int err; + ASSERT_RTNL(); + idev = in6_dev_get(dev); if (idev == NULL) @@ -302,6 +312,8 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr) { struct ifacaddr6 *aca, *prev_aca; + ASSERT_RTNL(); + write_lock_bh(&idev->lock); prev_aca = NULL; for (aca = idev->ac_list; aca; aca = aca->aca_next) { diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 617f0958e164..a23b655a7627 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -172,6 +172,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) mc_lst->next = NULL; mc_lst->addr = *addr; + rtnl_lock(); rcu_read_lock(); if (ifindex == 0) { struct rt6_info *rt; @@ -185,6 +186,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (dev == NULL) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return -ENODEV; } @@ -202,6 +204,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) if (err) { rcu_read_unlock(); + rtnl_unlock(); sock_kfree_s(sk, mc_lst, sizeof(*mc_lst)); return err; } @@ -212,6 +215,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, const struct in6_addr *addr) spin_unlock(&ipv6_sk_mc_lock); rcu_read_unlock(); + rtnl_unlock(); return 0; } @@ -229,6 +233,7 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) if (!ipv6_addr_is_multicast(addr)) return -EINVAL; + rtnl_lock(); spin_lock(&ipv6_sk_mc_lock); for (lnk = &np->ipv6_mc_list; (mc_lst = rcu_dereference_protected(*lnk, @@ -252,12 +257,15 @@ int ipv6_sock_mc_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) } else (void) ip6_mc_leave_src(sk, mc_lst, NULL); rcu_read_unlock(); + rtnl_unlock(); + atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc); kfree_rcu(mc_lst, rcu); return 0; } } spin_unlock(&ipv6_sk_mc_lock); + rtnl_unlock(); return -EADDRNOTAVAIL; } @@ -302,6 +310,7 @@ void ipv6_sock_mc_close(struct sock *sk) if (!rcu_access_pointer(np->ipv6_mc_list)) return; + rtnl_lock(); spin_lock(&ipv6_sk_mc_lock); while ((mc_lst = rcu_dereference_protected(np->ipv6_mc_list, lockdep_is_held(&ipv6_sk_mc_lock))) != NULL) { @@ -328,6 +337,7 @@ void ipv6_sock_mc_close(struct sock *sk) spin_lock(&ipv6_sk_mc_lock); } spin_unlock(&ipv6_sk_mc_lock); + rtnl_unlock(); } int ip6_mc_source(int add, int omode, struct sock *sk, @@ -845,6 +855,8 @@ int ipv6_dev_mc_inc(struct net_device *dev, const struct in6_addr *addr) struct ifmcaddr6 *mc; struct inet6_dev *idev; + ASSERT_RTNL(); + /* we need to take a reference on idev */ idev = in6_dev_get(dev); @@ -916,6 +928,8 @@ int __ipv6_dev_mc_dec(struct inet6_dev *idev, const struct in6_addr *addr) { struct ifmcaddr6 *ma, **map; + ASSERT_RTNL(); + write_lock_bh(&idev->lock); for (map = &idev->mc_list; (ma=*map) != NULL; map = &ma->next) { if (ipv6_addr_equal(&ma->mca_addr, addr)) { -- cgit From f24062b07dda89b0e24fa48e7bc3865a725f5ee6 Mon Sep 17 00:00:00 2001 From: Nicolas Dichtel Date: Wed, 3 Sep 2014 23:59:21 +0200 Subject: ipv6: fix a refcnt leak with peer addr There is no reason to take a refcnt before deleting the peer address route. It's done some lines below for the local prefix route because inet6_ifa_finish_destroy() will release it at the end. For the peer address route, we want to free it right now. This bug has been introduced by commit caeaba79009c ("ipv6: add support of peer address"). Signed-off-by: Nicolas Dichtel Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller --- net/ipv6/addrconf.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) (limited to 'net/ipv6') diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index aa0e135b808c..ce761c7e6675 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4772,11 +4772,8 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) rt = rt6_lookup(dev_net(dev), &ifp->peer_addr, NULL, dev->ifindex, 1); - if (rt) { - dst_hold(&rt->dst); - if (ip6_del_rt(rt)) - dst_free(&rt->dst); - } + if (rt && ip6_del_rt(rt)) + dst_free(&rt->dst); } dst_hold(&ifp->rt->dst); -- cgit From e7478dfc4656f4a739ed1b07cfd59c12f8eb112e Mon Sep 17 00:00:00 2001 From: Nicolas Dichtel Date: Wed, 3 Sep 2014 23:59:22 +0200 Subject: ipv6: use addrconf_get_prefix_route() to remove peer addr addrconf_get_prefix_route() ensures to get the right route in the right table. Signed-off-by: Nicolas Dichtel Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller --- net/ipv6/addrconf.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) (limited to 'net/ipv6') diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index ce761c7e6675..fc1fac2a0528 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4768,10 +4768,9 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) addrconf_leave_solict(ifp->idev, &ifp->addr); if (!ipv6_addr_any(&ifp->peer_addr)) { struct rt6_info *rt; - struct net_device *dev = ifp->idev->dev; - rt = rt6_lookup(dev_net(dev), &ifp->peer_addr, NULL, - dev->ifindex, 1); + rt = addrconf_get_prefix_route(&ifp->peer_addr, 128, + ifp->idev->dev, 0, 0); if (rt && ip6_del_rt(rt)) dst_free(&rt->dst); } -- cgit From de185ab46cb02df9738b0d898b0c3a89181c5526 Mon Sep 17 00:00:00 2001 From: WANG Cong Date: Fri, 5 Sep 2014 14:33:00 -0700 Subject: ipv6: restore the behavior of ipv6_sock_ac_drop() It is possible that the interface is already gone after joining the list of anycast on this interface as we don't hold a refcount for the device, in this case we are safe to ignore the error. What's more important, for API compatibility we should not change this behavior for applications even if it were correct. Fixes: commit a9ed4a2986e13011 ("ipv6: fix rtnl locking in setsockopt for anycast and multicast") Cc: Sabrina Dubroca Cc: David S. Miller Signed-off-by: Cong Wang Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller --- net/ipv6/anycast.c | 2 -- 1 file changed, 2 deletions(-) (limited to 'net/ipv6') diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index 45b9d81d91e8..ff2de7d9d8e6 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -182,8 +182,6 @@ int ipv6_sock_ac_drop(struct sock *sk, int ifindex, const struct in6_addr *addr) rtnl_unlock(); sock_kfree_s(sk, pac, sizeof(*pac)); - if (!dev) - return -ENODEV; return 0; } -- cgit