summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2011-03-03Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h
2011-03-03ipv6: Use ERR_CAST in addrconf_dst_alloc.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02ipv4: Make output route lookup return rtable directly.David S. Miller
Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02xfrm: Return dst directly from xfrm_lookup()David S. Miller
Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01ipv6: Make icmp route lookup code a bit clearer.David S. Miller
The route lookup code in icmpv6_send() is slightly tricky as a result of having to handle all of the requirements of RFC 4301 host relookups. Pull the route resolution into a seperate function, so that the error handling and route reference counting is hopefully easier to see and contained wholly within this new routine. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01xfrm: Handle blackhole route creation via afinfo.David S. Miller
That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01ipv6: Normalize arguments to ip6_dst_blackhole().David S. Miller
Return a dst pointer which is potentitally error encoded. Don't pass original dst pointer by reference, pass a struct net instead of a socket, and elide the flow argument since it is unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01xfrm: Kill XFRM_LOOKUP_WAIT flag.David S. Miller
This can be determined from the flow flags instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01ipv6: Change final dst lookup arg name to "can_sleep"David S. Miller
Since it indicates whether we are invoked from a sleepable context or not. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01net: Add FLOWI_FLAG_CAN_SLEEP.David S. Miller
And set is in contexts where the route resolution can sleep. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01ipv6: Consolidate route lookup sequences.David S. Miller
Route lookups follow a general pattern in the ipv6 code wherein we first find the non-IPSEC route, potentially override the flow destination address due to ipv6 options settings, and then finally make an IPSEC search using either xfrm_lookup() or __xfrm_lookup(). __xfrm_lookup() is used when we want to generate a blackhole route if the key manager needs to resolve the IPSEC rules (in this case -EREMOTE is returned and the original 'dst' is left unchanged). Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC resolution is necessary, we simply fail the lookup completely. All of these cases are encapsulated into two routines, ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which handles unconnected UDP datagram sockets. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01inet: Remove unused sk_sndmsg_* from UFOHerbert Xu
UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-28net: TX timestamps for IPv6 UDP packetsAnders Berggren
Enabling TX timestamps (SO_TIMESTAMPING) for IPv6 UDP packets, in the same fashion as for IPv4. Necessary in order for NICs such as Intel 82580 to timestamp IPv6 packets. Signed-off-by: Anders Berggren <anders@halon.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25ipv6: ignore rtnl_unicast() return codeHagen Paul Pfeifer
rtnl_unicast() return value is not of interest, we can silently ignore it, save some instructions and four byte on the stack. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25ipv6: variable next is never used in this functionHagen Paul Pfeifer
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25ipv6: hash is calculated but not used afterwardsHagen Paul Pfeifer
hash is declared and assigned but not used anymore. ipv6_addr_hash() exhibit no side-effects. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25ipv6: totlen is declared and assigned but not usedHagen Paul Pfeifer
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25sysctl: ipv6: use correct net in ipv6_sysctl_rtcache_flushLucian Adrian Grijincu
Before this patch issuing these commands: fd = open("/proc/sys/net/ipv6/route/flush") unshare(CLONE_NEWNET) write(fd, "stuff") would flush the newly created net, not the original one. The equivalent ipv4 code is correct (stores the net inside ->extra1). Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23xfrm: Const'ify address arguments to ->dst_lookup()David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23xfrm: Const'ify tmpl and address arguments to ->init_temprop()David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22xfrm: Mark flowi arg to xfrm_type->reject() const.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22xfrm: Mark flowi arg to ->init_tempsel() const.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22xfrm: Mark flowi arg to ->fill_dst() const.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22xfrm: Mark flowi arg to ->get_tos() const.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-20tcp: Remove debug macro of TCP_CHECK_TIMERShan Wei
Now, TCP_CHECK_TIMER is not used for debuging, it does nothing. And, it has been there for several years, maybe 6 years. Remove it to keep code clearer. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-19Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/e1000e/netdev.c net/xfrm/xfrm_policy.c
2011-02-19Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2011-02-18net: provide default_advmss() methods to blackhole dst_opsEric Dumazet
Commit 0dbaee3b37e118a (net: Abstract default ADVMSS behind an accessor.) introduced a possible crash in tcp_connect_init(), when dst->default_advmss() is called from dst_metric_advmss() Reported-by: George Spelvin <linux@horizon.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: Add initial_ref arg to dst_alloc().David S. Miller
This allows avoiding multiple writes to the initial __refcnt. The most simplest cases of wanting an initial reference of "1" in ipv4 and ipv6 have been converted, the rest have been left along and kept at the existing "0". Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17netfilter: ip6t_LOG: fix a flaw in printing the MACJoerg Marx
The flaw was in skipping the second byte in MAC header due to increasing the pointer AND indexed access starting at '1'. Signed-off-by: Joerg Marx <joerg.marx@secunet.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-15Merge branch 'master' into for-nextJiri Kosina
2011-02-10inet: Create a mechanism for upward inetpeer propagation into routes.David S. Miller
If we didn't have a routing cache, we would not be able to properly propagate certain kinds of dynamic path attributes, for example PMTU information and redirects. The reason is that if we didn't have a routing cache, then there would be no way to lookup all of the active cached routes hanging off of sockets, tunnels, IPSEC bundles, etc. Consider the case where we created a cached route, but no inetpeer entry existed and also we were not asked to pre-COW the route metrics and therefore did not force the creation a new inetpeer entry. If we later get a PMTU message, or a redirect, and store this information in a new inetpeer entry, there is no way to teach that cached route about the newly existing inetpeer entry. The facilities implemented here handle this problem. First we create a generation ID. When we create a cached route of any kind, we remember the generation ID at the time of attachment. Any time we force-create an inetpeer entry in response to new path information, we bump that generation ID. The dst_ops->check() callback is where the knowledge of this event is propagated. If the global generation ID does not equal the one stored in the cached route, and the cached route has not attached to an inetpeer yet, we look it up and attach if one is found. Now that we've updated the cached route's information, we update the route's generation ID too. This clears the way for implementing PMTU and redirects directly in the inetpeer cache. There is absolutely no need to consult cached route information in order to maintain this information. At this point nothing bumps the inetpeer genids, that comes in the later changes which handle PMTUs and redirects using inetpeers. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10inetpeer: Abstract address representation further.David S. Miller
Future changes will add caching information, and some of these new elements will be addresses. Since the family is implicit via the ->daddr.family member, replicating the family in ever address we store is entirely redundant. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08net: Kill NETEVENT_PMTU_UPDATE.David S. Miller
Nobody actually does anything in response to the event, so just kill it off. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04inetpeer: Move ICMP rate limiting state into inet_peer entries.David S. Miller
Like metrics, the ICMP rate limiting bits are cached state about a destination. So move it into the inet_peer entries. If an inet_peer cannot be bound (the reason is memory allocation failure or similar), the policy is to allow. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-02-03net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31net: Fix ipv6 neighbour unregister_sysctl_table warningEric W. Biederman
In my testing of 2.6.37 I was occassionally getting a warning about sysctl table entries being unregistered in the wrong order. Digging in it turns out this dates back to the last great sysctl reorg done where Al Viro introduced the requirement that sysctl directories needed to be created before and destroyed after the files in them. It turns out that in that great reorg /proc/sys/net/ipv6/neigh was overlooked. So this patch fixes that oversight and makes an annoying warning message go away. >------------[ cut here ]------------ >WARNING: at kernel/sysctl.c:1992 unregister_sysctl_table+0x134/0x164() >Pid: 23951, comm: kworker/u:3 Not tainted 2.6.37-350888.2010AroraKernelBeta.fc14.x86_64 #1 >Call Trace: > [<ffffffff8103e034>] warn_slowpath_common+0x80/0x98 > [<ffffffff8103e061>] warn_slowpath_null+0x15/0x17 > [<ffffffff810452f8>] unregister_sysctl_table+0x134/0x164 > [<ffffffff810e7834>] ? kfree+0xc4/0xd1 > [<ffffffff813439b2>] neigh_sysctl_unregister+0x22/0x3a > [<ffffffffa02cd14e>] addrconf_ifdown+0x33f/0x37b [ipv6] > [<ffffffff81331ec2>] ? skb_dequeue+0x5f/0x6b > [<ffffffffa02ce4a5>] addrconf_notify+0x69b/0x75c [ipv6] > [<ffffffffa02eb953>] ? ip6mr_device_event+0x98/0xa9 [ipv6] > [<ffffffff813d2413>] notifier_call_chain+0x32/0x5e > [<ffffffff8105bdea>] raw_notifier_call_chain+0xf/0x11 > [<ffffffff8133cdac>] call_netdevice_notifiers+0x45/0x4a > [<ffffffff8133d2b0>] rollback_registered_many+0x118/0x201 > [<ffffffff8133d3af>] unregister_netdevice_many+0x16/0x6d > [<ffffffff8133d571>] default_device_exit_batch+0xa4/0xb8 > [<ffffffff81337c42>] ? cleanup_net+0x0/0x194 > [<ffffffff81337a2a>] ops_exit_list+0x4e/0x56 > [<ffffffff81337d36>] cleanup_net+0xf4/0x194 > [<ffffffff81053318>] process_one_work+0x187/0x280 > [<ffffffff8105441b>] worker_thread+0xff/0x19f > [<ffffffff8105431c>] ? worker_thread+0x0/0x19f > [<ffffffff8105776d>] kthread+0x7d/0x85 > [<ffffffff81003824>] kernel_thread_helper+0x4/0x10 > [<ffffffff810576f0>] ? kthread+0x0/0x85 > [<ffffffff81003820>] ? kernel_thread_helper+0x0/0x10 >---[ end trace 8a7e9310b35e9486 ]--- Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31net: Add default_mtu() methods to blackhole dst_opsRoland Dreier
When an IPSEC SA is still being set up, __xfrm_lookup() will return -EREMOTE and so ip_route_output_flow() will return a blackhole route. This can happen in a sndmsg call, and after d33e455337ea ("net: Abstract default MTU metric calculation behind an accessor.") this leads to a crash in ip_append_data() because the blackhole dst_ops have no default_mtu() method and so dst_mtu() calls a NULL pointer. Fix this by adding default_mtu() methods (that simply return 0, matching the old behavior) to the blackhole dst_ops. The IPv4 part of this patch fixes a crash that I saw when using an IPSEC VPN; the IPv6 part is untested because I don't have an IPv6 VPN, but it looks to be needed as well. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27net: Store ipv4/ipv6 COW'd metrics in inetpeer cache.David S. Miller
Please note that the IPSEC dst entry metrics keep using the generic metrics COW'ing mechanism using kmalloc/kfree. This gives the IPSEC routes an opportunity to use metrics which are unique to their encapsulated paths. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-01-27ipv6: Remove route peer binding assertions.David S. Miller
They are bogus. The basic idea is that I wanted to make sure that prefixed routes never bind to peers. The test I used was whether RTF_CACHE was set. But first of all, the RTF_CACHE flag is set at different spots depending upon which ip6_rt_copy() caller you're talking about. I've validated all of the code paths, and even in the future where we bind peers more aggressively (for route metric COW'ing) we never bind to prefix'd routes, only fully specified ones. This even applies when addrconf or icmp6 routes are allocated. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26net: Implement read-only protection and COW'ing of metrics.David S. Miller
Routing metrics are now copy-on-write. Initially a route entry points it's metrics at a read-only location. If a routing table entry exists, it will point there. Else it will point at the all zero metric place-holder called 'dst_default_metrics'. The writeability state of the metrics is stored in the low bits of the metrics pointer, we have two bits left to spare if we want to store more states. For the initial implementation, COW is implemented simply via kmalloc. However future enhancements will change this to place the writable metrics somewhere else, in order to increase sharing. Very likely this "somewhere else" will be the inetpeer cache. Note also that this means that metrics updates may transiently fail if we cannot COW the metrics successfully. But even by itself, this patch should decrease memory usage and increase cache locality especially for routing workloads. In those cases the read-only metric copies stay in place and never get written to. TCP workloads where metrics get updated, and those rare cases where PMTU triggers occur, will take a very slight performance hit. But that hit will be alleviated when the long-term writable metrics move to a more sharable location. Since the metrics storage went from a u32 array of RTAX_MAX entries to what is essentially a pointer, some retooling of the dst_entry layout was necessary. Most importantly, we need to preserve the alignment of the reference count so that it doesn't share cache lines with the read-mostly state, as per Eric Dumazet's alignment assertion checks. The only non-trivial bit here is the move of the 'flags' member into the writeable cacheline. This is OK since we are always accessing the flags around the same moment when we made a modification to the reference count. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-01-26xfrm6: Don't forget to propagate peer into ipsec route.David S. Miller
Like ipv4, we have to propagate the ipv6 route peer into the ipsec top-level route during instantiation. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-25ipv6: Revert 'administrative down' address handling changes.David S. Miller
This reverts the following set of commits: d1ed113f1669390da9898da3beddcc058d938587 ("ipv6: remove duplicate neigh_ifdown") 29ba5fed1bbd09c2cba890798c8f9eaab251401d ("ipv6: don't flush routes when setting loopback down") 9d82ca98f71fd686ef2f3017c5e3e6a4871b6e46 ("ipv6: fix missing in6_ifa_put in addrconf") 2de795707294972f6c34bae9de713e502c431296 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept") 8595805aafc8b077e01804c9a3668e9aa3510e89 ("IPv6: only notify protocols if address is compeletely gone") 27bdb2abcc5edb3526e25407b74bf17d1872c329 ("IPv6: keep tentative addresses in hash table") 93fa159abe50d3c55c7f83622d3f5c09b6e06f4b ("IPv6: keep route for tentative address") 8f37ada5b5f6bfb4d251a7f510f249cb855b77b3 ("IPv6: fix race between cleanup and add/delete address") 84e8b803f1e16f3a2b8b80f80a63fa2f2f8a9be6 ("IPv6: addrconf notify when address is unavailable") dc2b99f71ef477a31020511876ab4403fb7c4420 ("IPv6: keep permanent addresses on admin down") because the core semantic change to ipv6 address handling on ifdown has broken some things, in particular "disable_ipv6" sysctl handling. Stephen has made several attempts to get things back in working order, but nothing has restored disable_ipv6 fully yet. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24ipv6: Always clone offlink routes.David S. Miller
Do not handle PMTU vs. route lookup creation any differently wrt. offlink routes, always clone them. Reported-by: PK <runningdoglackey@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24net: change netdev->features to u32Michał Mirosław
Quoting Ben Hutchings: we presumably won't be defining features that can only be enabled on 64-bit architectures. Occurences found by `grep -r` on net/, drivers/net, include/ [ Move features and vlan_features next to each other in struct netdev, as per Eric Dumazet's suggestion -DaveM ] Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20ipv6: raw: rcu annotationsEric Dumazet
Remove sparse warnings, using a function typedef to be able to use __rcu annotation on mh_filter pointer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20net: ipv6: sit: fix rcu annotationsEric Dumazet
Fix minor __rcu annotations and remove sparse warnings Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>