summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2009-03-20Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c
2009-03-19SVCRDMA: fix recent printk format warnings.Tom Talpey
printk formats in prior commit were reversed/incorrect. Compiled without warning on x86 and x86_64, but detected on ppc. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: Ensure we close the socket on EPIPE errors too...Trond Myklebust
As long as one task is holding the socket lock, then calls to xprt_force_disconnect(xprt) will not succeed in shutting down the socket. In particular, this would mean that a server initiated shutdown will not succeed until the lock is relinquished. In order to avoid the deadlock, we should ensure that xs_tcp_send_request() closes the socket on EPIPE errors too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: xs_tcp_connect_worker{4,6}: merge common codeTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: Add a sysctl to control the duration of the socket linger timeoutTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC socketsTrond Myklebust
This fixes a regression against FreeBSD servers as reported by Tomas Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers don't ever react to the client closing the socket, and so commit e06799f958bf7f9f8fae15f0c6f519953fb0257c (SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket) causes the setup to hang forever whenever the client attempts to close and then reconnect. We break the deadlock by adding a 'linger2' style timeout to the socket, after which, the client will abort the connection using a TCP 'RST'. The default timeout is set to 15 seconds. A subsequent patch will put it under user control by means of a systctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-18netns: oops in ip[6]_frag_reasm incrementing statsJorge Boncompte [DTI2]
dev can be NULL in ip[6]_frag_reasm for skb's coming from RAW sockets. Quagga's OSPFD sends fragmented packets on a RAW socket, when netfilter conntrack reassembles them on the OUTPUT path you hit this code path. You can test it with something like "hping2 -0 -d 2000 -f AA.BB.CC.DD" With help from Jarek Poplawski. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18net: kfree(napi->skb) => kfree_skbRoel Kluin
struct sk_buff pointers should be freed with kfree_skb. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18net: fix sctp breakageAl Viro
broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should be -stable fodder as well... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18tipc: fix non-const printf format argumentsStephen Hemminger
Fix warnings from current gcc about using non-const strings as printf args in TIPC. Compile tested only (not a TIPC user). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18ipv6: fix display of local and remote sit endpointsBjørn Mork
This fixes the regressions cause by commit 1326c3d5a4b792a2b15877feb7fb691f8945d203 (v2.6.28-rc6-461-g23a12b1) broke the display of local and remote addresses of an SIT tunnel in iproute2. nt->parms is used by ipip6_tunnel_init() and therefore need to be initialized first. Tracked as http://bugzilla.kernel.org/show_bug.cgi?id=12868 Reported-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18tcp: remove parameter from tcp_recv_urg().Rami Rosen
This patch removes an unused parameter (addr_len) from tcp_recv_urg() method in net/ipv4/tcp.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18ipv6: Fix incorrect disable_ipv6 behaviorBrian Haley
Fix the behavior of allowing both sysctl and addrconf_dad_failure() to set the disable_ipv6 parameter without any bad side-effects. If DAD fails and accept_dad > 1, we will still set disable_ipv6=1, but then instead of allowing an RA to add an address then immediately fail DAD, we simply don't allow the address to be added in the first place. This also lets the user set this flag and disable all IPv6 addresses on the interface, or on the entire system. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-18svcrpc: take advantage of tcp autotuningOlga Kornievskaia
Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18knfsd: add file to export stats about nfsd poolsGreg Banks
Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void *)1) * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_*()" by Harshula Jayasuriya. * merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18knfsd: avoid overloading the CPU scheduler with enormous load averagesGreg Banks
Avoid overloading the CPU scheduler with enormous load averages when handling high call-rate NFS loads. When the knfsd bottom half is made aware of an incoming call by the socket layer, it tries to choose an nfsd thread and wake it up. As long as there are idle threads, one will be woken up. If there are lot of nfsd threads (a sensible configuration when the server is disk-bound or is running an HSM), there will be many more nfsd threads than CPUs to run them. Under a high call-rate low service-time workload, the result is that almost every nfsd is runnable, but only a handful are actually able to run. This situation causes two significant problems: 1. The CPU scheduler takes over 10% of each CPU, which is robbing the nfsd threads of valuable CPU time. 2. At a high enough load, the nfsd threads starve userspace threads of CPU time, to the point where daemons like portmap and rpc.mountd do not schedule for tens of seconds at a time. Clients attempting to mount an NFS filesystem timeout at the very first step (opening a TCP connection to portmap) because portmap cannot wake up from select() and call accept() in time. Disclaimer: these effects were observed on a SLES9 kernel, modern kernels' schedulers may behave more gracefully. The solution is simple: keep in each svc_pool a counter of the number of threads which have been woken but have not yet run, and do not wake any more if that count reaches an arbitrary small threshold. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. The server was running 128 nfsds. Profiling showed schedule() taking 6.7% of every CPU, and __wake_up() taking 5.2%. This patch drops those contributions to 3.0% and 2.2%. Load average was over 120 before the patch, and 20.9 after. This patch is a forward-ported version of knfsd-avoid-nfsd-overload which has been shipping in the SGI "Enhanced NFS" product since 2006. It has been posted before: http://article.gmane.org/gmane.linux.nfs/10374 Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18netfilter: ctnetlink: fix rcu context imbalancePatrick McHardy
Introduced by 7ec47496 (netfilter: ctnetlink: cleanup master conntrack assignation): net/netfilter/nf_conntrack_netlink.c:1275:2: warning: context imbalance in 'ctnetlink_create_conntrack' - different lock contexts for basic block Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-18netfilter: remove nf_ct_l4proto_find_get/nf_ct_l4proto_putFlorian Westphal
users have been moved to __nf_ct_l4proto_find. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-18netfilter: ctnetlink: remove remaining module refcountingFlorian Westphal
Convert the remaining refcount users. As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-03-17Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/igb/igb_main.c drivers/net/qlge/qlge_main.c drivers/net/wireless/ath9k/ath9k.h drivers/net/wireless/ath9k/core.h drivers/net/wireless/ath9k/hw.c
2009-03-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-03-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2009-03-17gro: Fix legacy path napi_complete crashHerbert Xu
On the legacy netif_rx path, I incorrectly tried to optimise the napi_complete call by using __napi_complete before we reenable IRQs. This simply doesn't work since we need to flush the held GRO packets first. This patch fixes it by doing the obvious thing of reenabling IRQs first and then calling napi_complete. Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-17gro: Fix vlan/netpoll check againHerbert Xu
Jarek Poplawski pointed out that my previous fix is broken for VLAN+netpoll as if netpoll is enabled we'd end up in the normal receive path instead of the VLAN receive path. This patch fixes it by calling the VLAN receive hook. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-16cfg80211: add regulatory netlink multicast groupLuis R. Rodriguez
This allows us to send to userspace "regulatory" events. For now we just send an event when we change regulatory domains. We also notify userspace when devices are using their own custom world roaming regulatory domains. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16cfg80211: move enum reg_set_by to nl80211.hLuis R. Rodriguez
We do this so we can later inform userspace who set the regulatory domain and provide details of the request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16cfg80211: remove REGDOM_SET_BY_INITLuis R. Rodriguez
This is not used as we can always just assume the first regulatory domain set will _always_ be a static regulatory domain. REGDOM_SET_BY_CORE will be the first request from cfg80211 for a regdomain and that then populates the first regulatory request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16mac80211: deauth before flushing STA informationHerton Ronaldo Krzesinski
Even after commit "mac80211: deauth when interface is marked down" (e327b847 on Linus tree), userspace still isn't notified when interface goes down. There isn't a problem with this commit, but because of other code changes it doesn't work on kernels >= 2.6.28 (works if same/similar change applied on 2.6.27 for example). The issue is as follows: after commit "mac80211: restructure disassoc/deauth flows" in 2.6.28, the call to ieee80211_sta_deauthenticate added by commit e327b847 will not work: because we do sta_info_flush(local, sdata) inside ieee80211_stop (iface.c), all stations in interface are cleared, so when calling ieee80211_sta_deauthenticate->ieee80211_set_disassoc (mlme.c), inside ieee80211_set_disassoc we have this in the beginning: sta = sta_info_get(local, ifsta->bssid); if (!sta) { The !sta check triggers, thus the function returns early and ieee80211_sta_send_apinfo(sdata, ifsta) later isn't called, so wpa_supplicant/userspace isn't notified with SIOCGIWAP. This commit moves deauthentication to before flushing STA info (sta_info_flush), thus the above can't happen and userspace is really notified when interface goes down. Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16mac80211: handle failed scan requests in STA modeHelmut Schaa
If cfg80211 requests a scan it awaits either a return code != 0 from the scan function or the cfg80211_scan_done to be called. In case of a STA mac80211's scan function ever returns 0 and queues the scan request. If ieee80211_sta_work is executed and ieee80211_start_scan fails for some reason cfg80211_scan_done will never be called but cfg80211 still thinks the scan was triggered successfully and will refuse any future scan requests due to drv->scan_req not being cleaned up. If a scan is triggered from within the MLME a similar problem appears. If ieee80211_start_scan returns an error, local->scan_req will not be reset and mac80211 will refuse any future scan requests. Hence, in both cases call ieee80211_scan_failed (which notifies cfg80211 and resets local->scan_req) if ieee80211_start_scan returns an error. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16cfg80211: fix max tx power for world regdom on 5 GHz to 20dBmLuis R. Rodriguez
This is the lowest value amongst countries which do enable 5 GHz operation. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16cfg80211: Enable passive scan on channels 12-14 for world roamingLuis R. Rodriguez
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16mac80211: Fix WMM ACM parsing and AC downgrade operationJouni Malinen
Incorrect local->wmm_acm bits were set for AC_BK and AC_BE. Fix this and add some comments to make it easier to understand the AC-to-UP(pair) mapping. Set the wmm_acm bits (and show WMM debug) even if the driver does not implement conf_tx() handler. In addition, fix the ACM-based AC downgrade code to not use the highest priority in error cases. We need to break the loop to get the correct AC_BK value (3) instead of returning 0 (which would indicate AC_VO). The comment here was not really very useful either, so let's provide somewhat more helpful description of the situation. Since it is very unlikely that the ACM flag would be set for AC_BK and AC_BE, these bugs are not likely to be seen in real life networks. Anyway, better do these things correctly should someone really use silly AP configuration (and to pass some functionality tests, too). Remove the TODO comment about handling ACM. Downgrading AC is perfectly valid mechanism for ACM. Eventually, we may add support for WMM-AC and send a request for a TS, but anyway, that functionality won't be here at the location of this TODO comment. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16mac80211: Fix panic on fragmentation with power savingJouni Malinen
It was possible to hit a kernel panic on NULL pointer dereference in dev_queue_xmit() when sending power save buffered frames to a STA that woke up from sleep. This happened when the buffered frame was requeued for transmission in ap_sta_ps_end(). In order to avoid the panic, copy the skb->dev and skb->iif values from the first fragment to all other fragments. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16lib80211: silence excessive crypto debugging messagesJohn W. Linville
When they were part of the now defunct ieee80211 component, these messages were only visible when special debugging settings were enabled. Let's mirror that with a new lib80211 debugging Kconfig option. Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-16GRO: Move netpoll checks to correct locationHerbert Xu
As my netpoll fix for net doesn't really work for net-next, we need this update to move the checks into the right place. As it stands we may pass freed skbs to netpoll_receive_skb. This patch also introduces a netpoll_rx_on function to avoid GRO completely if we're invoked through netpoll. This might seem paranoid but as netpoll may have an external receive hook it's better to be safe than sorry. I don't think we need this for 2.6.29 though since there's nothing immediately broken by it. This patch also moves the GRO_* return values to netdevice.h since VLAN needs them too (I tried to avoid this originally but alas this seems to be the easiest way out). This fixes a bug in VLAN where it continued to use the old return value 2 instead of the correct GRO_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-16netfilter: xtables: add cluster matchPablo Neira Ayuso
This patch adds the iptables cluster match. This match can be used to deploy gateway and back-end load-sharing clusters. The cluster can be composed of 32 nodes maximum (although I have only tested this with two nodes, so I cannot tell what is the real scalability limit of this solution in terms of cluster nodes). Assuming that all the nodes see all packets (see below for an example on how to do that if your switch does not allow this), the cluster match decides if this node has to handle a packet given: (jhash(source IP) % total_nodes) & node_mask For related connections, the master conntrack is used. The following is an example of its use to deploy a gateway cluster composed of two nodes (where this is the node 1): iptables -I PREROUTING -t mangle -i eth1 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth1 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth1 \ -m mark ! --mark 0xffff -j DROP iptables -A PREROUTING -t mangle -i eth2 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth2 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth2 \ -m mark ! --mark 0xffff -j DROP And the following commands to make all nodes see the same packets: ip maddr add 01:00:5e:00:01:01 dev eth1 ip maddr add 01:00:5e:00:01:02 dev eth2 arptables -I OUTPUT -o eth1 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:01 arptables -I INPUT -i eth1 --h-length 6 \ --destination-mac 01:00:5e:00:01:01 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 arptables -I OUTPUT -o eth2 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:02 arptables -I INPUT -i eth2 --h-length 6 \ --destination-mac 01:00:5e:00:01:02 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 In the case of TCP connections, pickup facility has to be disabled to avoid marking TCP ACK packets coming in the reply direction as valid. echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose BTW, some final notes: * This match mangles the skbuff pkt_type in case that it detects PACKET_MULTICAST for a non-multicast address. This may be done in a PKTTYPE target for this sole purpose. * This match supersedes the CLUSTERIP target. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16net: netfilter conntrack - add per-net functionality for DCCP protocolCyrill Gorcunov
Module specific data moved into per-net site and being allocated/freed during net namespace creation/deletion. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16net: sysctl_net - use net_eq to compare netsCyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)" r8169: use hardware auto-padding. igb: remove ASPM L0s workaround netxen: remove old flash check. mv643xx_eth: fix unicast address filter corruption on mtu change xfrm: Fix xfrm_state_find() wrt. wildcard source address. emac: Fix clock control for 405EX and 405EXr chips ixgbe: fix multiple unicast address support via-velocity: Fix DMA mapping length errors on transmit. qlge: bugfix: Pad outbound frames smaller than 60 bytes. qlge: bugfix: Move netif_napi_del() to common call point. qlge: bugfix: Tell hw to strip vlan header. qlge: bugfix: Increase filter on inbound csum. dnet: replace obsolete *netif_rx_* functions with *napi_* net: Add be2net driver. dnet: Fix warnings on 64-bit. dnet: Dave DNET ethernet controller driver (updated) ipv6: Fix BUG when disabled ipv6 module is unloaded bnx2x: Using DMAE to initialize the chip bnx2x: Casting page alignment ...
2009-03-16netfilter: conntrack: check for NEXTHDR_NONE before header sanity checkingChristoph Paasch
NEXTHDR_NONE doesn't has an IPv6 option header, so the first check for the length will always fail and results in a confusing message "too short" if debugging enabled. With this patch, we check for NEXTHDR_NONE before length sanity checkings are done. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: conntrack: fix dropping packet after l4proto->packet()Christoph Paasch
We currently use the negative value in the conntrack code to encode the packet verdict in the error. As NF_DROP is equal to 0, inverting NF_DROP makes no sense and, as a result, no packets are ever dropped. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: ctnetlink: fix crash during expectation creationPablo Neira Ayuso
This patch fixes a possible crash due to the missing initialization of the expectation class when nf_ct_expect_related() is called. Reported-by: BORBELY Zoltan <bozo@andrews.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: xtables: avoid pointer to selfJan Engelhardt
Commit 784544739a25c30637397ace5489eeb6e15d7d49 (netfilter: iptables: lock free counters) broke a number of modules whose rule data referenced itself. A reallocation would not reestablish the correct references, so it is best to use a separate struct that does not fall under RCU. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16Move FASYNC bit handling to f_op->fasync()Jonathan Corbet
Removing the BKL from FASYNC handling ran into the challenge of keeping the setting of the FASYNC bit in filp->f_flags atomic with regard to calls to the underlying fasync() function. Andi Kleen suggested moving the handling of that bit into fasync(); this patch does exactly that. As a result, we have a couple of internal API changes: fasync() must now manage the FASYNC bit, and it will be called without the BKL held. As it happens, every fasync() implementation in the kernel with one exception calls fasync_helper(). So, if we make fasync_helper() set the FASYNC bit, we can avoid making any changes to the other fasync() functions - as long as those functions, themselves, have proper locking. Most fasync() implementations do nothing but call fasync_helper() - which has its own lock - so they are easily verified as correct. The BKL had already been pushed down into the rest. The networking code has its own version of fasync_helper(), so that code has been augmented with explicit FASYNC bit handling. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16netfilter: auto-load ip_queue module when socket openedScott James Remnant
The ip_queue module is missing the net-pf-16-proto-3 alias that would causae it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: auto-load ip6_queue module when socket openedScott James Remnant
The ip6_queue module is missing the net-pf-16-proto-13 alias that would cause it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: ctnetlink: move event reporting for new entries outside the lockPablo Neira Ayuso
This patch moves the event reporting outside the lock section. With this patch, the creation and update of entries is homogeneous from the event reporting perspective. Moreover, as the event reporting is done outside the lock section, the netlink broadcast delivery can benefit of the yield() call under congestion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: ctnetlink: cleanup conntrack update preliminary checkingsPablo Neira Ayuso
This patch moves the preliminary checkings that must be fulfilled to update a conntrack, which are the following: * NAT manglings cannot be updated * Changing the master conntrack is not allowed. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16netfilter: ctnetlink: cleanup master conntrack assignationPablo Neira Ayuso
This patch moves the assignation of the master conntrack to ctnetlink_create_conntrack(), which is where it really belongs. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>