summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2010-06-22netfilter: xt_IDLETIMER needs kdev_t.hRandy Dunlap
Add header file to fix build error: net/netfilter/xt_IDLETIMER.c:276: error: implicit declaration of function 'MKDEV' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-22IPVS: one-packet schedulingNick Chalk
Allow one-packet scheduling for UDP connections. When the fwmark-based or normal virtual service is marked with '-o' or '--ops' options all connections are created only to schedule one packet. Useful to schedule UDP packets from same client port to different real servers. Recommended with RR or WRR schedulers (the connections are not visible with ipvsadm -L). Signed-off-by: Nick Chalk <nick@loadbalancer.org> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-21udp: Fix bogus UFO packet generationHerbert Xu
It has been reported that the new UFO software fallback path fails under certain conditions with NFS. I tracked the problem down to the generation of UFO packets that are smaller than the MTU. The software fallback path simply discards these packets. This patch fixes the problem by not generating such packets on the UFO path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-21mac80211: Add interface for driver to temporarily disable dynamic psJuuso Oikarinen
This mechanism introduced in this patch applies (at least) for hardware designs using a single shared antenna for both WLAN and BT. In these designs, the antenna must be toggled between WLAN and BT. In those hardware, managing WLAN co-existence with Bluetooth requires WLAN full power save whenever there is Bluetooth activity in order for WLAN to be able to periodically relinquish the antenna to be used for BT. This is because BT can only access the shared antenna when WLAN is idle or asleep. Some hardware, for instance the wl1271, are able to indicate to the host whenever there is BT traffic. In essence, the hardware will send an indication to the host whenever there is, for example, SCO traffic or A2DP traffic, and will send another indication when the traffic is over. The hardware gets information of Bluetooth traffic via hardware co-existence control lines - these lines are used to negotiate the shared antenna ownership. The hardware will give the antenna to BT whenever WLAN is sleeping. This patch adds the interface to mac80211 to facilitate temporarily disabling of dynamic power save as per request of the WLAN driver. This interface will immediately force WLAN to full powersave, hence allowing BT coexistence as described above. In these kind of shared antenna desings, when WLAN powersave is fully disabled, Bluetooth will not work simultaneously with WLAN at all. This patch does not address that problem. This interface will not change PSM state, so if PSM is disabled it will remain so. Solving this problem requires knowledge about BT state, and is best done in user-space. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-21mac80211: Fix compile warning in scan.c.Gertjan van Wingerde
Fix the following compile warning: CC [M] net/mac80211/scan.o net/mac80211/scan.c: In function 'ieee80211_request_internal_scan': net/mac80211/scan.c:749:23: warning: comparison between 'enum nl80211_band' and 'enum ieee80211_band' caused by the local variable band not being of the proper 'ieee80211_band' type. Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-20caif: Add debug connection type for CAIF.Sjur Braendeland
Added new CAIF protocol type CAIFPROTO_DEBUG for accessing CAIF debug on the ST Ericsson modems. There are two debug servers on the modem, one for radio related debug (CAIF_RADIO_DEBUG_SERVICE) and the other for communication/application related debug (CAIF_COM_DEBUG_SERVICE). The debug connection can contain trace debug printouts or interactive debug used for debugging and test. Debug connections can be of type STREAM or SEQPACKET. Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20caif: Use link layer MTU instead of fixed MTUSjur Braendeland
Previously CAIF supported maximum transfer size of ~4050. The transfer size is now calculated dynamically based on the link layers mtu size. Signed-off-by: Sjur Braendeland@stericsson.com Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20caif: Bugfix - RFM must support segmentation.Sjur Braendeland
CAIF Remote File Manager may send or receive more than 4050 bytes. Due to this The CAIF RFM service have to support segmentation. Signed-off-by: Sjur Braendeland@stericsson.com Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20caif: Bugfix not all services uses flow-ctrl.Sjur Braendeland
Flow control is not used by all CAIF services. The usage of flow control is now part of the gerneal initialization function for CAIF Services. Signed-off-by: Sjur Braendeland@stericsson.com Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-18mac80211: fix sw scan bracketingJohannes Berg
Currently, detection in hwsim and ath9k can detect that two sw scans are in flight at the same time, which isn't really true. It is caused by a race condition, because the scan complete callback is called too late, after the lock has been dropped, so that a new scan can be started before it is called. It is also called too early semantically, as it is currently called _after_ the return to the operating channel -- it should be before so that drivers know this is the operating channel again. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-18wireless: move regulatory_init to .init.textUwe Kleine-König
regulatory_init is only called by cfg80211_init which is in .init.text, too. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-18cfg80211: move cfg80211_exit to .exit.textUwe Kleine-König
cfg80211_exit is only used as module_exit function, so it can go to .exit.text saving a few bytes when CONFIG_CFG80211=y. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-17Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2010-06-17bridge: fdb cleanup runs too oftenstephen hemminger
It is common in end-node, non STP bridges to set forwarding delay to zero; which causes the forwarding database cleanup to run every clock tick. Change to run only as soon as needed or at next ageing timer interval which ever is sooner. Use round_jiffies_up macro rather than attempting round up by changing value. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-17Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 Conflicts: net/mac80211/mlme.c
2010-06-17netfilter: nf_nat: support user-specified SNAT rules in LOCAL_INPatrick McHardy
2.6.34 introduced 'conntrack zones' to deal with cases where packets from multiple identical networks are handled by conntrack/NAT. Packets are looped through veth devices, during which they are NATed to private addresses, after which they can continue normally through the stack and possibly have NAT rules applied a second time. This works well, but is needlessly complicated for cases where only a single SNAT/DNAT mapping needs to be applied to these packets. In that case, all that needs to be done is to assign each network to a seperate zone and perform NAT as usual. However this doesn't work for packets destined for the machine performing NAT itself since its corrently not possible to configure SNAT mappings for the LOCAL_IN chain. This patch adds a new INPUT chain to the NAT table and changes the targets performing SNAT to be usable in that chain. Example usage with two identical networks (192.168.0.0/24) on eth0/eth1: iptables -t raw -A PREROUTING -i eth0 -j CT --zone 1 iptables -t raw -A PREROUTING -i eth0 -j MARK --set-mark 1 iptables -t raw -A PREROUTING -i eth1 -j CT --zone 2 iptabels -t raw -A PREROUTING -i eth1 -j MARK --set-mark 2 iptables -t nat -A INPUT -m mark --mark 1 -j NETMAP --to 10.0.0.0/24 iptables -t nat -A POSTROUTING -m mark --mark 1 -j NETMAP --to 10.0.0.0/24 iptables -t nat -A INPUT -m mark --mark 2 -j NETMAP --to 10.0.1.0/24 iptables -t nat -A POSTROUTING -m mark --mark 2 -j NETMAP --to 10.0.1.0/24 iptables -t raw -A PREROUTING -d 10.0.0.0/24 -j CT --zone 1 iptables -t raw -A OUTPUT -d 10.0.0.0/24 -j CT --zone 1 iptables -t raw -A PREROUTING -d 10.0.1.0/24 -j CT --zone 2 iptables -t raw -A OUTPUT -d 10.0.1.0/24 -j CT --zone 2 iptables -t nat -A PREROUTING -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24 iptables -t nat -A OUTPUT -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24 iptables -t nat -A PREROUTING -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24 iptables -t nat -A OUTPUT -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24 Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-16net: Export cred_to_ucred to modules.David S. Miller
AF_UNIX references this, and can be built as a module, so... Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16af_unix: Allow connecting to sockets in other network namespaces.Eric W. Biederman
Remove the restriction that only allows connecting to a unix domain socket identified by unix path that is in the same network namespace. Crossing network namespaces is always tricky and we did not support this at first, because of a strict policy of don't mix the namespaces. Later after Pavel proposed this we did not support this because no one had performed the audit to make certain using unix domain sockets across namespaces is safe. What fundamentally makes connecting to af_unix sockets in other namespaces is safe is that you have to have the proper permissions on the unix domain socket inode that lives in the filesystem. If you want strict isolation you just don't create inodes where unfriendlys can get at them, or with permissions that allow unfriendlys to open them. All nicely handled for us by the mount namespace and other standard file system facilities. I looked through unix domain sockets and they are a very controlled environment so none of the work that goes on in dev_forward_skb to make crossing namespaces safe appears needed, we are not loosing controll of the skb and so do not need to set up the skb to look like it is comming in fresh from the outside world. Further the fields in struct unix_skb_parms should not have any problems crossing network namespaces. Now that we handle SCM_CREDENTIALS in a way that gives useable values across namespaces. There does not appear to be any operational problems with encouraging the use of unix domain sockets across containers either. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16af_unix: Allow credentials to work across user and pid namespaces.Eric W. Biederman
In unix_skb_parms store pointers to struct pid and struct cred instead of raw uid, gid, and pid values, then translate the credentials on reception into values that are meaningful in the receiving processes namespaces. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16scm: Capture the full credentials of the scm sender.Eric W. Biederman
Start capturing not only the userspace pid, uid and gid values of the sending process but also the struct pid and struct cred of the sending process as well. This is in preparation for properly supporting SCM_CREDENTIALS for sockets that have different uid and/or pid namespaces at the different ends. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16af_netlink: Add needed scm_destroy after scm_send.Eric W. Biederman
scm_send occasionally allocates state in the scm_cookie, so I have modified netlink_sendmsg to guarantee that when scm_send succeeds scm_destory will be called to free that state. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16af_unix: Allow SO_PEERCRED to work across namespaces.Eric W. Biederman
Use struct pid and struct cred to store the peer credentials on struct sock. This gives enough information to convert the peer credential information to a value relative to whatever namespace the socket is in at the time. This removes nasty surprises when using SO_PEERCRED on socket connetions where the processes on either side are in different pid and user namespaces. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16sock: Introduce cred_to_ucredEric W. Biederman
To keep the coming code clear and to allow both the sock code and the scm code to share the logic introduce a fuction to translate from struct cred to struct ucred. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16Clear IFF_XMIT_DST_RELEASE for teql interfacesTom Hughes
https://bugzilla.kernel.org/show_bug.cgi?id=16183 The sch_teql module, which can be used to load balance over a set of underlying interfaces, stopped working after 2.6.30 and has been broken in all kernels since then for any underlying interface which requires the addition of link level headers. The problem is that the transmit routine relies on being able to access the destination address in the skb in order to do address resolution once it has decided which underlying interface it is going to transmit through. In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by default for all interfaces, which causes the destination address to be released before the transmit routine for the interface is called. The solution is to clear that flag for teql interfaces. Signed-off-by: Tom Hughes <tom@compton.nu> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16syncookies: check decoded options against sysctl settingsFlorian Westphal
Discard the ACK if we find options that do not match current sysctl settings. Previously it was possible to create a connection with sack, wscale, etc. enabled even if the feature was disabled via sysctl. Also remove an unneeded call to tcp_sack_reset() in cookie_check_timestamp: Both call sites (cookie_v4_check, cookie_v6_check) zero "struct tcp_options_received", hand it to tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack) and then call cookie_check_timestamp(). Even if num_sacks/dsacks were changed, the structure is allocated on the stack and after cookie_check_timestamp returns only a few selected members are copied to the inet_request_sock. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-06-16mac80211: fix warn, enum may be used uninitializedChristoph Fritz
regression introduced by b8d92c9c141ee3dc9b3537b1f0ffb4a54ea8d9b2 In function ‘ieee80211_work_rx_queued_mgmt’: warning: ‘rma’ may be used uninitialized in this function this re-adds default value WORK_ACT_NONE back to rma Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-16inetpeer: restore small inet_peer structuresEric Dumazet
Addition of rcu_head to struct inet_peer added 16bytes on 64bit arches. Thats a bit unfortunate, since old size was exactly 64 bytes. This can be solved, using an union between this rcu_head an four fields, that are normally used only when a refcount is taken on inet_peer. rcu_head is used only when refcnt=-1, right before structure freeing. Add a inet_peer_refcheck() function to check this assertion for a while. We can bring back SLAB_HWCACHE_ALIGN qualifier in kmem cache creation. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16Merge branch 'master' into for-nextJiri Kosina
2010-06-16fix typos concerning "initiali[zs]e"Uwe Kleine-König
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-15inetpeer: do not use zero refcnt for freed entriesEric Dumazet
Followup of commit aa1039e73cc2 (inetpeer: RCU conversion) Unused inet_peer entries have a null refcnt. Using atomic_inc_not_zero() in rcu lookups is not going to work for them, and slow path is taken. Fix this using -1 marker instead of 0 for deleted entries. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15bridge: Add const to dummy br_netpoll_send_skbHerbert Xu
The version of br_netpoll_send_skb used when netpoll is off is missing a const thus causing a warning. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15bridge: Fix OOM crash in deliver_cloneHerbert Xu
The bridge multicast patches introduced an OOM crash in the forward path, when deliver_clone fails to clone the skb. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15ipfrag : frag_kfree_skb() cleanupEric Dumazet
Third param (work) is unused, remove it. Remove __inline__ and inline qualifiers. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15ip_frag: Remove some atomic opsEric Dumazet
Instead of doing one atomic operation per frag, we can factorize them. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15ipv6: syncookies: do not skip ->iif initializationFlorian Westphal
When syncookies are in effect, req->iif is left uninitialized. In case of e.g. link-local addresses the route lookup then fails and no syn-ack is sent. Rearrange things so ->iif is also initialized in the syncookie case. want_cookie can only be true when the isn was zero, thus move the want_cookie check into the "!isn" branch. Cc: Glenn Griffin <ggriffin.kernel@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15inetpeer: RCU conversionEric Dumazet
inetpeer currently uses an AVL tree protected by an rwlock. It's possible to make most lookups use RCU 1) Add a struct rcu_head to struct inet_peer 2) add a lookup_rcu_bh() helper to perform lockless and opportunistic lookup. This is a normal function, not a macro like lookup(). 3) Add a limit to number of links followed by lookup_rcu_bh(). This is needed in case we fall in a loop. 4) add an smp_wmb() in link_to_pool() right before node insert. 5) make unlink_from_pool() use atomic_cmpxchg() to make sure it can take last reference to an inet_peer, since lockless readers could increase refcount, even while we hold peers.lock. 6) Delay struct inet_peer freeing after rcu grace period so that lookup_rcu_bh() cannot crash. 7) inet_getpeer() first attempts lockless lookup. Note this lookup can fail even if target is in AVL tree, but a concurrent writer can let tree in a non correct form. If this attemps fails, lock is taken a regular lookup is performed again. 8) convert peers.lock from rwlock to a spinlock 9) Remove SLAB_HWCACHE_ALIGN when peer_cachep is created, because rcu_head adds 16 bytes on 64bit arches, doubling effective size (64 -> 128 bytes) In a future patch, this is probably possible to revert this part, if rcu field is put in an union to share space with rid, ip_id_count, tcp_ts & tcp_ts_stamp. These fields being manipulated only with refcnt > 0. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2010-06-15mac80211: Use a separate CCMP PN receive counter for management framesJouni Malinen
When management frame protection (IEEE 802.11w) is used, we must use a separate counter for tracking received CCMP packet number for the management frames. The previously used NUM_RX_DATA_QUEUESth queue was shared with data frames when QoS was not used and that can cause problems in detecting replays incorrectly for robust management frames. Add a new counter just for robust management frames to avoid this issue. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-15mac80211: Protect Deauthentication frame when using MFPJouni Malinen
When management frame protection (IEEE 802.11w) is used, Deauthentication frame needs to be protected when the pairwise key is configured. mac80211 was removing the station entry (and its keys) before actually sending out the Deauthentication frame. Fix this by reordering the code to send the frame before the station entry gets removed. This matches an earlier change that handled the Disassociation frame processing, but missed Deauthentication frames. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-15mac80211: Fix ps-qos network latency handlingJuuso Oikarinen
The ps-qos latency handling is broken. It uses predetermined latency values to select specific dynamic PS timeouts. With common AP configurations, these values overlap with beacon interval and are therefore essentially useless (for network latencies less than the beacon interval, PSM is disabled.) This patch remedies the problem by replacing the predetermined network latency values with one high value (1900ms) which is used to go trigger full psm. For backwards compatibility, the value 2000ms is still mapped to a dynamic ps timeout of 100ms. Currently also the mac80211 internal value for storing user space configured dynamic PSM values is incorrectly in the driver visible ieee80211_conf struct. Move it to the ieee80211_local struct. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-15Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-06-15tcp: unify tcp flag macrosChangli Gao
unify tcp flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH, TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. TCBCB_FLAG_* are replaced with the corresponding TCPHDR_*. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- include/net/tcp.h | 24 ++++++------- net/ipv4/tcp.c | 8 ++-- net/ipv4/tcp_input.c | 2 - net/ipv4/tcp_output.c | 59 ++++++++++++++++----------------- net/netfilter/nf_conntrack_proto_tcp.c | 32 ++++++----------- net/netfilter/xt_TCPMSS.c | 4 -- 6 files changed, 58 insertions(+), 71 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15bridge: use rx_handler_data pointer to store net_bridge_port pointerJiri Pirko
Register net_bridge_port pointer as rx_handler data pointer. As br_port is removed from struct net_device, another netdev priv_flag is added to indicate the device serves as a bridge port. Also rcuized pointers are now correctly dereferenced in br_fdb.c and in netfilter parts. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15net: add rx_handler data pointerJiri Pirko
Add possibility to register rx_handler data pointer along with a rx_handler. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15bridge: Fix netpoll supportHerbert Xu
There are multiple problems with the newly added netpoll support: 1) Use-after-free on each netpoll packet. 2) Invoking unsafe code on netpoll/IRQ path. 3) Breaks when netpoll is enabled on the underlying device. This patch fixes all of these problems. In particular, we now allocate proper netpoll structures for each underlying device. We only allow netpoll to be enabled on the bridge when all the devices underneath it support netpoll. Once it is enabled, we do not allow non-netpoll devices to join the bridge (until netpoll is disabled again). This allows us to do away with the npinfo juggling that caused problem number 1. Incidentally this patch fixes number 2 by bypassing unsafe code such as multicast snooping and netfilter. Reported-by: Qianfeng Zhang <frzhang@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15netpoll: Allow netpoll_setup/cleanup recursionHerbert Xu
This patch adds the functions __netpoll_setup/__netpoll_cleanup which is designed to be called recursively through ndo_netpoll_seutp. They must be called with RTNL held, and the caller must initialise np->dev and ensure that it has a valid reference count. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15netpoll: Add ndo_netpoll_setupHerbert Xu
This patch adds ndo_netpoll_setup as the initialisation primitive to complement ndo_netpoll_cleanup. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15netpoll: Add locking for netpoll_setup/cleanupHerbert Xu
As it stands, netpoll_setup and netpoll_cleanup have no locking protection whatsoever. So chaos ensures if two entities try to perform them on the same device. This patch adds RTNL to the equation. The code has been rearranged so that bits that do not need RTNL protection are now moved to the top of netpoll_setup. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-15netpoll: Fix RCU usageHerbert Xu
The use of RCU in netpoll is incorrect in a number of places: 1) The initial setting is lacking a write barrier. 2) The synchronize_rcu is in the wrong place. 3) Read barriers are missing. 4) Some places are even missing rcu_read_lock. 5) npinfo is zeroed after freeing. This patch fixes those issues. As most users are in BH context, this also converts the RCU usage to the BH variant. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>