summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-10-21netxen: make local function static.stephen hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21fib: introduce fib_alias_accessed() helperEric Dumazet
Perf tools session at NFWS 2010 pointed out a false sharing on struct fib_alias that can be avoided pretty easily, if we set FA_S_ACCESSED bit only if needed (ie : not already set) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21net/sched: fix missing spinlock initEric Dumazet
Under network load, doing : tc qdisc del dev eth0 root triggers : [ 167.193087] BUG: spinlock bad magic on CPU#3, udpflood/4928 [ 167.193139] lock: c15bc324, .magic: 00000000, .owner: <none>/-1, .owner_cpu: -1 [ 167.193193] Pid: 4928, comm: udpflood Not tainted 2.6.36-rc7-11417-g215340c-dirty #323 [ 167.193245] Call Trace: [ 167.193292] [<c13abaa0>] ? printk+0x18/0x20 [ 167.193342] [<c11afb53>] spin_bug+0xa3/0xf0 [ 167.193389] [<c11afcdd>] do_raw_spin_lock+0x7d/0x160 [ 167.193440] [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0 [ 167.193496] [<c107382b>] ? trace_hardirqs_on+0xb/0x10 [ 167.193545] [<c13ae99a>] _raw_spin_lock+0x3a/0x40 [ 167.193593] [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0 [ 167.193641] [<c1313d4e>] __dev_xmit_skb+0x27e/0x2b0 commit 79640a4ca695 (add additional lock to qdisc to increase throughput) forgot to initialize noop_qdisc and noqueue_qdisc busylock Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ipvs: provide address family for debuggingJulian Anastasov
As skb->protocol is not valid in LOCAL_OUT add parameter for address family in packet debugging functions. Even if ports are not present in AH and ESP change them to use ip_vs_tcpudp_debug_packet to show at least valid addresses as before. This patch removes the last user of skb->protocol in IPVS. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: inherit forwarding method in backupJulian Anastasov
Connections in backup server should inherit the forwarding method from real server. It is a way to fix a problem where the forwarding method in backup connection is damaged by logical OR operation with the real server's connection flags. And the change is needed for setups where the backup server uses different forwarding method for the same real servers. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: changes for local clientJulian Anastasov
This patch deals with local client processing. Prefer LOCAL_OUT hook for scheduling connections from local clients. LOCAL_IN is still supported if the packets are not marked as processed in LOCAL_OUT. The idea to process requests in LOCAL_OUT is to alter conntrack reply before it is confirmed at POST_ROUTING. If the local requests are processed in LOCAL_IN the conntrack can not be updated and matching by state is impossible. Add the following handlers: - ip_vs_reply[46] at LOCAL_IN:99 to process replies from remote real servers to local clients. Now when both replies from remote real servers (ip_vs_reply*) and local real servers (ip_vs_local_reply*) are handled it is safe to remove the conn_out_get call from ip_vs_in because it does not support related ICMP packets. - ip_vs_local_request[46] at LOCAL_OUT:-98 to process requests from local client Handling in LOCAL_OUT causes some changes: - as skb->dev, skb->protocol and skb->pkt_type are not defined in LOCAL_OUT make sure we set skb->dev before calling icmpv6_send, prefer skb_dst(skb) for struct net and remove the skb->protocol checks from TUN transmitters. [ horms@verge.net.au: removed trailing whitespace ] Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: changes for local real serverJulian Anastasov
This patch deals with local real servers: - Add support for DNAT to local address (different real server port). It needs ip_vs_out hook in LOCAL_OUT for both families because skb->protocol is not set for locally generated packets and can not be used to set 'af'. - Skip packets in ip_vs_in marked with skb->ipvs_property because ip_vs_out processing can be executed in LOCAL_OUT but we still have the conn_out_get check in ip_vs_in. - Ignore packets with inet->nodefrag from local stack - Require skb_dst(skb) != NULL because we use it to get struct net - Add support for changing the route to local IPv4 stack after DNAT depending on the source address type. Local client sets output route and the remote client sets input route. It looks like IPv6 does not need such rerouting because the replies use addresses from initial incoming header, not from skb route. - All transmitters now have strict checks for the destination address type: redirect from non-local address to local real server requires NAT method, local address can not be used as source address when talking to remote real server. - Now LOCALNODE is not set explicitly as forwarding method in real server to allow the connections to provide correct forwarding method to the backup server. Not sure if this breaks tools that expect to see 'Local' real server type. If needed, this can be supported with new flag IP_VS_DEST_F_LOCAL. Now it should be possible connections in backup that lost their fwmark information during sync to be forwarded properly to their daddr, even if it is local address in the backup server. By this way backup could be used as real server for DR or TUN, for NAT there are some restrictions because tuple collisions in conntracks can create problems for the traffic. - Call ip_vs_dst_reset when destination is updated in case some real server IP type is changed between local and remote. [ horms@verge.net.au: removed trailing whitespace ] Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: move ip_route_me_harder for ICMPJulian Anastasov
Currently, ip_route_me_harder after ip_vs_out_icmp is called even if packet is not related to IPVS connection. Move it into handle_response_icmp. Also, force rerouting if sending to local client because IPv4 stack uses addresses from the route. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: create ip_vs_defrag_userJulian Anastasov
Create new function ip_vs_defrag_user to return correct IP_DEFRAG_xxx user depending on the hooknum. It will be needed when we add handlers in LOCAL_OUT. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: fix CHECKSUM_PARTIAL for TUN methodJulian Anastasov
The recent change in IP_VS_XMIT_TUNNEL to set CHECKSUM_NONE is not correct. After adding IPIP header skb->csum becomes invalid but the CHECKSUM_PARTIAL case must be supported. So, use skb_forward_csum() which is most suitable for us to allow local clients to send IPIP to remote real server. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: stop ICMP from FORWARD to localJulian Anastasov
Delivering locally ICMP from FORWARD hook is not supported. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: do not schedule conns from real serversJulian Anastasov
This patch is needed to avoid scheduling of packets from local real server when we add ip_vs_in in LOCAL_OUT hook to support local client. Currently, when ip_vs_in can not find existing connection it tries to create new one by calling ip_vs_schedule. The default indication from ip_vs_schedule was if connection was scheduled to real server. If real server is not available we try to use the bypass forwarding method or to send ICMP error. But in some cases we do not want to use the bypass feature. So, add flag 'ignored' to indicate if the scheduler ignores this packet. Make sure we do not create new connections from replies. We can hit this problem for persistent services and local real server when ip_vs_in is added to LOCAL_OUT hook to handle local clients. Also, make sure ip_vs_schedule ignores SYN packets for Active FTP DATA from local real server. The FTP DATA connection should be created on SYN+ACK from client to assign correct connection daddr. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: switch to notrack modeJulian Anastasov
Change skb->ipvs_property semantic. This is preparation to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1 will be used to avoid expensive lookups for traffic sent by transmitters. Now when conntrack support is not used we call ip_vs_notrack method to avoid problems in OUTPUT and POST_ROUTING hooks instead of exiting POST_ROUTING as before. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: optimize checksums for appsJulian Anastasov
Avoid full checksum calculation for apps that can provide info whether csum was broken after payload mangling. For now only ip_vs_ftp mangles payload and it updates the csum, so the full recalculation is avoided for all packets. Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP). It is needed to support SNAT from local address for the case when csum is fully recalculated. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21ipvs: fix CHECKSUM_PARTIAL for TCP, UDPJulian Anastasov
Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP, UDP not tested because it needs network card with HW CSUM support. May be fixes problem where IPVS can not be used in virtual boxes. Problem appears with DNAT to local address when the local stack sends reply in CHECKSUM_PARTIAL mode. Fix tcp_dnat_handler and udp_dnat_handler to provide vaddr and daddr in right order (old and new IP) when calling tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL). Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2010-10-21r8169: print errors when dma mapping failStanislaw Gruszka
If dma mapping fail we are dropping packages or fail to open device. But exact reason of drop/fail stays unknow for a user, so print errors. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: (re)init phy on resumeStanislaw Gruszka
Fix switching device to low-speed mode after resume reported in: https://bugzilla.redhat.com/show_bug.cgi?id=502974 Reported-and-tested-by: Laurentiu Badea <bugzilla-redhat@wotevah.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: changing mtu clean upStanislaw Gruszka
Since we do not change rx buffer size any longer, we can clean up rtl8169_change_mtu and in consequence rtl8169_down. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: do not account fragments as packetsStanislaw Gruszka
Only increase tx_{packets,dropped} statistics when transmit or drop full skb, not just fragment. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: use pointer to struct device as local variableStanislaw Gruszka
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: replace PCI_DMA_{TO,FROM}DEVICE to DMA_{TO,FROM}_DEVICEStanislaw Gruszka
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: init rx ring cleanupStanislaw Gruszka
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21r8169: check dma mapping failuresStanislaw Gruszka
Check possible dma mapping errors and do clean up if it happens. Fix overwrap bug in rtl8169_tx_clear on the way. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21block: Turn bvec_k{un,}map_irq() into static inline functionsGeert Uytterhoeven
Convert bvec_k{un,}map_irq() from macros to static inline functions if !CONFIG_HIGHMEM, so we can easier detect mistakes like the one fixed in 93055c31045a2d5599ec613a0c6cdcefc481a460 ("ps3disk: passing wrong variable = to bvec_kunmap_irq()") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-21bnx2x: Update bnx2x to use new vlan accleration.Hao Zheng
Make the bnx2x driver use the new vlan accleration model. Signed-off-by: Hao Zheng <hzheng@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> CC: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ixgbe: Update ixgbe to use new vlan accleration.Jesse Gross
Make the ixgbe driver use the new vlan accleration model. Signed-off-by: Jesse Gross <jesse@nicira.com> CC: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com> CC: Emil Tantilov <emil.s.tantilov@intel.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21bnx2: Update bnx2 to use new vlan accleration.Jesse Gross
Make the bnx2 driver use the new vlan accleration model. Signed-off-by: Jesse Gross <jesse@nicira.com> CC: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21bridge: Add support for TX vlan offload.Jesse Gross
If some of the underlying devices support it, enable vlan offload on transmit for bridge devices. This allows senders to take advantage of the hardware support, similar to other forms of acceleration. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ethtool: Add support for vlan accleration.Jesse Gross
Now that vlan acceleration is handled consistently regardless of usage, it is possible to enable and disable it at will. This adds support for Ethtool operations that change the offloading status for debugging purposes, similar to other forms of hardware acceleration. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21vlan: Centralize handling of hardware acceleration.Jesse Gross
Currently each driver that is capable of vlan hardware acceleration must be aware of the vlan groups that are configured and then pass the stripped tag to a specialized receive function. This is different from other types of hardware offload in that it places a significant amount of knowledge in the driver itself rather keeping it in the networking core. This makes vlan offloading function more similarly to other forms of offloading (such as checksum offloading or TSO) by doing the following: * On receive, stripped vlans are passed directly to the network core, without attempting to check for vlan groups or reconstructing the header if no group * vlans are made less special by folding the logic into the main receive routines * On transmit, the device layer will add the vlan header in software if the hardware doesn't support it, instead of spreading that logic out in upper layers, such as bonding. There are a number of advantages to this: * Fixes all bugs with drivers incorrectly dropping vlan headers at once. * Avoids having to disable VLAN acceleration when in promiscuous mode (good for bridging since it always puts devices in promiscuous mode). * Keeps VLAN tag separate until given to ultimate consumer, which avoids needing to do header reconstruction as in tg3 unless absolutely necessary. * Consolidates common code in core networking. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21vlan: Avoid hash table lookup to find group.Jesse Gross
A struct net_device always maps to zero or one vlan groups and we always know the device when we are looking up a group. We currently do a hash table lookup on the device to find the group but it is much simpler to just store a pointer. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21vlan: Enable software emulation for vlan accleration.Jesse Gross
Currently users of hardware vlan accleration need to know whether the device supports it before generating packets. However, vlan acceleration will soon be available in a more flexible manner so knowing ahead of time becomes much more difficult. This adds a software fallback path for vlan packets on devices without the necessary offloading support, similar to other types of hardware accleration. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21vlan: Don't check for vlan group before vlan_tx_tag_present.Jesse Gross
Many (but not all) drivers check to see whether there is a vlan group configured before using a tag stored in the skb. There's not much point in this check since it just throws away data that should only be present in the expected circumstances. However, it will soon be legal and expected to get a vlan tag when no vlan group is configured, so remove this check from all drivers to avoid dropping the tags. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21vlan: Rename VLAN_GROUP_ARRAY_LEN to VLAN_N_VID.Jesse Gross
VLAN_GROUP_ARRAY_LEN is simply the number of possible vlan VIDs. Since vlan groups will soon be more of an implementation detail for vlan devices, rename the constant to be descriptive of its actual purpose. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ebtables: Allow filtering of hardware accelerated vlan frames.Jesse Gross
An upcoming commit will allow packets with hardware vlan acceleration information to be passed though more parts of the network stack, including packets trunked through the bridge. This adds support for matching and filtering those packets through ebtables. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21enic: Fix log messageVasanthy Kolluri
Fix a log message Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21enic: Change min MTUVasanthy Kolluri
Change min MTU to 68. Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21enic: Replace firmware devcmd CMD_ENABLE with CMD_ENABLE_WAITVasanthy Kolluri
Replace no wait CMD_ENABLE firmware devcmd with CMD_ENABLE_WAIT Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21enic: Make firmware cognizant of the user set mac addressVasanthy Kolluri
Let the firmware know about the mac address set by the user using ndo_set_mac_address Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21enic: Add support for multiple hardware receive queuesVasanthy Kolluri
Add support for multiple hardware receive queues. The ingress traffic is hashed into one of the receive queues based on IP or TCP or both headers. The max no. of receive queues supported is 8. Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ibmveth: Free irq on error pathDenis Kirjanov
Free irq on error path. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21ibmveth: Cleanup error handling inside ibmveth_openDenis Kirjanov
Remove duplicated code in one place. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21UBI: tighten the corrupted PEB criteriaArtem Bityutskiy
If we get a bit-flip of ECC error while reading the data area, do not add it to corrupted list, because it is possible that this is just unstable PEB with corruptions caused by unclean reboots. This patch also improves commentaries. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21UBI: fix check_data_ff return codeArtem Bityutskiy
When the data does not contain all 0xFF bytes, 'check_data_ff()' should return 1, not -EINVAL; Also, the caller ('process_eb()') should not add the PEB to the "corrupted" list if there was a read error. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21UBI: remember copy_flag while scanningArtem Bityutskiy
While scanning the flash we read all VID headers and store some important information in 'struct ubi_scan_leb'. Store also the 'copy_flag' value there as it is needed when comparing LEBs. We do not increase memory consumption because this is just one bit and we have plenty of spare bits in 'struct ubi_scan_leb' (sizeof(struct ubi_scan_leb) is 48 both with and without this patch). Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21UBIFS: do not allocate unneeded scan bufferArtem Bityutskiy
In 'ubifs_replay_journal()' we allocate 'sbuf' for scanning the log. However, we already have 'c->sbuf' for these purposes, so do not allocate yet another one. This reduces UBIFS memory consumption while recovering. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21UBIFS: do not forget to cancel timersArtem Bityutskiy
This is a bug-fix: when we unmount, and we are currently in R/O mode because of an error - we do not sync write-buffers, which means we also do not cancel write-buffer timers we may possibly have armed. This patch fixes the issue. The issue can easily be reproduced by enabling UBIFS failure debug mode (echo 4 > /sys/module/ubifs/parameters/debug_tsts) and unmounting as soon as a failure happen. At some point the system oopses because we have an armed hrtimer but UBIFS is unmounted already. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21UBIFS: remove a bit of unneeded codeArtem Bityutskiy
This is a clean-up patch which: 1. Removes explicite 'hrtimer_cancel()' after 'ubifs_wbuf_sync()' in 'ubifs_remount_ro()', because the timers will be canceled by 'ubifs_wbuf_sync()', no need to cancel them for the second time. 2. Remove "if (c->jheads)" check from 'ubifs_put_super()', because at journal heads must always be allocated there, since we checked earlier that we were mounted R/W, and the olny situation when journal heads are not allocated is when mounter or re-mounted R/O. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-10-21Merge branch 'vhost-net' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
2010-10-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6