linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-01-29	tuntap: allow polling/writing/reading when detached	Jason Wang
	We forbid polling, writing and reading when the file were detached, this may complex the user in several cases: - when guest pass some buffers to vhost/qemu and then disable some queues, host/qemu needs to do its own cleanup on those buffers which is complex sometimes. We can do this simply by allowing a user can still write to an disabled queue. Write to an disabled queue will cause the packet pass to the kernel and read will get nothing. - align the polling behavior with macvtap which never fails when the queue is created. This can simplify the polling errors handling of its user (e.g vhost) We can simply achieve this by don't assign NULL to tfile->tun when detached. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	vhost_net: handle polling errors when setting backend	Jason Wang
	Currently, the polling errors were ignored, which can lead following issues: - vhost remove itself unconditionally from waitqueue when stopping the poll, this may crash the kernel since the previous attempt of starting may fail to add itself to the waitqueue - userspace may think the backend were successfully set even when the polling failed. Solve this by: - check poll->wqh before trying to remove from waitqueue - report polling errors in vhost_poll_start(), tx_poll_start(), the return value will be checked and returned when userspace want to set the backend After this fix, there still could be a polling failure after backend is set, it will addressed by the next patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	vhost_net: correct error handling in vhost_net_set_backend()	Jason Wang
	Currently, when vhost_init_used() fails the sock refcnt and ubufs were leaked. Correct this by calling vhost_init_used() before assign ubufs and restore the oldsock when it fails. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	tun: fix carrier on/off status	Michael S. Tsirkin
	Commit c8d68e6be1c3b242f1c598595830890b65cea64a removed carrier off call from tun_detach since it's now called on queue disable and not only on tun close. This confuses userspace which used this flag to detect a free tun. To fix, put this back but under if (clean). Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	pktgen: correctly handle failures when adding a device	Cong Wang
	The return value of pktgen_add_device() is not checked, so even if we fail to add some device, for example, non-exist one, we still see "OK:...". This patch fixes it. After this patch, I got: # echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0 -bash: echo: write error: No such device # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: Result: ERROR: can not add device non-exist # echo "add_device eth0" > /proc/net/pktgen/kpktgend_0 # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: eth0 Result: OK: add_device=eth0 (Candidate for -stable) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	netem: fix delay calculation in rate extension	Johannes Naab
	The delay calculation with the rate extension introduces in v3.3 does not properly work, if other packets are still queued for transmission. For the delay calculation to work, both delay types (latency and delay introduces by rate limitation) have to be handled differently. The latency delay for a packet can overlap with the delay of other packets. The delay introduced by the rate however is separate, and can only start, once all other rate-introduced delays finished. Latency delay is from same distribution for each packet, rate delay depends on the packet size. .: latency delay -: rate delay x: additional delay we have to wait since another packet is currently transmitted .....---- Packet 1 .....xx------ Packet 2 .....------ Packet 3 ^^^^^ latency stacks ^^ rate delay doesn't stack ^^ latency stacks -----> time When a packet is enqueued, we first consider the latency delay. If other packets are already queued, we can reduce the latency delay until the last packet in the queue is send, however the latency delay cannot be <0, since this would mean that the rate is overcommitted. The new reference point is the time at which the last packet will be send. To find the time, when the packet should be send, the rate introduces delay has to be added on top of that. Signed-off-by: Johannes Naab <jn@stusta.de> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	l2tp: prevent l2tp_tunnel_delete racing with userspace close	Tom Parkin
	If a tunnel socket is created by userspace, l2tp hooks the socket destructor in order to clean up resources if userspace closes the socket or crashes. It also caches a pointer to the struct sock for use in the data path and in the netlink interface. While it is safe to use the cached sock pointer in the data path, where the skb references keep the socket alive, it is not safe to use it elsewhere as such access introduces a race with userspace closing the socket. In particular, l2tp_tunnel_delete is prone to oopsing if a multithreaded userspace application closes a socket at the same time as sending a netlink delete command for the tunnel. This patch fixes this oops by forcing l2tp_tunnel_delete to explicitly look up a tunnel socket held by userspace using sockfd_lookup(). Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller
	Bring in the 'net' tree so that we can get some ipv4/ipv6 bug fixes that some net-next work will build upon. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	ipv6: add anti-spoofing checks for 6to4 and 6rd	Hannes Frederic Sowa
	This patch adds anti-spoofing checks in sit.c as specified in RFC3964 section 5.2 for 6to4 and RFC5969 section 12 for 6rd. I left out the checks which could easily be implemented with netfilter. Specifically this patch adds following logic (based loosely on the pseudocode in RFC3964 section 5.2): if prefix (inner_src_v6) == rd6_prefix (2002::/16 is the default) and outer_src_v4 != embedded_ipv4 (inner_src_v6) drop if prefix (inner_dst_v6) == rd6_prefix (or 2002::/16 is the default) and outer_dst_v4 != embedded_ipv4 (inner_dst_v6) drop accept To accomplish the specified security checks proposed by above RFCs, it is still necessary to employ uRPF filters with netfilter. These new checks only kick in if the employed addresses are within the 2002::/16 or another range specified by the 6rd-prefix (which defaults to 2002::/16). Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	gianfar: Pack struct gfar_priv_grp into three cachelines	Claudiu Manoil
	* remove unused members(!): imask, ievent * move space consuming interrupt name strings (int_name_* members) to external structures, unessential for the driver's hot path * keep high priority hot path data within the first 2 cache lines This reduces struct gfar_priv_grp from 6 to 3 cache lines. (Also fixed checkpatch warnings for the old code, in the process.) Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	gianfar: Cleanup gfar_parse_group() code	Claudiu Manoil
	Factor out redundant code (improve readability, source code size). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	gianfar: Optimize struct gfar_priv_tx_q for two cache lines	Claudiu Manoil
	Resize and regroup structure members to eliminate memory holes and to pack the structure into 2 cache lines (from 3). tx_ring_size was resized from 4 to 2 bytes and few members were re-grouped in order to eliminate byte holes and achieve compactness. Where possible, few members were grouped according to their usage and access order (i.e. start_xmit vs. clean_tx_ring members), less important members were pushed at the end. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	ipv6: Fix inet6_csk_bind_conflict so it builds with user namespaces enabled	Eric W. Biederman
	When attempting to build linux-next with user namespaces enabled I ran into this fun build error. CC net/ipv6/inet6_connection_sock.o .../net/ipv6/inet6_connection_sock.c: In function ‘inet6_csk_bind_conflict’: .../net/ipv6/inet6_connection_sock.c:37:12: error: incompatible types when initializing type ‘int’ using type ‘kuid_t’ .../net/ipv6/inet6_connection_sock.c:54:30: error: incompatible type for argument 1 of ‘uid_eq’ .../include/linux/uidgid.h:48:20: note: expected ‘kuid_t’ but argument is of type ‘int’ make[3]: * [net/ipv6/inet6_connection_sock.o] Error 1 make[2]: * [net/ipv6] Error 2 make[2]: *** Waiting for unfinished jobs.... Using kuid_t instead of int to hold the uid fixes this. Cc: Tom Herbert <therbert@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	pktgen: support net namespace	Cong Wang
	v3: make pktgen_threads list per-namespace v2: remove a useless check This patch add net namespace to pktgen, so that we can use pktgen in different namespaces. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	mac80211: dynamic short slot time for MBSSs	Thomas Pedersen
	The standard mandates mesh STAs to set the ERP Short Slot Time capability info bit in beacons to 0. Even though this is their way of disallowing short slot time for mesh STAs, there should be no harm in enabling it if we determine all STAs in the current MBSS support ERP rates. Increases throughput about 20% for legacy rates when enabled. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-29	net: fec: add napi support to improve proformance	Frank Li
	Add napi support Before this patch iperf -s -i 1 ------------------------------------------------------------ Server listening on TCP port 5001 TCP window size: 85.3 KByte (default) ------------------------------------------------------------ [ 4] local 10.192.242.153 port 5001 connected with 10.192.242.138 port 50004 [ ID] Interval Transfer Bandwidth [ 4] 0.0- 1.0 sec 41.2 MBytes 345 Mbits/sec [ 4] 1.0- 2.0 sec 43.7 MBytes 367 Mbits/sec [ 4] 2.0- 3.0 sec 42.8 MBytes 359 Mbits/sec [ 4] 3.0- 4.0 sec 43.7 MBytes 367 Mbits/sec [ 4] 4.0- 5.0 sec 42.7 MBytes 359 Mbits/sec [ 4] 5.0- 6.0 sec 43.8 MBytes 367 Mbits/sec [ 4] 6.0- 7.0 sec 43.0 MBytes 361 Mbits/sec After this patch [ 4] 2.0- 3.0 sec 51.6 MBytes 433 Mbits/sec [ 4] 3.0- 4.0 sec 51.8 MBytes 435 Mbits/sec [ 4] 4.0- 5.0 sec 52.2 MBytes 438 Mbits/sec [ 4] 5.0- 6.0 sec 52.1 MBytes 437 Mbits/sec [ 4] 6.0- 7.0 sec 52.1 MBytes 437 Mbits/sec [ 4] 7.0- 8.0 sec 52.3 MBytes 439 Mbits/sec Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	ethoc: Cleanup driver format	Barry Grussling
	Cleanup the format of ethoc.c to meet network driver style as per checkpatch.pl. Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	ip_gre: When TOS is inherited, use configured TOS value for non-IP packets	David Ward
	A GRE tunnel can be configured so that outgoing tunnel packets inherit the value of the TOS field from the inner IP header. In doing so, when a non-IP packet is transmitted through the tunnel, the TOS field will always be set to 0. Instead, the user should be able to configure a different TOS value as the fallback to use for non-IP packets. This is helpful when the non-IP packets are all control packets and should be handled by routers outside the tunnel as having Internet Control precedence. One example of this is the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are encapsulated directly by GRE and do not contain an inner IP header. Under the existing behavior, the IFLA_GRE_TOS parameter must be set to '1' for the TOS value to be inherited. Now, only the least significant bit of this parameter must be set to '1', and when a non-IP packet is sent through the tunnel, the upper 6 bits of this same parameter will be copied into the TOS field. (The ECN bits get masked off as before.) This behavior is backwards-compatible with existing configurations and iproute2 versions. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	Merge branch 'for-linville' of ↵	John W. Linville
	git://git.kernel.org/pub/scm/linux/kernel/git/luca/wl12xx Conflicts: drivers/net/wireless/ti/wlcore/main.c
2013-01-29	ipv4: introduce address lifetime	Jiri Pirko
	There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	Merge branch 'ipfrags'	David S. Miller
	Jesper Dangaard Brouer says: ==================== This patchset is V2, with some trivial code fixes, which were noticed by DaveM. It is still a partly respin of my fragmentation optimization patches: http://thread.gmane.org/gmane.linux.network/250914 This is not the complete patchset, from the gmane link above. In this patchset, I primarily focus on adjusting cacheline for better SMP/NUMA performance. Once this patchset have been agreed upon, I will continue and respin the rest of my patches. This time around, I have created a frag DoS generator, via the tool trafgen (http://netsniff-ng.org/). To create a stable DoS scenario (no longer relying on frame dropping due to disabled flow-control). Two 10G interfaces are under-test, and uses Ethernet flow-control. A third interface is used for generating the DoS attack (this interface is also 10G, but it does not need to be, as 500Kpps DoS is enough). Test types summary (netperf): Test-20G64K == 2x10G with 65K fragments Test-20G3F == 2x10G with 3x fragments (31472 bytes) Test-20G64K+DoS == Same as 20G64K with frag DoS Test-20G3F+DoS == Same as 20G3F with frag DoS Patch list: Patch-01 - net: cacheline adjust struct netns_frags for better frag performance Patch-02 - net: cacheline adjust struct inet_frags for better frag performance Patch-03 - net: cacheline adjust struct inet_frag_queue Patch-04 - net: frag helper functions for mem limit tracking Patch-05 - net: use lib/percpu_counter API for fragmentation mem accounting Patch-06 - net: frag, move LRU list maintenance outside of rwlock Performance table summary: Test-type: Test-20G64K Test-20G3F 20G64K+DoS 20G3F+DoS ---------- ----------- ---------- ---------- --------- net-next: 15114.5 Mbit/s 8954.21 2444.28 3918.01 Mbit/s Patch-01: 16075.8 Mbit/s 8976.18 2621.49 4072.79 Mbit/s Patch-02: 17806.9 Mbit/s 9280.32 2478.62 4274.59 Mbit/s Patch-03: 17317.4 Mbit/s 9308.62 2546.05 4336.59 Mbit/s Patch-04: 17635.9 Mbit/s 9256.16 2535.25 4327.63 Mbit/s Patch-05: 18027.0 Mbit/s 9918.99 2492.62 3621.68 Mbit/s Patch-06: 18486.7 Mbit/s 10723.20 3657.85 4560.64 Mbit/s I cannot explain the under-DoS regression that patch-05/percpu_counter introduces. But patch-06/LRU-lock corrects the situation again. Below is a testlab setup description, with links to the trafgen DoS packet config used. Testlab ======= Server setup ------------ The machine acting as a server: - 2x CPU (E5-2630) - Thus a NUMA arch/machine - 4x 10Gbit/s ports - NICs 2x Intel Dual port 82599 based (driver ixgbe) Setup: - Interfaces uses Ethernet flow control - Flush all iptables - Remove all iptables related module. - Kill irqbalance - Pin each 10G NIC port to a single* CPU each Pinning can easily be done by command hacks:: for x in /proc/irq//eth8/../smp_affinity_list ; do echo 1 > $x; done for x in /proc/irq//eth9/../smp_affinity_list ; do echo 3 > $x; done for x in /proc/irq//eth31/../smp_affinity_list; do echo 6 > $x; done for x in /proc/irq//eth32/../smp_affinity_list; do echo 8 > $x; done Notice NUMA setting: The CPU to NIC tying is carefully choosen according to the NUMA node setup. Thus, NICs connected to a PCI-e slot that is connected to a physical CPU socket are tied together. Choosing only a single CPU per NIC (port) is just to ease provoking and debugging this performance issue. (In real setups, you can choose more CPU, just remember the NUMA node in the equation). Tools ----- Netperf is used, with option -T to ensure CPU binding. The netserver processes, are NAPI pinned:: numactl -m0 -c0 netserver numactl -m1 -c 1 netserver -p 1337 I now have a frag DoS generator, created via the tool: trafgen (see: http://netsniff-ng.org/) Trafgen packet config file: http://people.netfilter.org/hawk/frag_work/trafgen/frag_packet03_small_frag.txf Notice, I'm using features of trafgen, recently developed by Daniel Borkmann, thus you need the latest git tree to use my trafgen packet config. git://github.com/borkmann/netsniff-ng.git Command line: trafgen --dev eth51 --conf frag_packet03_small_frag.txf -V -k 100 --cpus 2 Tests types ----------- Test(20G64K) UDP-64K 2x 10Gbit/s with no DoS traffic: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ export SIZE=$((65507)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\ netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\ netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \ wait $! && tail -n3 ${LOG}.* && \ tail -n3 ${LOG}.{31,81} \| awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' Test(20G3F) UDP-3xfrags 2x 10Gbit/s with no DoS traffic: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ export SIZE=$((31472)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\ netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\ netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \ wait $! && tail -n3 ${LOG}. && \ tail -n3 ${LOG}.{31,81} \| awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' Awk script for summming results: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ tail -n3 ${LOG}.{31,81} \| awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: frag, move LRU list maintenance outside of rwlock	Jesper Dangaard Brouer
	Updating the fragmentation queues LRU (Least-Recently-Used) list, required taking the hash writer lock. However, the LRU list isn't tied to the hash at all, so we can use a separate lock for it. Original-idea-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: use lib/percpu_counter API for fragmentation mem accounting	Jesper Dangaard Brouer
	Replace the per network namespace shared atomic "mem" accounting variable, in the fragmentation code, with a lib/percpu_counter. Getting percpu_counter to scale to the fragmentation code usage requires some tweaks. At first view, percpu_counter looks superfast, but it does not scale on multi-CPU/NUMA machines, because the default batch size is too small, for frag code usage. Thus, I have adjusted the batch size by using __percpu_counter_add() directly, instead of percpu_counter_sub() and percpu_counter_add(). The batch size is increased to 130.000, based on the largest 64K fragment memory usage. This does introduce some imprecise memory accounting, but its does not need to be strict for this use-case. It is also essential, that the percpu_counter, does not share cacheline with other writers, to make this scale. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: frag helper functions for mem limit tracking	Jesper Dangaard Brouer
	This change is primarily a preparation to ease the extension of memory limit tracking. The change does reduce the number atomic operation, during freeing of a frag queue. This does introduce a some performance improvement, as these atomic operations are at the core of the performance problems seen on NUMA systems. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: cacheline adjust struct inet_frag_queue	Jesper Dangaard Brouer
	Fragmentation code cacheline adjusting of struct inet_frag_queue. Take advantage of the size of struct timer_list, and move all but spinlock_t lock, below the timer struct. On 64-bit 'lru_list', 'list' and 'refcnt', fits exactly into the next cacheline, and a new cacheline starts at 'fragments'. The netns_frags *net pointer is moved to the end of the struct, because its used in a compare, with "next/close-by" elements of which this struct is embedded into. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: cacheline adjust struct inet_frags for better frag performance	Jesper Dangaard Brouer
	The globally shared rwlock, of struct inet_frags, shares cacheline with the 'rnd' number, which is used by the hash calculations. Fix this, as this obviously is a bad idea, as unnecessary cache-misses will occur when accessing the 'rnd' number. Also small note that, moving function ptr (*match) up in struct, is to avoid it lands on the next cacheline (on 64-bit). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: cacheline adjust struct netns_frags for better frag performance	Jesper Dangaard Brouer
	This small cacheline adjustment of struct netns_frags improves performance significantly for the fragmentation code. Struct members 'lru_list' and 'mem' are both hot elements, and it hurts performance, due to cacheline bouncing at every call point, when they share a cacheline. Also notice, how mem is placed together with 'high_thresh' and 'low_thresh', as they are used in the compare operations together. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	net: ks8851: convert to threaded IRQ	Felipe Balbi
	just as it should have been. It also helps removing the, now unnecessary, workqueue. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29	arm64: Select ARCH_WANT_FRAME_POINTERS	Catalin Marinas
	The compiler generates framepointers by default anyway but this is to avoid a kbuild warning. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	x86, efi: remove attribute check from setup_efi_pci	Maarten Lankhorst
	It looks like the original commit that copied the rom contents from efi always copied the rom, and the fixup in setup_efi_pci from commit 886d751a2ea99a160 ("x86, efi: correct precedence of operators in setup_efi_pci") broke that. This resulted in macbook pro's no longer finding the rom images, and thus not being able to use the radeon card any more. The solution is to just remove the check for now, and always copy the rom if available. Reported-by: Vitaly Budovski <vbudovski+news@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Seth Forshee <seth.forshee@canonical.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-01-29	arm64: Add kvm_para.h and xor.h generic headers	Catalin Marinas
	Required for make allyesconfig. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	ALSA: hda - Fix non-snoop page handling	Takashi Iwai
	For non-snoop mode, we fiddle with the page attributes of CORB/RIRB and the position buffer, but also the ring buffers. The problem is that the current code blindly assumes that the buffer is contiguous. However, the ring buffers may be SG-buffers, thus a wrong vmapped address is passed there, leading to Oops. This patch fixes the handling for SG-buffers. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=800701 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-01-29	nfc: pn533: Remove unreachable code	Waldemar Rymarkiewicz
	Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-29	nfc: pn533: Use static poll_mod and std_frame_ops	Waldemar Rymarkiewicz
	These variables are not exported. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-29	arm64: SMP: enable PSCI boot method	Marc Zyngier
	Wire the PSCI implementation into the SMP secondary startup code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	arm64: psci: add support for PSCI invocations from the kernel	Will Deacon
	This patch adds support for the Power State Coordination Interface defined by ARM, allowing Linux to request CPU-centric power-management operations from firmware implementing the PSCI protocol. Signed-off-by: Will Deacon <will.deacon@arm.com> [Marc: s/u32/u64/ in the relevant spots, and switch from an initcall to an simpler init function] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	arm64: SMP: rework the SMP code to be enabling method agnostic	Marc Zyngier
	In order to introduce PSCI support, let the SMP code handle multiple enabling methods. This also allow CPUs to be booted using different methods (though this feels a bit weird...). In the process, move the spin-table code to its own file. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	arm64: perf: add guest vs host discrimination	Marc Zyngier
	Add minimal guest support to perf, so it can distinguish whether the PMU interrupt was in the host or the guest, as well as collecting some very basic information (guest PC, user vs kernel mode). This is not feature complete though, as it doesn't support backtracing in the guest. Based on the x86 implementation, tested with KVM/arm64. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	arm64: add COMPAT_PSR_*_BIT flags	Marc Zyngier
	In order to mess with the processor state when running 32bit guests, define all the AArch32 PSR flags. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-29	tracing: Remove second iterator initializer	Jovi Zhang
	The trace iterator is already initialized by trace_init_global_iter(), so there is no need to initialize it again. Link: http://lkml.kernel.org/r/CACV3sb+G1YnO6168JhY3dEadmJi58pA5-2cSZT8E0WVHJNFt9Q@mail.gmail.com Signed-off-by: Jovi Zhang <bookjovi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29	Merge branch 'acpi-lpss' into acpi-cleanup	Rafael J. Wysocki
	The following commits depend on the 'acpi-lpss' material.
2013-01-29	Merge branch 'acpi-scan' into acpi-cleanup	Rafael J. Wysocki
	The following commits depend on the 'acpi-scan' material.
2013-01-29	ACPI / scan: Make scanning of fixed devices follow the general scheme	Rafael J. Wysocki
	Make acpi_bus_scan_fixed() use device_attach() directly to attach drivers, if any, to the fixed devices in analogy with how acpi_bus_scan() works, which allows the last argument of acpi_add_single_object() to be dropped and the manipulation of the flags.match_driver bit to be moved to acpi_init_device_object() and acpi_device_add_finalize(). After these changes all of the functions for the initialization and registration of struct acpi_device objects work in the same way for all of them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-01-29	Merge branch 'acpi-pm' into acpi-cleanup	Rafael J. Wysocki
	The following commits depend on the 'acpi-pm' material.
2013-01-29	cfg80211: add SME state to warning in __cfg80211_mlme_disassoc	Johannes Berg
	The warning here occasionally triggers but we haven't found the cause yet. It's a valid warning since if it triggers the SME state got confused, so add the SME state to it to help narrow it down in the future. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-29	mac80211: remove IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL	Stanislaw Gruszka
	This is basically a revert of: commit 5b632fe85ec82e5c43740b52e74c66df50a37db3 Author: Stanislaw Gruszka <sgruszka@redhat.com> Date: Mon Dec 3 12:56:33 2012 +0100 mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL We do not need this flag any longer, rt2x00 BAR/BA problem was fixed correctly by wireless-testing commit: commit 84e9e8ebd369679a958200a8baca96aafb2393bb Author: Helmut Schaa <helmut.schaa@googlemail.com> Date: Thu Jan 17 17:34:32 2013 +0100 rt2x00: Improve TX status handling for BlockAckReq frames Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-01-29	Merge remote-tracking branch 'wireless-next/master' into HEAD	Johannes Berg

2013-01-29	regulator: tps65090: add DT support	Laxman Dewangan
	Add DT support for TI PMIC tps65090 regulator driver. The DT of this device have node regulator and all regulator's node of this device is added under this node. The device tree binding document has the required information for adding this device on DTS file. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-01-29	GFS2: Use ->writepages for ordered writes	Steven Whitehouse
	Instead of using a list of buffers to write ahead of the journal flush, this now uses a list of inodes and calls ->writepages via filemap_fdatawrite() in order to achieve the same thing. For most use cases this results in a shorter ordered write list, as well as much larger i/os being issued. The ordered write list is sorted by inode number before writing in order to retain the disk block ordering between inodes as per the previous code. The previous ordered write code used to conflict in its assumptions about how to write out the disk blocks with mpage_writepages() so that with this updated version we can also use mpage_writepages() for GFS2's ordered write, writepages implementation. So we will also send larger i/os from writeback too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-01-29	GFS2: Clean up freeze code	Steven Whitehouse
	The freeze code has not been looked at a lot recently. Upstream has moved on, and this is an attempt to catch us back up again. There is a vfs level interface for the freeze code which can be called from our (obsolete, but kept for backward compatibility purposes) sysfs freeze interface. This means freezing this way vs. doing it from the ioctl should now work in identical fashion. As a result of this, the freeze function is only called once and we can drop our own special purpose code for counting the number of freezes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>