linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2011-05-12	xen mmu: fix a race window causing leave_mm BUG()	Tian, Kevin
	There's a race window in xen_drop_mm_ref, where remote cpu may exit dirty bitmap between the check on this cpu and the point where remote cpu handles drop request. So in drop_other_mm_ref we need check whether TLB state is still lazy before calling into leave_mm. This bug is rarely observed in earlier kernel, but exaggerated by the commit 831d52bc153971b70e64eccfbed2b232394f22f8 ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm") which clears bitmap after changing the TLB state. the call trace is as below: --------------------------------- kernel BUG at arch/x86/mm/tlb.c:61! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb CPU 1 Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table] Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285 RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46 RSP: e02b:ffff88002805be48 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0 RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200 R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880 R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000 FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40) Stack: ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88 <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78 <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108 Call Trace: <IRQ> [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46 [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30 <EOI> [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef [<ffffffff81042fcf>] ? need_resched+0x23/0x2d [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255 [<ffffffff81114362>] ? do_execve+0x1c3/0x29e [<ffffffff8101155d>] ? sys_execve+0x43/0x5d [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0 [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e [<ffffffff81013daa>] ? child_rip+0xa/0x20 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6 [<ffffffff81013da0>] ? child_rip+0x0/0x20 Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46 RSP <ffff88002805be48> ---[ end trace ce9cee6832a9c503 ]--- Tested-by: Maoxiaoyun<tinnycloud@hotmail.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> [v1: Fleshed out the git description a bit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	mac80211: mesh: move some code to make it static	Johannes Berg
	There's no need to have table functions in one file and all users in another, move the functions to the right file and make them static. Also move a static variable to the beginning of the file to make it easier to find. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	cfg80211/mac80211: avoid bounce back mac->cfg->mac on sched_scan_stopped	Luciano Coelho
	When sched_scan_stopped was called by the driver, mac80211 calls cfg80211, which in turn was calling mac80211 back with a flag "driver_initiated". This flag was used so that mac80211 would do the necessary cleanup but would not call the driver. This was enough to prevent the bounce back between the driver and mac80211, but not between mac80211 and cfg80211. To fix this, we now do the cleanup in mac80211 before calling cfg80211. To help with locking issues, the workqueue was moved from cfg80211 to mac80211. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Luciano Coelho <coelho@ti.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mac80211: fix a few RCU issues	Johannes Berg
	A few configuration functions correctly do rcu_read_lock() but don't correctly reference some pointers protected by RCU. Fix that. Cc: stable@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mac80211: fix another key non-race	Johannes Berg
	The code here is only not racy because all the places that assign the pointers it uses are holding the sta_mtx as well as the key_mtx and so can't race against this because this code holds the sta_mtx. But that's not intuitive, so fix it to hold the key_mtx. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mac80211: make key locking clearer	Johannes Berg
	The code in ieee80211_del_key() doesn't acquire the key_mtx properly when it dereferences the keys. It turns out that isn't actually necessary since the key_mtx itself seems to be redundant since all key manipulations are done under the RTNL, but as long as we have the key_mtx we should use it the right way too. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mac80211: remove pointless mesh path timer RCU code	Johannes Berg
	The code here to RCU-dereference a pointer that's on the stack is totally pointless, RCU isn't magic (like say Java's weak references are), so the code can't work like whoever wrote it thought it might. Remove it so readers don't get confused. Note that it seems that a bug is there anyway: I don't see any code that cancels the timer when a mesh path struct is destroyed. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	ath9k_hw: Fix STA connection issues with AR9380 (XB113).	Senthil Balasubramanian
	XB113 (AR9380) 3x3 SB 5G only cards were failing to connect to APs due to incorrect xpabiaslevel configuration. fix it. Cc: stable@kernel.org Cc: Ray Li <ray.li@greenwavereality.com> Cc: Kathy Giori <kathy.giori@atheros.com> Cc: Aeolus Yang <aeolus.yang@atheros.com> Cc: compat@orbit-lab.org Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mwifiex: remove mwifiex_recv_complete function	Amitkumar Karwar
	The function - increments dropped rx_packet count if status code passed to it is "-1". - frees SKB buffer. But currently the function is being called with "0" status code. This patch replaces above function by dev_kfree_skb_any() call. Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	bcma: pci: trivial: correct amount of maximum retries	Rafał Miłecki
	Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	ath9k_hw: fix dual band assumption for XB113	Luis R. Rodriguez
	The XB113 cards are single band, 5 GHz-only, but the default settings were configured to assume it was dual band. Users of these cards then would see 2.4 GHz channels but you would never get any scan results from these channels given that the radio is not present. Cc: stable@kernel.org Cc: Fiona Cain <Fiona.Cain@atheros.com> Cc: Ray Li <ray.li@greenwavereality.com> Cc: Kathy Giori <kathy.giori@atheros.com> Cc: Aeolus Yang <aeolus.yang@atheros.com> Cc: Dan Friedman <dan.friedman@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	mac80211: fix contention time computation in minstrel, minstrel_ht	Daniel Halperin
	When transmitting a frame, the transmitter waits a random number of slots between 0 and cw. Thus, the contention time is (cw / 2) * t_slot which we can represent instead as (cw * t_slot) >> 1. Also fix a few other accounting bugs around contention time, and add comments. Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	cfg80211: restrict AP beacon intervals	Johannes Berg
	Multiple virtual AP interfaces can currently try to use different beacon intervals, but that just leads to problems since it won't actually be done that way by drivers. Return an error in this case to make sure it won't be done wrong. Also, ignore attempts to change the DTIM period or beacon interval during the lifetime of the BSS. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	Merge branch 'master' of ↵	John W. Linville
	git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6
2011-05-12	Merge branch 'fbmem'	Linus Torvalds
	* fbmem: fbmem: make read/write/ioctl use the frame buffer at open time fbcon: add lifetime refcount to opened frame buffers
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - remove unused variable from struct ads7845_ser_req Input: ads7846 - make transfer buffers DMA safe
2011-05-12	x86/mm: Fix section mismatch derived from native_pagetable_reserve()	Sedat Dilek
	With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415: LD vmlinux.o MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range() The function native_pagetable_reserve() references the function __init memblock_x86_reserve_range(). This is often because native_pagetable_reserve lacks a __init annotation or the annotation of memblock_x86_reserve_range is wrong. This patch fixes the issue. Thanks to pipacs from PaX project for help on IRC. Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	x86,xen: introduce x86_init.mapping.pagetable_reserve	Stefano Stabellini
	Introduce a new x86_init hook called pagetable_reserve that at the end of init_memory_mapping is used to reserve a range of memory addresses for the kernel pagetable pages we used and free the other ones. On native it just calls memblock_x86_reserve_range while on xen it also takes care of setting the spare memory previously allocated for kernel pagetable pages from RO to RW, so that it can be used for other purposes. A detailed explanation of the reason why this hook is needed follows. As a consequence of the commit: commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e Author: Yinghai Lu <yinghai@kernel.org> Date: Fri Dec 17 16:58:28 2010 -0800 x86-64, mm: Put early page table high at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation. The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c). In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO). For this reason we need a hook to reserve the kernel pagetable pages we used and free the other ones so that they can be reused for other purposes. On native it just means calling memblock_x86_reserve_range, on Xen it also means marking RW the pagetable pages that we allocated before but that haven't been used before. Another way to fix this is without using the hook is by adding a 'if (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen counterpart, but that is just nasty. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""	Konrad Rzeszutek Wilk
	This reverts commit a38647837a411f7df79623128421eef2118b5884. It does not work with certain AMD machines. last_pfn = 0x100000 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 02c3a000 Base memory trampoline at [ffff88000009b000] 9b000 size 20480 init_memory_mapping: 0000000000000000-0000000100000000 0000000000 - 0100000000 page 4k kernel direct mapping tables up to 100000000 @ ff7fb000-100000000 init_memory_mapping: 0000000100000000-00000001e0800000 0100000000 - 01e0800000 page 4k kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000 xen: setting RW the range fffdc000 - 100000000 RAMDISK: 0203b000 - 02c3a000 No NUMA configuration found Faking a node at 0000000000000000-00000001e0800000 NUMA: Using 63 for the hash shift. Initmem setup node 0 0000000000000000-00000001e0800000 NODE_DATA [00000001dfffb000 - 00000001dfffffff] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea PGD 0 Oops: 0003 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1 RIP: e030:[<ffffffff81cf6a75>] [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP: e02b:ffffffff81c01e38 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040 RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000 RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400 FS: 0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020) Stack: 0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000 ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45 Call Trace: [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6f67>] numa_init+0x6c/0x70 [<ffffffff81cf7057>] initmem_init+0x39/0x3b [<ffffffff81ce5865>] setup_arch+0x64e/0x769 [<ffffffff815e43c1>] ? printk+0x51/0x53 [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3 [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136 [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89 RIP [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP <ffffffff81c01e38> CR2: 0000000000000000 ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill the idle task! Pid: 0, comm: swapper Tainted: G D 2.6.39-0-virtual #6~smb1 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	Merge branches 'cma', 'cxgb4' and 'qib' into for-next	Roland Dreier

2011-05-12	IB/qib: Use pci_dev->revision	Sergei Shtylyov
	The driver reads PCI revision ID from the PCI configuration register while it's already stored by PCI subsystem in the revision field of struct pci_dev. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix oops in revalidate when called with NULL nameidata
2011-05-12	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic sparc32: fix sparcstation 5 boot sparc32: fix section mismatch warnings in apc, pmc and time_32
2011-05-12	Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM ARM: zImage: the page table memory must be considered before relocation ARM: zImage: make sure not to relocate on top of the relocation code ARM: zImage: Fix bad SP address after relocating kernel ARM: zImage: make sure the stack is 64-bit aligned ARM: RiscPC: acornfb: fix section mismatches ARM: RiscPC: etherh: fix section mismatches
2011-05-12	fbmem: make read/write/ioctl use the frame buffer at open time	Linus Torvalds
	read/write/ioctl on a fbcon file descriptor has traditionally used the fbcon not when it was opened, but as it was at the time of the call. That makes no sense, but the lack of sense is much more obvious now that we properly ref-count the usage - it means that the ref-counting doesn't actually protect operations we do on the frame buffer. This changes it to look at the fb_info that we got at open time, but in order to avoid using a frame buffer long after it has been unregistered, we do verify that it is still current, and return -ENODEV if not. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	fbcon: add lifetime refcount to opened frame buffers	Linus Torvalds
	This just adds the refcount and the new registration lock logic. It does not (for example) actually change the read/write/ioctl routines to actually use the frame buffer that was opened: those function still end up alway susing whatever the current frame buffer is at the time of the call. Without this, if something holds the frame buffer open over a framebuffer switch, the close() operation after the switch will access a fb_info that has been free'd by the unregistering of the old frame buffer. (The read/write/ioctl operations will normally not cause problems, because they will - illogically - pick up the new fbcon instead. But a switch that happens just as one of those is going on might see problems too, the window is just much smaller: one individual op rather than the whole open-close sequence.) This use-after-free is apparently fairly easily triggered by the Ubuntu 11.04 boot sequence. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	sfc: Always map MCDI shared memory as uncacheable	Ben Hutchings
	We enabled write-combining for memory-mapped registers in commit 65f0b417dee94f779ce9b77102b7d73c93723b39, but inhibited it for the MCDI shared memory where this is not supported. However, write-combining mappings also allow read-reordering, which may also be a problem. I found that when an SFC9000-family controller is connected to an Intel 3000 chipset, and write-combining is enabled, the controller stops responding to PCIe read requests during driver initialisation while the driver is polling for completion of an MCDI command. This results in an NMI and system hang. Adding read memory barriers between all reads to the shared memory area appears to reduce but not eliminate the probability of this. We have not yet established whether this is a bug in our BIU or in the PCIe bridge. For now, work around by mapping the shared memory area separately. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-05-12	mac80211: Fix mesh-related build breakage...	Yogesh Ashok Powar
	net/mac80211/cfg.c: In function ‘sta_apply_parameters’: net/mac80211/cfg.c:746: error: ‘struct sta_info’ has no member named ‘plink_state’ make[1]: * [net/mac80211/cfg.o] Error 1 make: * [net/mac80211/mac80211.ko] Error 2 Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-05-12	x86: Remove warning and warning_symbol from struct stacktrace_ops	Richard Weinberger
	Both warning and warning_symbol are nowhere used. Let's get rid of them. Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Soeren Sandmann Pedersen <ssp@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: x86 <x86@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Robert Richter <robert.richter@amd.com> Cc: Paul Mundt <lethal@linux-sh.org> Link: http://lkml.kernel.org/r/1305205872-10321-2-git-send-email-richard@nod.at Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-05-12	seqlock: Don't smp_rmb in seqlock reader spin loop	Milton Miller
	Move the smp_rmb after cpu_relax loop in read_seqlock and add ACCESS_ONCE to make sure the test and return are consistent. A multi-threaded core in the lab didn't like the update from 2.6.35 to 2.6.36, to the point it would hang during boot when multiple threads were active. Bisection showed af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867 (clockevents: Remove the per cpu tick skew) as the culprit and it is supported with stack traces showing xtime_lock waits including tick_do_update_jiffies64 and/or update_vsyscall. Experimentation showed the combination of cpu_relax and smp_rmb was significantly slowing the progress of other threads sharing the core, and this patch is effective in avoiding the hang. A theory is the rmb is affecting the whole core while the cpu_relax is causing a resource rebalance flush, together they cause an interfernce cadance that is unbroken when the seqlock reader has interrupts disabled. At first I was confused why the refactor in 3c22cd5709e8143444a6d08682a87f4c57902df3 (kernel: optimise seqlock) didn't affect this patch application, but after some study that affected seqcount not seqlock. The new seqcount was not factored back into the seqlock. I defer that the future. While the removal of the timer interrupt offset created contention for the xtime lock while a cpu does the additonal work to update the system clock, the seqlock implementation with the tight rmb spin loop goes back much further, and is just waiting for the right trigger. Cc: <stable@vger.kernel.org> Signed-off-by: Milton Miller <miltonm@bga.com> Cc: <linuxppc-dev@lists.ozlabs.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Anton Blanchard <anton@samba.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Link: http://lkml.kernel.org/r/%3Cseqlock-rmb%40mdm.bga.com%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-12	ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses	Catalin Marinas
	Since mandatory barriers may be used (explicitly or implicitly via readl etc.) to ensure the ordering between Device and Normal memory accesses, a DMB is not enough. This patch converts it to a DSB. Cc: Colin Cross <ccross@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12	ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls	Arnd Bergmann
	GDB's interrupt.exp test cases currenly fail on ARM. The problem is how do_signal handled restarting interrupted system calls: The entry.S assembler code determines that we come from a system call; and that information is passed as "syscall" parameter to do_signal. That routine then calls get_signal_to_deliver [] and if a signal is to be delivered, calls into handle_signal. If a system call is to be restarted either after the signal handler returns, or if no handler is to be called in the first place, the PC is updated after the get_signal_to_deliver call, either in handle_signal (if we have a handler) or at the end of do_signal (otherwise). Now the problem is that during [], the call to get_signal_to_deliver, a ptrace intercept may happen. During this intercept, the debugger may change registers, including the PC. This is done by GDB if it wants to execute an "inferior call", i.e. the execution of some code in the debugged program triggered by GDB. To this purpose, GDB will save all registers, allocate a stack frame, set up PC and arguments as appropriate for the call, and point the link register to a dummy breakpoint instruction. Once the process is restarted, it will execute the call and then trap back to the debugger, at which point GDB will restore all registers and continue original execution. This generally works fine. However, now consider what happens when GDB attempts to do exactly that while the process was interrupted during execution of a to-be- restarted system call: do_signal is called with the syscall flag set; it calls get_signal_to_deliver, at which point the debugger takes over and changes the PC to point to a completely different place. Now get_signal_to_deliver returns without a signal to deliver; but now do_signal decides it should be restarting a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4 bytes before the function GDB wants to call -- which leads to a subsequent crash. To fix this problem, two things need to be supported: - do_signal must be able to recognize that get_signal_to_deliver changed the PC to a different location, and skip the restart-syscall sequence - once the debugger has restored all registers at the end of the inferior call sequence, do_signal must recognize that now it needs to restart the pending system call, even though it was now entered from a breakpoint instead of an actual svc instruction This set of issues is solved on other platforms, usually by one of two mechanisms: - The status information "do_signal is handling a system call that may need restarting" is itself carried in some register that can be accessed via ptrace. This is e.g. on Intel the "orig_eax" register; on Sparc the kernel defines a magic extra bit in the flags register for this purpose. This allows GDB to manage that state: reset it when doing an inferior call, and restore it after the call is finished. - On s390, do_signal transparently handles this problem without requiring GDB interaction, by performing system call restarting in the following way: first, adjust the PC as necessary for restarting the call. Then, call get_signal_to_deliver; and finally just continue execution at the PC. This way, if GDB does not change the PC, everything is as before. If GDB does change the PC, execution will simply continue there -- and once GDB restores the PC it saved at that point, it will automatically point to the restarted system call. (There is the minor twist how to handle system calls that do not need restarting -- do_signal will undo the PC change in this case, after get_signal_to_deliver has returned, and only if ptrace did not change the PC during that call.) Because there does not appear to be any obvious register to carry the syscall-restart information on ARM, we'd either have to introduce a new artificial ptrace register just for that purpose, or else handle the issue transparently like on s390. The patch below implements the second option; using this patch makes the interrupt.exp test cases pass on ARM, with no regression in the GDB test suite otherwise. Cc: patches@linaro.org Signed-off-by: Ulrich Weigand <ulrich.weigand@linaro.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12	ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM	Will Deacon
	The SPARSEMEM code allocates memmap entries only for sections which are present (i.e. those which contain some valid memory). The membank checks in free_unused_memmap do not take this into account and can incorrectly attempt to free memory which is not allocated, resulting in a BUG() in the bootmem code. However, if memory is configured as follows: \|<----section---->\|<----hole---->\|<----section---->\| +--------+--------+--------------+--------+--------+ \| bank 0 \| unused \| \| bank 1 \| unused \| +--------+--------+--------------+--------+--------+ where a bank only occupies part of a section, the memmap allocated for the remainder of the section can be freed. This patch modifies the checks in free_unused_memmap so that only valid memmap entries are considered for removal. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12	Merge branch 'fortglx/39/tip/timers/rtc' of ↵	Thomas Gleixner
	git://git.linaro.org/people/jstultz/linux into timers/urgent
2011-05-12	sched: Remove unused parameters from sched_fork() and wake_up_new_task()	Samir Bellabes
	sched_fork() and wake_up_new_task() are defined with a parameter 'unsigned long clone_flags', which is unused. This patch removes the parameters. Signed-off-by: Samir Bellabes <sam@synack.fr> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1305130685-1047-1-git-send-email-sam@synack.fr Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-12	Merge commit 'v2.6.39-rc7' into sched/core	Ingo Molnar

2011-05-12	Bluetooth: Remove leftover debug messages	Gustavo F. Padovan
	They were added by me while testing and I forgot to remove. Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-05-11	sparc32: Fixed unaligned memory copying in function ↵	Tkhai Kirill
	__csum_partial_copy_sparc_generic When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits, but it's not guaranteed one of them is zero. So we can get unaligned memory access in label ccte. Example of parameters which lead to this: %o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3 With the parameters I had a memory corruption, when the additional 5 bytes were rewritten. This patch corrects the error. One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft stores word or less. Signed-off-by: Tkhai Kirill <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12	ehea: Fix memory hotplug oops	Anton Blanchard
	The ehea driver oopses during memory hotplug if the ports are not up. A simple testcase: # ifconfig ethX down # echo offline > /sys/devices/system/memory/memory32/state Oops: Kernel access of bad area, sig: 11 [#1] last sysfs file: /sys/devices/system/memory/memory32/state REGS: c000000709393110 TRAP: 0300 Not tainted (2.6.39-rc2-01385-g7ef73bc-dirty) DAR: 0000000000000000, DSISR: 40000000 ... NIP [c000000000067c98] .__wake_up_common+0x48/0xf0 LR [c00000000006d034] .__wake_up+0x54/0x90 Call Trace: [c00000000006d034] .__wake_up+0x54/0x90 [d000000006bb6270] .ehea_rereg_mrs+0x140/0x730 [ehea] [d000000006bb69c4] .ehea_mem_notifier+0x164/0x170 [ehea] [c0000000006fc8a8] .notifier_call_chain+0x78/0xf0 [c0000000000b3d70] .__blocking_notifier_call_chain+0x70/0xb0 [c000000000458d78] .memory_notify+0x28/0x40 [c0000000001871d8] .remove_memory+0x208/0x6d0 [c000000000458264] .memory_section_action+0x94/0x140 [c0000000004583ec] .memory_block_change_state+0xdc/0x1d0 [c0000000004585cc] .store_mem_state+0xec/0x160 [c00000000044768c] .sysdev_store+0x3c/0x50 [c00000000020b48c] .sysfs_write_file+0xec/0x1f0 [c00000000018f86c] .vfs_write+0xec/0x1e0 [c00000000018fa88] .SyS_write+0x58/0xd0 To fix this, initialise the waitqueues during port probe instead of port open. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@kernel.org Acked-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-11	NFSv4.1: Ensure that layoutget uses the correct gfp modes	Trond Myklebust
	Currently, writebacks may end up recursing back into the filesystem due to GFP_KERNEL direct reclaims in the pnfs subsystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-11	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: do not use i_wrbuffer_ref as refcount for Fb cap ceph: fix list_add in ceph_put_snap_realm ceph: print debug message before put mds session
2011-05-11	Merge branch 'drm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/nouveau: fix build regression on alpha due to Xen changes. drm/radeon/kms: fix cayman acceleration drm/radeon: fix cayman struct accessors.
2011-05-11	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: Fix for the TWL4030 PM sleep/wakeup sequence mfd: Fix asic3 build error mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset
2011-05-11	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6	Linus Torvalds
	* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] fix alloc_pgste check in init_new_context [S390] oprofile: fix min/max interval query checks [S390] replace diag10() with diag10_range() function [S390] disassembler: handle b280/spp instruction [S390] kernel: Initialize register 14 when starting new CPU [S390] dasd: prevent IO error during reserve/release loop [S390] sclp/memory hotplug: fix initial usecount of increments
2011-05-11	Revert "Bluetooth: fix shutdown on SCO sockets"	Linus Torvalds
	This reverts commit f21ca5fff6e548833fa5ee8867239a8378623150. Quoth Gustavo F. Padovan: "Commit f21ca5fff6e548833fa5ee8867239a8378623150 can cause a NULL dereference if we call shutdown in a bluetooth SCO socket and doesn't wait the shutdown completion to call close(). Please revert it. I may have a fix for it soon, but we don't have time anymore, so revert is the way to go. ;)" Requested-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11	Merge branch 'pm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM PM / Hibernate: Make snapshot_release() restore GFP mask PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl
2011-05-11	mm: tracing: add missing GFP flags to tracing	Mel Gorman
	include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync. When tracing is enabled, certain flags are not recognised and the text output is less useful as a result. Add the missing flags. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11	tmpfs: fix spurious ENOSPC when racing with unswap	Hugh Dickins
	Testing the shmem_swaplist replacements for igrab() revealed another bug: writes to /dev/loop0 on a tmpfs file which fills its filesystem were sometimes failing with "Buffer I/O error"s. These came from ENOSPC failures of shmem_getpage(), when racing with swapoff: the same could happen when racing with another shmem_getpage(), pulling the page in from swap in between our find_lock_page() and our taking the info->lock (though not in the single-threaded loop case). This is unacceptable, and surprising that I've not noticed it before: it dates back many years, but (presumably) was made a lot easier to reproduce in 2.6.36, which sited a page preallocation in the race window. Fix it by rechecking the page cache before settling on an ENOSPC error. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11	tmpfs: fix race between umount and swapoff	Hugh Dickins
	The use of igrab() in swapoff's shmem_unuse_inode() is just as vulnerable to umount as that in shmem_writepage(). Fix this instance by extending the protection of shmem_swaplist_mutex right across shmem_unuse_inode(): while it's on the list, the inode cannot be evicted (and the filesystem cannot be unmounted) without shmem_evict_inode() taking that mutex to remove it from the list. But since shmem_writepage() might take that mutex, we should avoid making memory allocations or memcg charges while holding it: prepare them at the outer level in shmem_unuse(). When mem_cgroup_cache_charge() was originally placed, we didn't know until that point that the page from swap was actually a shmem page; but nowadays it's noted in the swap_map, so we're safe to charge upfront. For the radix_tree, do as is done in shmem_getpage(): preload upfront, but don't pin to the cpu; so we make a habit of refreshing the node pool, but might dip into GFP_NOWAIT reserves on occasion if subsequently preempted. With the allocation and charge moved out from shmem_unuse_inode(), we can also hold index map and info->lock over from finding the entry. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11	tmpfs: fix race between umount and writepage	Hugh Dickins
	Konstanin Khlebnikov reports that a dangerous race between umount and shmem_writepage can be reproduced by this script: for i in {1..300} ; do mkdir $i while true ; do mount -t tmpfs none $i dd if=/dev/zero of=$i/test bs=1M count=$(($RANDOM % 100)) umount $i done & done on a 6xCPU node with 8Gb RAM: kernel very unstable after this accident. =) Kernel log: VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day... WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98() list_del corruption. prev->next should be ffff880222fdaac8, but was (null) Pid: 11222, comm: mount.tmpfs Not tainted 2.6.39-rc2+ #4 Call Trace: warn_slowpath_common+0x80/0x98 warn_slowpath_fmt+0x41/0x43 __list_del_entry+0x8d/0x98 evict+0x50/0x113 iput+0x138/0x141 ... BUG: unable to handle kernel paging request at ffffffffffffffff IP: shmem_free_blocks+0x18/0x4c Pid: 10422, comm: dd Tainted: G W 2.6.39-rc2+ #4 Call Trace: shmem_recalc_inode+0x61/0x66 shmem_writepage+0xba/0x1dc pageout+0x13c/0x24c shrink_page_list+0x28e/0x4be shrink_inactive_list+0x21f/0x382 ... shmem_writepage() calls igrab() on the inode for the page which came from page reclaim, to add it later into shmem_swaplist for swapoff operation. This igrab() can race with super-block deactivating process: shrink_inactive_list() deactivate_super() pageout() tmpfs_fs_type->kill_sb() shmem_writepage() kill_litter_super() generic_shutdown_super() evict_inodes() igrab() atomic_read(&inode->i_count) skip-inode iput() if (!list_empty(&sb->s_inodes)) printk("VFS: Busy inodes after... This igrap-iput pair was added in commit 1b1b32f2c6f6 "tmpfs: fix shmem_swaplist races" based on incorrect assumptions: igrab() protects the inode from concurrent eviction by deletion, but it does nothing to protect it from concurrent unmounting, which goes ahead despite the raised i_count. So this use of igrab() was wrong all along, but the race made much worse in 2.6.37 when commit 63997e98a3be "split invalidate_inodes()" replaced two attempts at invalidate_inodes() by a single evict_inodes(). Konstantin posted a plausible patch, raising sb->s_active too: I'm unsure whether it was correct or not; but burnt once by igrab(), I am sure that we don't want to rely more deeply upon externals here. Fix it by adding the inode to shmem_swaplist earlier, while the page lock on page in page cache still secures the inode against eviction, without artifically raising i_count. It was originally added later because shmem_unuse_inode() is liable to remove an inode from the list while it's unswapped; but we can guard against that by taking spinlock before dropping mutex. Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>