linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2008-06-06	eCryptfs: remove unnecessary page decrypt call	Michael Halcrow
	The page decrypt calls in ecryptfs_write() are both pointless and buggy. Pointless because ecryptfs_get_locked_page() has already brought the page up to date, and buggy because prior mmap writes will just be blown away by the decrypt call. This patch also removes the declaration of a now-nonexistent function ecryptfs_write_zeros(). Thanks to Eric Sandeen and David Kleikamp for helping to track this down. Eric said: fsx w/ mmap dies quickly ( < 100 ops) without this, and survives nicely (to millions of ops+) with it in place. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	update checkpatch.pl to version 0.19	Andy Whitcroft
	This version is a bit of a whopper. This version brings a few new checks, improvements to a number of checks mostly through modifications to the way types are parsed, several fixes to quote/comment handling, as well as the usual slew of fixes for false positives. Of note: - return is not a function and is now reported, - preprocessor directive detection is loosened to match C99 standard, - we now intuit new type modifiers, and - comment handling is much improved Andy Whitcroft (18): Version: 0.19 fix up a couple of missing newlines in reports colon to parenthesis spacing varies on asm values: #include is a preprocessor statement quotes: fix single character quotes at line end add typedef exception for the non-pointer "function types" kerneldoc parameters must be on one line, relax line length types: word boundary is not always required improved #define bracketing reports uninitialized_var is an annotation not a function name possible types: add possible modifier handling possible types: fastcall is a type modifier types: unsigned is not a modifier on all types static/external initialisation to zero should allow modifiers checkpatch: fix recognition of preprocessor directives -- part 2 comments: fix inter-hunk comment tracking return is not a function do not report include/asm/foo.h use in include/linux/foo.h return is not a function -- tighten test [jengelh@computergmbh.de: fix recognition of preprocessor directives] Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	serial: fix driver_name conflicts	Anton Vorontsov
	Some drivers are using too generic "serial" name for driver_name, this might cause issues, like this: Freescale QUICC Engine UART device driver proc_dir_entry 'serial' already registered Call Trace: [cf82de50] [c0007f7c] show_stack+0x4c/0x1ac (unreliable) [cf82de90] [c00b03fc] proc_register+0xfc/0x1ac [cf82dec0] [c00b05c8] create_proc_entry+0x60/0xac [cf82dee0] [c00b23dc] proc_tty_register_driver+0x60/0x98 [cf82def0] [c016dbd8] tty_register_driver+0x1b4/0x228 [cf82df20] [c0184d70] uart_register_driver+0x144/0x194 [cf82df40] [c030a378] ucc_uart_init+0x2c/0x94 [cf82df50] [c02f21a0] kernel_init+0x98/0x27c [cf82dff0] [c000fa74] kernel_thread+0x44/0x60 ^^ The board is using ucc_uart.c and 8250.c, both registered as "serial". This patch fixes two drivers that are using "serial" for driver_name and not "ttyS" for dev_name. Drivers that are using "ttyS" for dev_name, will conflict anyway, so we don't bother with these. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Alan Cox <alan@redhat.com> Acked-By: Timur Tabi <timur@freescale.com> Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	hugetlb: fix lockdep error	Nick Piggin
	============================================= [ INFO: possible recursive locking detected ] 2.6.26-rc4 #30 --------------------------------------------- heap-overflow/2250 is trying to acquire lock: (&mm->page_table_lock){--..}, at: [<c0000000000cf2e8>] .copy_hugetlb_page_range+0x108/0x280 but task is already holding lock: (&mm->page_table_lock){--..}, at: [<c0000000000cf2dc>] .copy_hugetlb_page_range+0xfc/0x280 other info that might help us debug this: 3 locks held by heap-overflow/2250: #0: (&mm->mmap_sem){----}, at: [<c000000000050e44>] .dup_mm+0x134/0x410 #1: (&mm->mmap_sem/1){--..}, at: [<c000000000050e54>] .dup_mm+0x144/0x410 #2: (&mm->page_table_lock){--..}, at: [<c0000000000cf2dc>] .copy_hugetlb_page_range+0xfc/0x280 stack backtrace: Call Trace: [c00000003b2774e0] [c000000000010ce4] .show_stack+0x74/0x1f0 (unreliable) [c00000003b2775a0] [c0000000003f10e0] .dump_stack+0x20/0x34 [c00000003b277620] [c0000000000889bc] .__lock_acquire+0xaac/0x1080 [c00000003b277740] [c000000000089000] .lock_acquire+0x70/0xb0 [c00000003b2777d0] [c0000000003ee15c] ._spin_lock+0x4c/0x80 [c00000003b277870] [c0000000000cf2e8] .copy_hugetlb_page_range+0x108/0x280 [c00000003b277950] [c0000000000bcaa8] .copy_page_range+0x558/0x790 [c00000003b277ac0] [c000000000050fe0] .dup_mm+0x2d0/0x410 [c00000003b277ba0] [c000000000051d24] .copy_process+0xb94/0x1020 [c00000003b277ca0] [c000000000052244] .do_fork+0x94/0x310 [c00000003b277db0] [c000000000011240] .sys_clone+0x60/0x80 [c00000003b277e30] [c0000000000078c4] .ppc_clone+0x8/0xc Fix is the same way that mm/memory.c copy_page_range does the lockdep annotation. Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	modedb: fix incorrect sync and vmode flags for CVT modes	Krzysztof Helt
	The temporary structure for calculated CVT mode is not initialized. Few fields have only bits or-ed or and-ed so they may be left in incorrect (random) state. Testing of the tridentfb seems like a good exercise for the fbdev layer. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	frv: don't offer BINFMT_FLAT	Adrian Bunk
	Fix the following compile error: CC fs/binfmt_flat.o In file included from /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:36: /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/flat.h:14:22: error: asm/flat.h: No such file or directory /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c: In function 'create_flat_tables': /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:124: error: implicit declaration of function 'flat_stack_align' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:125: error: implicit declaration of function 'flat_argvp_envp_on_stack' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c: In function 'calc_reloc': /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:347: error: implicit declaration of function 'flat_reloc_valid' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c: In function 'load_flat_file': /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:479: error: implicit declaration of function 'flat_old_ram_flag' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:755: error: implicit declaration of function 'flat_set_persistent' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:757: error: implicit declaration of function 'flat_get_relocate_addr' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:765: error: implicit declaration of function 'flat_get_addr_from_rp' /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/binfmt_flat.c:781: error: implicit declaration of function 'flat_put_addr_at_rp' Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Tested-by: David Howells <dhowells@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	cpusets: fix and update Documentation	Miao Xie
	Make the doc consistent with current cpusets implementation. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Acked-by: Paul Jackson <pj@sgi.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	md: do not compute parity unless it is on a failed drive	Dan Williams
	If a block is computed (rather than read) then a check/repair operation may be lead to believe that the data on disk is correct, when infact it isn't. So only compute blocks for failed devices. This issue has been around since at least 2.6.12, but has become harder to hit in recent kernels since most reads bypass the cache. echo repair > /sys/block/mdN/md/sync_action will set the parity blocks to the correct state. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	md: fix uninitialized use of mddev->recovery_wait	Dan Williams
	If an array was created with --assume-clean we will oops when trying to set ->resync_max. Fix this by initializing ->recovery_wait in mddev_find. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	md: fix prexor vs sync_request race	Dan Williams
	During the initial array synchronization process there is a window between when a prexor operation is scheduled to a specific stripe and when it completes for a sync_request to be scheduled to the same stripe. When this happens the prexor completes and the stripe is unconditionally marked "insync", effectively canceling the sync_request for the stripe. Prior to 2.6.23 this was not a problem because the prexor operation was done under sh->lock. The effect in older kernels being that the prexor would still erroneously mark the stripe "insync", but sync_request would be held off and re-mark the stripe as "!in_sync". Change the write completion logic to not mark the stripe "in_sync" if a prexor was performed. The effect of the change is to sometimes not set STRIPE_INSYNC. The worst this can do is cause the resync to stall waiting for STRIPE_INSYNC to be set. If this were happening, then STRIPE_SYNCING would be set and handle_issuing_new_read_requests would cause all available blocks to eventually be read, at which point prexor would never be used on that stripe any more and STRIPE_INSYNC would eventually be set. echo repair > /sys/block/mdN/md/sync_action will correct arrays that may have lost this race. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	spi: fix refcount-related spidev oops-on-rmmod	David Brownell
	This addresses other oopsing paths in "spidev" by changing how it manages refcounting. It decouples the lifecycle of the per-device data from the class device (not just the spi device): - Use class_{create,destroy} not class_{register,unregister}. - Use device_{create,destroy} not device_{register,unregister}. - Free the per-device data only when TWO conditions are true: * Driver is unbound from underlying SPI device, and * Device is no longer open (new) Also, spi_{get,set}_drvdata not dev_{get,set}_drvdata for simpler code. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Sebastian Siewior <bigeasy@tglx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-06	KVM: migrate PIT timer	Marcelo Tosatti
	Migrate the PIT timer to the physical CPU which vcpu0 is scheduled on, similarly to what is done for the LAPIC timers, otherwise PIT interrupts will be delayed until an unrelated event causes an exit. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	IB/ipath: Fix SM trap forwarding	Ralph Campbell
	SM/SMA traps received by the ipath driver should be forwarded to the SM if it is running on the host. The ib_ipath driver was incorrectly replying with "bad method." Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-06-06	KVM: ppc: Report bad GFNs	Hollis Blanchard
	This code shouldn't be hit anyways, but when it is, it's useful to have a little more information about the failure. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: ppc: Use a read lock around MMU operations, and release it on error	Hollis Blanchard
	gfn_to_page() and kvm_release_page_clean() are called from other contexts with mmap_sem locked only for reading. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: ppc: Remove unmatched kunmap() call	Hollis Blanchard
	We're not calling kmap() now, so we shouldn't call kunmap() either. This has no practical effect in the non-highmem case, which is why it hasn't caused more obvious problems. Pointed out by Anthony Liguori. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: ppc: add lwzx/stwz emulation	Hollis Blanchard
	Somehow these load/store instructions got missed before, but weren't used by the guest so didn't break anything. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: ppc: Remove duplicate function	Hollis Blanchard
	This was left behind from some code movement. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	IB/ehca: Reject send WRs only for RESET, INIT and RTR state	Joachim Fenkes
	Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-06-06	KVM: s390: Fix race condition in kvm_s390_handle_wait	Carsten Otte
	The call to add_timer was issued before local_int.lock was taken and before timer_due was set to 0. If the timer expires before the lock is being taken, the timer function will set timer_due to 1 and exit before the vcpu falls asleep. Depending on other external events, the vcpu might sleep forever. This fix pulls setting timer_due to the beginning of the function before add_timer, which ensures correct behavior. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: s390: Send program check on access error	Carsten Otte
	If the guest accesses non-existing memory, the sie64a function returns -EFAULT. We must check the return value and send a program check to the guest if the sie instruction faulted, otherwise the guest will loop at the faulting code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: s390: fix interrupt delivery	Carsten Otte
	The current code delivers pending interrupts before it checks for need_resched. On a busy host, this can lead to a longer interrupt latency if the interrupt is injected while the process is scheduled away. This patch moves delivering the interrupt _after_ schedule(), which makes more sense. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: s390: handle machine checks when guest is running	Christian Borntraeger
	The low-level interrupt handler on s390 checks for _TIF_WORK_INT and exits the guest context, if work is pending. TIF_WORK_INT is defined as_TIF_SIGPENDING \| _TIF_NEED_RESCHED \| _TIF_MCCK_PENDING. Currently the sie loop checks for signals and reschedule, but it does not check for machine checks. That means that we exit the guest context if a machine check is pending, but we do not handle the machine check. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: s390: fix locking order problem in enable_sie	Christian Borntraeger
	There are potential locking problem in enable_sie. We take the task_lock and the mmap_sem. As exit_mm uses the same locks vice versa, this triggers a lockdep warning. The second problem is that dup_mm and mmput might sleep, so we must not hold the task_lock at that moment. The solution is to dup the mm unconditional and use the task_lock before and afterwards to check if we can use the new mm. dup_mm and mmput are called outside the task_lock, but we run update_mm while holding the task_lock, protection us against ptrace. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: s390: use yield instead of schedule to implement diag 0x44	Christian Borntraeger
	diag 0x44 is the common way on s390 to yield the cpu to the hypervisor. It is called by the guest in cpu_relax and in the spinlock code to yield to other guest cpus. This semantic is similar to yield. Lets replace the call to schedule with yield to make sure that current is really yielding. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: x86 emulator: fix hypercall return value on AMD	Avi Kivity
	The hypercall instructions on Intel and AMD are different. KVM allows the guest to choose one or the other (the default is Intel), and if the guest chooses incorrectly, KVM will patch it at runtime to select the correct instruction. This allows live migration between Intel and AMD machines. This patching occurs in the x86 emulator. The current code also executes the hypercall. Unfortunately, the tail end of the x86 emulator code also executes, overwriting the return value of the hypercall with the original contents of rax (which happens to be the hypercall number). Fix not by executing the hypercall in the emulator context; instead let the guest reissue the patched instruction and execute the hypercall via the normal path. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	KVM: ia64: fix zero extending for mmio ld1/2/4 emulation in KVM	Jes Sorensen
	Only copy in the data actually requested by the instruction emulation and zero pad the destination register first. This avoids the problem where emulated mmio access got garbled data from ld2.acq instructions in the vga console driver. Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06	sched: move weighted_cpuload into #ifdef CONFIG_SMP section	Thomas Gleixner
	weighted_cpuload is only used on SMP. move it into the CONFIG_SMP section. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: Move cpu masks from kernel/sched.c into kernel/cpu.c	Max Krasnyansky
	kernel/cpu.c seems a more logical place for those maps since they do not really have much to do with the scheduler these days. kernel/cpu.c is now built for the UP kernel too, but it does not affect the size the kernel sections. $ size vmlinux before text data bss dec hex filename 3313797 307060 310352 3931209 3bfc49 vmlinux after text data bss dec hex filename 3313797 307060 310352 3931209 3bfc49 vmlinux Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: pj@sgi.com Cc: menage@google.com Cc: rostedt@goodmis.org Cc: mingo@elte.hu Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: CPU hotplug events must not destroy scheduler domains created by the ↵	Max Krasnyansky
	cpusets First issue is not related to the cpusets. We're simply leaking doms_cur. It's allocated in arch_init_sched_domains() which is called for every hotplug event. So we just keep reallocation doms_cur without freeing it. I introduced free_sched_domains() function that cleans things up. Second issue is that sched domains created by the cpusets are completely destroyed by the CPU hotplug events. For all CPU hotplug events scheduler attaches all CPUs to the NULL domain and then puts them all into the single domain thereby destroying domains created by the cpusets (partition_sched_domains). The solution is simple, when cpusets are enabled scheduler should not create default domain and instead let cpusets do that. Which is exactly what the patch does. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: pj@sgi.com Cc: menage@google.com Cc: rostedt@goodmis.org Cc: mingo@elte.hu Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: fix cpuprio build bug	Ingo Molnar
	this patch was not built on !SMP: kernel/sched_rt.c: In function 'inc_rt_tasks': kernel/sched_rt.c:404: error: 'struct rq' has no member named 'online' Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-06	sched: fix the cpuprio count really	Thomas Gleixner
	Peter pointed out that the last version of the "fix" was still one off under certain circumstances. Use BITS_TO_LONG instead to get an accurate result. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: fix cpupri priocount	Gregory Haskins
	A rounding error was pointed out by Peter Zijlstra which would result in the structure holding priorities to be off by one. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: fix cpupri hotplug support	Gregory Haskins
	The RT folks over at RedHat found an issue w.r.t. hotplug support which was traced to problems with the cpupri infrastructure in the scheduler: https://bugzilla.redhat.com/show_bug.cgi?id=449676 This bug affects 23-rt12+, 24-rtX, 25-rtX, and sched-devel. This patch applies to 25.4-rt4, though it should trivially apply to most cpupri enabled kernels mentioned above. It turned out that the issue was that offline cpus could get inadvertently registered with cpupri so that they were erroneously selected during migration decisions. The end result would be an OOPS as the offline cpu had tasks routed to it. This patch generalizes the old join/leave domain interface into an online/offline interface, and adjusts the root-domain/hotplug code to utilize it. I was able to easily reproduce the issue prior to this patch, and am no longer able to reproduce it after this patch. I can offline cpus indefinately and everything seems to be in working order. Thanks to Arnaldo (acme), Thomas, and Peter for doing the legwork to point me in the right direction. Also thank you to Peter for reviewing the early iterations of this patch. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: print the sd->level in sched_domain_debug code	Gautham R Shenoy
	While printing out the visual representation of the sched-domains, print the level (MC, SMT, CPU, NODE, ... ) of each of the sched_domains. Credit: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-06	sched: update the sched-domains debug documentation	Gautham R Shenoy
	SCHED_DOMAIN_DEBUG mentioned in the Documentation for sched-domains for enabling sched-domains debugging doesn't exist anymore. Update the documentation to reflect the correct way of enabling sched-domain debugging. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-06	sched: add comments for ifdefs in sched.c	Dhaval Giani
	make sched.c easier to read. Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: print module list in the "scheduling while atomic" warning	Arjan van de Ven
	For the normal WARN_ON() etc we added a print-the-modules-list already, which is very useful to figure out candidates for certain types of bugs. This patch adds the same print to the "scheduling while atomic" BUG warning, for the same reason: when we get here it's very useful to see which modules are loaded, to narrow down the candidate code list. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: reorder task_struct to reduce padding on 64bit builds	Richard Kennedy
	This patch removes 24 bytes of padding and allows 1 extra object per slab on my fedora based config. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: fix defined-but-unused warning	Rabin Vincent
	Fix this warning, which appears with !CONFIG_SMP: kernel/sched.c:1216: warning: `init_hrtick' defined but not used Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	namespacecheck: more sched.c fixes	Ingo Molnar
	[ Stephen Rothwell <sfr@canb.auug.org.au>: build fix ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	namespacecheck: fixes in kernel/sched.c	Thomas Gleixner
	Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: check for SD_SERIALIZE atomically in rebalance_domains()	Dmitry Adamushko
	Nothing really serious here, mainly just a matter of nit-picking :-/ From: Dmitry Adamushko <dmitry.adamushko@gmail.com> For CONFIG_SCHED_DEBUG && CONFIG_SYSCT configs, sd->flags can be altered while being manipulated in rebalance_domains(). Let's do an atomic check. We rely here on the atomicity of read/write accesses for aligned words. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: fix SCHED_OTHER balance iterator to include all tasks	Gregory Haskins
	The currently logic inadvertently skips the last task on the run-queue, resulting in missed balance opportunities. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: David Bahi <dbahi@novell.com> CC: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: use a 2-d bitmap for searching lowest-pri CPU	Gregory Haskins
	The current code use a linear algorithm which causes scaling issues on larger SMP machines. This patch replaces that algorithm with a 2-dimensional bitmap to reduce latencies in the wake-up path. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: make !hrtick faster	Mike Galbraith
	it is safe to ignore timers and flags when the feature is disabled. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	sched: prioritize non-migratable tasks over migratable ones	Gregory Haskins
	Dmitry Adamushko pointed out a known flaw in the rt-balancing algorithm that could allow suboptimal balancing if a non-migratable task gets queued behind a running migratable one. It is discussed in this thread: http://lkml.org/lkml/2008/4/22/296 This issue has been further exacerbated by a recent checkin to sched-devel (git-id 5eee63a5ebc19a870ac40055c0be49457f3a89a3). >From a pure priority standpoint, the run-queue is doing the "right" thing. Using Dmitry's nomenclature, if T0 is on cpu1 first, and T1 wakes up at equal or lower priority (affined only to cpu1) later, it should wait for T0 to finish. However, in reality that is likely suboptimal from a system perspective if there are other cores that could allow T0 and T1 to run concurrently. Since T1 can not migrate, the only choice for higher concurrency is to try to move T0. This is not something we addessed in the recent rt-balancing re-work. This patch tries to enhance the balancing algorithm by accomodating this scenario. It accomplishes this by incorporating the migratability of a task into its priority calculation. Within a numerical tsk->prio, a non-migratable task is logically higher than a migratable one. We maintain this by introducing a new per-priority queue (xqueue, or exclusive-queue) for holding non-migratable tasks. The scheduler will draw from the xqueue over the standard shared-queue (squeue) when available. There are several details for utilizing this properly. 1) During task-wake-up, we not only need to check if the priority preempts the current task, but we also need to check for this non-migratable condition. Therefore, if a non-migratable task wakes up and sees an equal priority migratable task already running, it will attempt to preempt it if there is a likelyhood that the current task will find an immediate home. 2) Tasks only get this non-migratable "priority boost" on wake-up. Any requeuing will result in the non-migratable task being queued to the end of the shared queue. This is an attempt to prevent the system from being completely unfair to migratable tasks during things like SCHED_RR timeslicing. I am sure this patch introduces potentially "odd" behavior if you concoct a scenario where a bunch of non-migratable threads could starve migratable ones given the right pattern. I am not yet convinced that this is a problem since we are talking about tasks of equal RT priority anyway, and there never is much in the way of guarantees against starvation under that scenario anyway. (e.g. you could come up with a similar scenario with a specific timing environment verses an affinity environment). I can be convinced otherwise, but for now I think this is "ok". Signed-off-by: Gregory Haskins <ghaskins@novell.com> CC: Dmitry Adamushko <dmitry.adamushko@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-06	[MTD] m25p80.c mutex unlock fix	Chen Gong
	fix a mutex release bug in function m25p80_write. Signed-off-by: Chen Gong <g.chen@freescale.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2008-06-06	sound: emu10k1 - fix system hang with Audigy2 ZS Notebook PCMCIA card	Jaroslav Franek
	When the Linux kernel is compiled with CONFIG_DEBUG_SHIRQ=y, the Soundblaster Audigy2 ZS Notebook PCMCIA card causes the system hang during boot (udev stage) or when the card is hot-plug. The CONFIG_DEBUG_SHIRQ flag is by default 'y' with all Fedora kernels since 2.6.23. The problem was reported as https://bugzilla.redhat.com/show_bug.cgi?id=326411 The issue was hunted down to the snd_emu10k1_create() routine: /* pseudo-code */ snd_emu10k1_create(...) { ... request_irq(... IRQF_SHARED ...) { register the irq handler #ifdef CONFIG_DEBUG_SHIRQ call the irq handler: snd_emu10k1_interrupt() { poll I/O port // <---- !! system hangs ... } #endif } ... snd_emu10k1_cardbus_init(...) { initialize I/O ports } ... } The early access to I/O port in the interrupt handler causes the freeze. Obviously it is necessary to init the I/O ports before accessing them. This patch moves the registration of the irq handler after the initialization of the I/O ports. Signed-off-by: Jaroslav Franek <jarin.franek@post.cz> Acked-by: James Courtier-Dutton <James@superbug.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-06-06	Input: i8042 - add Fujitsu-Siemens Amilo Pro V2030 to nomux table	Jiri Kosina
	Fujitsu Siemens Amilo Pro V2030 needs nomux table entry, in addition to already existing entry for V2010 model (note that Fujitsu-Siemens changed the capitalization in the DMI data for product). Tested-by: Jiri Mleziva <jmleziva@tiscali.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>