linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2009-02-11	SGI IA64 UV: fix ia64 build error in the linux-next tree	Dean Nelson
	Fix the ia64 build error that occurs in the linux-next tree by introducing an ia64 version of uv.h. Additionally, clean up the usage of is_uv_system(). Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	x86: drop -fno-stack-protector annotations after pt_regs fixes	Brian Gerst
	Now that no functions rely on struct pt_regs being passed by value, various "no stack protector" annotations can be dropped. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	x86: pass in pt_regs pointer for syscalls that need it	Brian Gerst
	Some syscalls need to access the pt_regs structure, either to copy user register state or to modifiy it. This patch adds stubs to load the address of the pt_regs struct into the %eax register, and changes the syscalls to regparm(1) to receive the pt_regs pointer as the first argument. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	x86: use pt_regs pointer in do_device_not_available()	Brian Gerst
	The generic exception handler (error_code) passes in the pt_regs pointer and the error code (unused in this case). The commit "x86: fix math_emu register frame access" changed this to pass by value, which doesn't work correctly with stack protector enabled. Change it back to use the pt_regs pointer. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	stackprotector: fix multi-word cross-builds	Ingo Molnar
	Stackprotector builds were failing if CROSS_COMPILER was more than a single world (such as when distcc was used) - because the check scripts used $1 instead of $*. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	Merge commit 'v2.6.29-rc4' into x86/cleanups	Ingo Molnar

2009-02-11	x86: fix x86_32 stack protector bugs	Tejun Heo
	Impact: fix x86_32 stack protector Brian Gerst found out that %gs was being initialized to stack_canary instead of stack_canary - 20, which basically gave the same canary value for all threads. Fixing this also exposed the following bugs. * cpu_idle() didn't call boot_init_stack_canary() * stack canary switching in switch_to() was being done too late making the initial run of a new thread use the old stack canary value. Fix all of them and while at it update comment in cpu_idle() about calling boot_init_stack_canary(). Reported-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	x86, apic: make generic_apic_probe() generally available	Ingo Molnar
	Impact: build fix Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	Merge branch 'x86/apic' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/apic
2009-02-11	x86, apic: fix initialization of wakeup_cpu	Alok Kataria
	With refactoring of wake_cpu macros the 32bit code in tip doesn't execute generic_apic_probe if CONFIG_X86_32_NON_STANDARD is not set. Even on a x86 STANDARD cpu we need to execute the generic_apic_probe function, as we rely on this function to execute the update_genapic quirk which initilizes apic->wakeup_cpu. Failing to do so results in we making a call to a null function in do_boot_cpu. The stack trace without the patch goes like this. Booting processor 1 APIC 0x1 ip 0x6000 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<(null)>] (null) pdpt = 0000000000839001 pde = 0000000000c97067 *pte = 0000000000000163 Oops: 0000 [#1] SMP last sysfs file: Modules linked in: Pid: 1, comm: swapper Not tainted (2.6.29-rc4-tip #18) VMware Virtual Platform EIP: 0062:[<00000000>] EFLAGS: 00010293 CPU: 0 EIP is at 0x0 EAX: 00000001 EBX: 00006000 ECX: c077ed00 EDX: 00006000 ESI: 00000001 EDI: 00000001 EBP: ef04cf40 ESP: ef04cf1c DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 006a Process swapper (pid: 1, ti=ef04c000 task=ef050000 task.ti=ef04c000) Stack: c0644e52 00000000 ef04cf24 ef04cf24 c064468d c0886dc0 00000000 c0702aea ef055480 00000001 00000101 dead4ead ffffffff ffffffff c08af530 00000000 c0709715 ef04cf60 ef04cf60 00000001 00000000 00000000 dead4ead ffffffff Call Trace: [<c0644e52>] ? native_cpu_up+0x2de/0x45b [<c064468d>] ? do_fork_idle+0x0/0x19 [<c0645c5e>] ? _cpu_up+0x88/0xe8 [<c0645d20>] ? cpu_up+0x42/0x4e [<c07e7462>] ? kernel_init+0x99/0x14b [<c07e73c9>] ? kernel_init+0x0/0x14b [<c040375f>] ? kernel_thread_helper+0x7/0x10 Code: Bad EIP value. EIP: [<00000000>] 0x0 SS:ESP 006a:ef04cf1c I think we should call generic_apic_probe unconditionally for 32 bit now. Signed-off-by: Alok N Kataria <akataria@vmware.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	[S390] Update default configuration.	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-11	[S390] dasd: fix race in dasd timer handling	Stefan Weinhuber
	In dasd_device_set_timer and dasd_block_set_timer we interpret the return value of mod_timer in a wrong way. If the timer expires in the small window between our check of timer_pending and the call to mod_timer, then the timer will be set, mod_timer returns zero and we will call add_timer for a timer that is already pending. As del_timer and mod_timer do all the necessary checking themselves, we can simplify our code and remove the race a the same time. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-11	[S390] dasd: bus_id -> dev_name() conversion.	Cornelia Huck
	bus_id usage crept in again; fix it. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-02-11	[S390] Fix init irq proc build break.	Sachin Sant
	Embed init_irq_proc(s390) within CONFIG_PROC_FS to fix a build break. Signed-off-by : Sachin Sant <sachinp@in.ibm.com>
2009-02-11	[S390] vdso: fix per cpu vdso pointer in lowcore	Martin Schwidefsky
	The vdso_per_cpu_data entry in the lowcore structure uses __u32 instead of __u64. If the data page is above 4GB the pointer is truncated and the kernel crashes. Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-11	ptrace, x86: fix the usage of ptrace_fork()	Oleg Nesterov
	I noticed by pure accident we have ptrace_fork() and friends. This was added by "x86, bts: add fork and exit handling", commit bf53de907dfdaac178c92d774aae7370d7b97d20. I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly believe this needs the fix. I think something like this program int main(void) { int pid = fork(); if (!pid) { ptrace(PTRACE_TRACEME, 0, NULL, NULL); kill(getpid(), SIGSTOP); fork(); } else { struct ptrace_bts_config bts = { .flags = PTRACE_BTS_O_ALLOC, .size = 4 * 4096, }; wait(NULL); ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK); ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts)); ptrace(PTRACE_CONT, pid, NULL, NULL); sleep(1); } return 0; } should crash the kernel. If the task is traced by its natural parent ptrace_reparented() returns 0 but we should clear ->btsxxx anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	tracing, x86: fix constraint for parent variable	Steven Rostedt
	The constraint used for retrieving and restoring the parent function pointer is incorrect. The parent variable is a pointer, and the address of the pointer is modified by the asm statement and not the pointer itself. It is incorrect to pass it in as an output constraint since the asm will never update the pointer. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11	sparc64: Fix crashes in jbusmc_print_dimm()	David S. Miller
	Return was missing for the case where there is no dimm info match. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-11	Merge branch 'tip/tracing/urgent' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
2009-02-11	ALSA: hda - add id for Intel IbexPeak integrated HDMI codec	Wu Fengguang
	Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-11	ALSA: hda - compute checksum in HDMI audio infoframe	Wu Fengguang
	Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-11	ALSA: hda - enable HDMI audio pin out at module loading time	Wu Fengguang
	We found that enabling/disabling HDMI audio pin out at stream start/stop time will kill the leading 500ms or so sound samples. Avoid this by enabling pin out once and for ever at module loading time. The leading ~500ms audio samples will still be lost when switching from X-channel playback to Y-channel playback where X != Y. However there's no much we can do about it: the audio infoframe has to change and it looks like either G45 or YAMAHA requires some time to switch the configuration. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-11	ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present	Wu Fengguang
	The YAMAHA AV-X1800 requires audio infoframe to include speaker-channel mapping to play >2 channel HDMI audio. In theory that mapping should be derived from its speaker configurations contained in its ELD. However we currently cannot get ELD in console before the KMS functionalities are ready. This is a more or less general issue at least in the near future. As a workaround, we propose to allow playback of mult-channel audio when ELD is not available. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-11	powerpc/mm: Fix _PAGE_COHERENT support on classic ppc32 HW	Kumar Gala
	The following commit: commit 64b3d0e8122b422e879b23d42f9e0e8efbbf9744 Author: Benjamin Herrenschmidt <benh@kernel.crashing.org> Date: Thu Dec 18 19:13:51 2008 +0000 powerpc/mm: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED broke setting of the _PAGE_COHERENT bit in the PPC HW PTE. Since we now actually set _PAGE_COHERENT in the Linux PTE we shouldn't be clearing it out before we propogate it to the PPC HW PTE. Reported-by: Martyn Welch <martyn.welch@gefanuc.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10	sunhme: Fix Quattro HME irq registration on proble failures	Meelis Roos
	Currently, the sunhme driver installs SBus Quattro interrupt handler when at least one HME card was initialized correctly and at least one Quattro card is present. This breaks when a Quattro card fails initialization for whatever reason - IRQ is registered and OOPS happens when it fires. The solution, as suggested by David Miller, was to keep track which cards of the Quattro bundles have been initialized, and request/free the Quattro IRQ only when all four devices have been successfully initialized. The patch only touches SBus initialization - PCI init already resets the card pointer to NULL on init failure. The patch has been tested on Sun E3500 with SBus and PCI single HME cards and one PCI Quattro HME card in a situation where any PCI card failed init when the SBus routines tried to init them by mistake. Additionally it replaces Quattro request_irq panic with error return - if this card fails to work, at least let the others work. Tested on E450 with PCI HME and PCI Quad HME. [ Minor coding style fixups -DaveM ] Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	fore200: fix oops on failed firmware load	Meelis Roos
	Fore 200 ATM driver fails to handle request_firmware failures and oopses when no firmware file was found. Fix it by checking for the right return values and propaganting the return value up. Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	pkt_sched: type should be __u32 in header	Chuck Ebbert
	Using u32 in this header breaks the build of iptables. Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	Phonet: do not compute unused value	Rémi Denis-Courmont
	Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	Phonet: fix double free in GPRS outbound packet error path	Rémi Denis-Courmont
	Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	mdio-gpio: Add mdc pin direction initialization	Paulius Zaleckas
	mdc pin should always be output. Initialize it as output, so each board code does not need to do this. Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] AACI: timeout will reach -1 [ARM] Storage class should be before const qualifier [ARM] pxa: stop and disable IRQ for each DMA channels at startup [ARM] pxa: make more SSCR0 bit definitions visible on multiple processors [ARM] pxa: fix missing of __REG() definition for ac97 registers access [ARM] pxa: fix NAND and MMC clock initialization for pxa3xx
2009-02-10	hugetlbfs: fix build failure with !CONFIG_HUGETLBFS	Stefan Richter
	Fix regression due to 5a6fe125950676015f5108fb71b2a67441755003, "Do not account for the address space used by hugetlbfs using VM_ACCOUNT" which added an argument to the function hugetlb_file_setup() but not to the macro hugetlb_file_setup(). Reported-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10	ASoC: Update SDP3430 machine driver for snd_soc_card	Lopez Cruz, Misael
	This patch replaces "snd_soc_machine" structure by "snd_soc_card" in SP3430 driver. This change is needed in SDP3430 driver to reflect changes introduced by "ASoC: Rename snd_soc_card to snd_soc_machine" patch (875065491fba8eb13219f16c36e79a6fb4e15c68). Signed-off-by: Misael Lopez Cruz <x0052729@ti.com> Acked-by: Jarkko Nikula <jarkko.nikula@nokia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2009-02-10	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Add missing sparsemem.h include powerpc/pci: mmap anonymous memory when legacy_mem doesn't exist powerpc/cell: Add missing #include for oprofile powerpc/ftrace: Fix math to calculate offset in TOC powerpc: Don't emulate mr. instructions powerpc/fsl-booke: Fix mapping functions to use phys_addr_t arch/powerpc: Eliminate double sizeof powerpc/cpm2: Fix set interrupt type powerpc/83xx: Fix TSEC0 workability on MPC8313E-RDB boards powerpc/83xx: Fix missing #{address,size}-cells in mpc8313erdb.dts powerpc/83xx: Build breakage for CONFIG_PM but no CONFIG_SUSPEND
2009-02-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix probe_kernel_{read,write}(). sparc64: Kill .fixup section bloat. sparc64: Don't hook up pcr_ops on spitfire chips. sparc64: Call dump_stack() in die_nmi().
2009-02-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits) bridge: Fix LRO crash with tun IPv6: fix to set device name when new IPv6 over IPv6 tunnel device is created. gianfar: Fix boot hangs while bringing up gianfar ethernet netfilter: xt_sctp: sctp chunk mapping doesn't work netfilter: ctnetlink: fix echo if not subscribed to any multicast group netfilter: ctnetlink: allow changing NAT sequence adjustment in creation netfilter: nf_conntrack_ipv6: don't track ICMPv6 negotiation message netfilter: fix tuple inversion for Node information request netxen: fix msi-x interrupt handling de2104x: force correct order when writing to rx ring tun: Fix unicast filter overflow drivers/isdn: introduce missing kfree drivers/atm: introduce missing kfree sunhme: Don't match PCI devices in SBUS probe. 9p: fix endian issues [attempt 3] net_dma: call dmaengine_get only if NET_DMA enabled 3c509: Fix resume from hibernation for PnP mode. sungem: Soft lockup in sungem on Netra AC200 when switching interface up RxRPC: Fix a potential NULL dereference r8169: Don't update statistics counters when interface is down ...
2009-02-10	Do not account for the address space used by hugetlbfs using VM_ACCOUNT	Mel Gorman
	When overcommit is disabled, the core VM accounts for pages used by anonymous shared, private mappings and special mappings. It keeps track of VMAs that should be accounted for with VM_ACCOUNT and VMAs that never had a reserve with VM_NORESERVE. Overcommit for hugetlbfs is much riskier than overcommit for base pages due to contiguity requirements. It avoids overcommiting on both shared and private mappings using reservation counters that are checked and updated during mmap(). This ensures (within limits) that hugepages exist in the future when faults occurs or it is too easy to applications to be SIGKILLed. As hugetlbfs makes its own reservations of a different unit to the base page size, VM_ACCOUNT should never be set. Even if the units were correct, we would double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may be set because an application can request no reserves be made for hugetlbfs at the risk of getting killed later. With commit fc8744adc870a8d4366908221508bb113d8b72ee, VM_NORESERVE and VM_ACCOUNT are getting unconditionally set for hugetlbfs-backed mappings. This breaks the accounting for both the core VM and hugetlbfs, can trigger an OOM storm when hugepage pools are too small lockups and corrupted counters otherwise are used. This patch brings hugetlbfs more in line with how the core VM treats VM_NORESERVE but prevents VM_ACCOUNT being set. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10	tracing, x86: fix fixup section to return to original code	Steven Rostedt
	Impact: fix to prevent a kernel crash on fault If for some reason the pointer to the parent function on the stack takes a fault, the fix up code will not return back to the original faulting code. This can lead to unpredictable results and perhaps even a kernel panic. A fault should not happen, but if it does, we should simply disable the tracer, warn, and continue running the kernel. It should not lead to a kernel crash. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10	[SCSI] qla2xxx: Update version number to 8.03.00-k3.	Andrew Vasquez
	Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] qla2xxx: Mask out 'reserved' bits while processing FLT regions.	Andrew Vasquez
	Bits 31-8 are marked as reserved and should be ignored while interpreting a region's code. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] qla2xxx: Correct slab-error overwrite during vport creation and deletion.	Anirban Chakraborty
	The clearing of a vha's req_ques were overrunning during vport creation. During deletion, vport queues should be torn-down after all cleanup has occurred. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] qla2xxx: Properly acknowledge IDC notification messages.	Andrew Vasquez
	To ensure smooth operations amongst the FCoE and NIC side components of the ISP81xx chip, the FCoE driver (qla2xxx) must ensure the 10gb NIC driver (qlge) does not timeout waiting for IDC (Inter-Driver Communication) acknowledgments. The acknowledgment requirements are trivial -- a simple mirroring of incoming mailbox registers during the AEN to a process-context capable mailbox command. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] qla2xxx: Remove interrupt request bit check in the response ↵	Anirban Chakraborty
	processing path in multiq mode. Correct response-queue-0 processing by instructing the firmware to run with interrupt-handshaking disabled, similarly to what is now done for all non-0 response queues. Since all response-queues now run in the same mode, the driver no longer needs the hot-path 'is-disabled-HCCR' test. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] lpfc: introduce missing kfree	Julia Lawall
	Error handling code following a kmalloc should free the allocated data. The semantic match that finds the problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; statement S; expression E; identifier f,l; position p1,p2; expression ptr != NULL; @@ ( if ((x@p1 = $kmalloc\\|kzalloc\\|kcalloc$(...)) == NULL) S \| x@p1 = $kmalloc\\|kzalloc\\|kcalloc$(...); ... if (x == NULL) S ) <... when != x when != if (...) { <+...x...+> } x->f = E ...> ( return $0\\|<+...x...+>\\|ptr$; \| return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print " file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] libiscsi: Fix scsi command timeout oops in iscsi_eh_timed_out	Mike Christie
	Yanling Qi from LSI found the root cause of the panic, below is his analysis: Problem description: the open iscsi driver installs eh_timed_out handler to the blank_transport_template of the scsi middle level that causes panic of timed out command of other host Here are the details Iscsi Session creation During iscsi session creation time, the iscsi_tcp_session_create() of iscsi_tpc.c will create a scsi-host for the session. See the statement marked with the label A. The statement B replaces the shost->transportt point with a local struct variable. static struct iscsi_cls_session * iscsi_tcp_session_create(struct iscsi_endpoint ep, uint16_t cmds_max, uint16_t qdepth, uint32_t initial_cmdsn, uint32_t hostno) { struct iscsi_cls_session cls_session; struct iscsi_session session; struct Scsi_Host shost; int cmd_i; if (ep) { printk(KERN_ERR "iscsi_tcp: invalid ep %p.\n", ep); return NULL; } A shost = iscsi_host_alloc(&iscsi_sht, 0, qdepth); if (!shost) return NULL; B shost->transportt = iscsi_tcp_scsi_transport; shost->max_lun = iscsi_max_lun; Please note the scsi host is allocated by invoking isccsi_host_alloc() in libiscsi.c Polluting the middle level blank_transport_template in iscsi_host_alloc() of libiscsi.c The iscsi_host_alloc() invokes the middle level function scsi_host_alloc() in hosts.c for allocating a scsi_host. Then the statement marked with C assigns the iscsi_eh_cmd_timed_out handler to the eh_timed_out callback function. struct Scsi_Host iscsi_host_alloc(struct scsi_host_template sht, int dd_data_size, uint16_t qdepth) { struct Scsi_Host shost; struct iscsi_host ihost; shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) + dd_data_size); if (!shost) return NULL; C shost->transportt->eh_timed_out = iscsi_eh_cmd_timed_out; Please note the shost->transport is the middle level blank_transport_template as shown in the code segment below. We see two problems here. 1. iscsi_eh_cmd_timed_out is installed to the blank_transport_template that will cause some body else problem. 2. iscsi_eh_cmd_timed_out will never be invoked when iscsi command gets timeout because the statement B resets the pointer. Middle level blank_transport_template In the middle level function scsi_host_alloc() of hosts.c, the middle level assigns a blank_transport_template for those hosts not implementing its transport layer. All HBAs without supporting a specific scsi_transport will share the middle level blank_transport_template. Please see the statement D struct Scsi_Host scsi_host_alloc(struct scsi_host_template sht, int privsize) { struct Scsi_Host shost; gfp_t gfp_mask = GFP_KERNEL; int rval; if (sht->unchecked_isa_dma && privsize) gfp_mask \|= __GFP_DMA; shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask); if (!shost) return NULL; shost->host_lock = &shost->default_lock; spin_lock_init(shost->host_lock); shost->shost_state = SHOST_CREATED; INIT_LIST_HEAD(&shost->__devices); INIT_LIST_HEAD(&shost->__targets); INIT_LIST_HEAD(&shost->eh_cmd_q); INIT_LIST_HEAD(&shost->starved_list); init_waitqueue_head(&shost->host_wait); mutex_init(&shost->scan_mutex); shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy / shost->dma_channel = 0xff; / These three are default values which can be overridden / shost->max_channel = 0; shost->max_id = 8; shost->max_lun = 8; / Give each shost a default transportt / D shost->transportt = &blank_transport_template; Why we see panic at iscsi_eh_cmd_timed_out() The mpp virtual HBA doesn’t have a specific scsi_transport. Therefore, the blank_transport_template will be assigned to the virtual host of the MPP virtual HBA by SCSI middle level. Please note that the statement C has assigned iscsi-transport eh_timedout handler to the blank_transport_template. When a mpp virtual command gets timedout, the iscsi_eh_cmd_timed_out() will be invoked to handle mpp virtual command timeout from the middle level scsi_times_out() function of the scsi_error.c. enum blk_eh_timer_return scsi_times_out(struct request req) { struct scsi_cmnd scmd = req->special; enum blk_eh_timer_return (eh_timed_out)(struct scsi_cmnd ); enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED; scsi_log_completion(scmd, TIMEOUT_ERROR); if (scmd->device->host->transportt->eh_timed_out) E eh_timed_out = scmd->device->host->transportt->eh_timed_out; else if (scmd->device->host->hostt->eh_timed_out) eh_timed_out = scmd->device->host->hostt->eh_timed_out; else eh_timed_out = NULL; if (eh_timed_out) { rtn = eh_timed_out(scmd); It is very easy to understand why we get panic in the iscsi_eh_cmd_timed_out(). A scsi_cmnd from a no-iscsi device definitely can not resolve out a session and session->lock. The panic can be happed anywhere during the differencing. static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd scmd) { struct iscsi_cls_session cls_session; struct iscsi_session session; struct iscsi_conn *conn; enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED; cls_session = starget_to_session(scsi_target(scmd->device)); session = cls_session->dd_data; debug_scsi("scsi cmd %p timedout\n", scmd); spin_lock(&session->lock); This patch fixes the problem by moving the setting of the iscsi_eh_cmd_timed_out to iscsi_add_host, which is after the LLDs have set their transport template to shost->transportt. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] qla2xxx: fix Kernel Panic with Qlogic 2472 Card.	Shyam_Iyer@Dell.com
	Kernel Panic is observed with a Qlogic 2472 Card is plugged into the system and the qla2xxx driver is loaded: QLogic Fibre Channel HBA Driver: 8.02.01.02.11.0-k9 vendor=8086 device=3410 qla2xxx 0000:05:00.0: PCI INT A -> GSI 40 (level, low) -> IRQ 40 qla2xxx 0000:05:00.0: Found an ISP2432, irq 40, iobase 0xffffc2001091c000 qla2xxx 0000:05:00.0: Configuring PCI space... qla2xxx 0000:05:00.0: setting latency timer to 64 qla2xxx 0000:05:00.0: Configure NVRAM parameters... BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 IP: [<ffffffff8036319a>] strncpy+0x5/0x1e PGD 7c564067 PUD 78d8c067 PMD 0 Oops: 0000 [1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:1d.1/usb6/6-2/6-2:1.1/input/input4/event 4/dev CPU 1 Modules linked in: qla2xxx(+) squashfs usb_storage scsi_transport_fc scsi_tgt parport_pc parport arc4 ecb crypto_blkcipher acpi_cpufreq fan loop nfs nfs_acl lockd sunrpc nls_iso8859_1 nls_cp437 ipv6 af_packet st sr_mod ide_disk ide_cd_mod ide_core cdrom usbhid hid ff_memless sg sd_mod crc_t10dif uhci_hcd mptsas mptscsih ehci_hcd mptbase scsi_transport_sas rtc_cmos rtc_core rtc_lib usbcore scsi_mod thermal bnx2 button processor thermal_sys hwmon edd Supported: Yes Pid: 4415, comm: insmod Not tainted 2.6.27.13-1-default #1 RIP: 0010:[<ffffffff8036319a>] [<ffffffff8036319a>] strncpy+0x5/0x1e RSP: 0018:ffff88007b04fbc0 EFLAGS: 00010202 RAX: 00000000000000b7 RBX: ffff88007b9641e0 RCX: ffff88007c1b2ad7 RDX: 000000000000004f RSI: 0000000000000000 RDI: ffff88007c1b2ad7 RBP: ffff88007c1b0620 R08: 0000000000000010 R09: 0000000100000000 R10: 0000000000000046 R11: ffffffff803651c6 R12: ffff88007b074000 R13: ffff88007b964000 R14: ffff88007c1b2ac6 R15: 0000000000000000 FS: 00007f91a6c366f0(0000) GS:ffff88007dbeee40(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000007bd7c000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process insmod (pid: 4415, threadinfo ffff88007b04e000, task ffff880078586180) Stack: ffffffffa02d82c4 0000000000002432 ffff88007d385000 ffff88007c1b0620 ffff88007c1b0620 ffff88007c1b0000 ffff88007d385000 0000000000002432 ffffffffa02dcb1e 0000000000002432 ffffc2001091c000 ffff88007c1b0620 Call Trace: [<ffffffffa02d82c4>] qla24xx_nvram_config+0x385/0x6c2 [qla2xxx] [<ffffffffa02dcb1e>] qla2x00_initialize_adapter+0x169/0x383 [qla2xxx] [<ffffffffa02f2040>] qla2x00_probe_one+0x6bc/0x9c6 [qla2xxx] [<ffffffff8037346f>] pci_device_probe+0xb8/0x105 [<ffffffff803e5a27>] really_probe+0xdd/0x1e5 [<ffffffff803e5c14>] __driver_attach+0x46/0x6d [<ffffffff803e51e1>] bus_for_each_dev+0x44/0x78 [<ffffffff803e4ac7>] bus_add_driver+0xef/0x235 [<ffffffff803e5dd8>] driver_register+0xa2/0x11f [<ffffffff803736fd>] __pci_register_driver+0x5d/0x90 [<ffffffffa0308126>] qla2x00_module_init+0x126/0x159 [qla2xxx] [<ffffffff80209041>] _stext+0x41/0x110 [<ffffffff80260abd>] sys_init_module+0xa0/0x1ba [<ffffffff8020bfbb>] system_call_fastpath+0x16/0x1b [<00007f91a679b76a>] 0x7f91a679b76a Code: ff c1 41 39 c0 75 05 45 85 c0 75 bf 41 29 c0 44 89 c0 c3 31 d2 8a 04 16 88 04 17 48 ff c2 84 c0 75 f3 48 89 f8 c3 48 89 f9 eb 10 <8a> 06 3c 01 88 01 48 83 de ff 48 ff c1 48 ff ca 48 85 d2 75 eb RIP [<ffffffff8036319a>] strncpy+0x5/0x1e RSP <ffff88007b04fbc0> CR2: 0000000000000000 ---[ end trace 829d7d78dfafb785 ]--- The attached patch fixes the issue. Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Acked-by: Seokmann Ju <Seokmann.ju@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] ibmvfc: Increase cancel timeout	Brian King
	During cancel testing it has been shown that 15 seconds is not nearly long enough for the VIOS to respond to a cancel under loaded situations. Increasing this timeout to 60 seconds allows time for the VIOS to cancel the outstanding commands and prevents us from escalating to a full host reset, which can take much longer. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] ibmvfc: Fix rport relogin	Brian King
	The ibmvfc driver has a bug in its SCN handling. If it receives an ELS event such asn an N-Port SCN event or an unsolicited PLOGI, or any other SCN event which causes ibmvfc_reinit_host to be called, it is possible that we will call fc_remote_port_add for a target that already has an rport added, which can result in duplicate rports getting created for the same targets. Fix this by calling fc_remote_port_rolechg in this scenario instead to report any possible role change that may have occurred. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] ibmvfc: Fix command timeout errors	Brian King
	Currently the ibmvfc driver sets the IBMVFC_CLASS_3_ERR flag in the VFC Frame if both the adapter and the device claim support for Class 3. However, this bit actually refers to Class 3 Error Recovery, which is currently not supported by the VIOS. Setting this bit can cause lots of command timeout responses from the VIOS resulting in general instability. Fix this by never setting this bit. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10	[SCSI] sg: fix device number in blktrace data	Martin Peschke
	Hi, we have run into an issue with blktrace being started for sg devices. Please apply. Thanks, Martin From: Martin Peschke <mpeschke@linux.vnet.ibm.com> The device number denoting a generic SCSI devices (sg) in a blktrace trace is broken; major and minor are always 0. It looks like sdp->device->sdev_gendev.devt is not initialized properly. The fix below uses other data to make up a valid device number, similar to the way an sg device number is generated for sysfs output. Reported-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>