linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2008-11-06	[IA64] reorder Kconfig options to match x86	Bjorn Helgaas
	No functional change, just reorder some config options and update the "Power management and ACPI" label to match the defacto x86 standard. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-11-06	[ARM] fix naming of MODULE_START / MODULE_END	Russell King
	As of 73bdf0a60e607f4b8ecc5aec597105976565a84f, the kernel needs to know where modules are located in the virtual address space. On ARM, we located this region between MODULE_START and MODULE_END. Unfortunately, everyone else calls it MODULES_VADDR and MODULES_END. Update ARM to use the same naming, so is_vmalloc_or_module_addr() can work properly. Also update the comment on mm/vmalloc.c to reflect that ARM also places modules in a separate region from the vmalloc space. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-11-06	Revert "x86: default to reboot via ACPI"	Eduardo Habkost
	This reverts commit c7ffa6c26277b403920e2255d10df849bd613380. the assumptio of this change was that this would not break any existing machine. Andrey Borzenkov reported troubles with the ACPI reboot method: the system would hang on reboot, necessiating a power cycle. Probably more systems are affected as well. Also, there are patches queued up for v2.6.29 to disable virtualization on emergency_restart() - which was the original motivation of this change. Reported-by: Andrey Borzenkov <arvidjaar@mail.ru> Bisected-by: Andrey Borzenkov <arvidjaar@mail.ru> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Avi Kivity <avi@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	x86: align DirectMap in /proc/meminfo	Hugh Dickins
	Impact: right-align /proc/meminfo consistent with other fields When the split-LRU patches added Inactive(anon) and Inactive(file) lines to /proc/meminfo, all counts were moved two columns rightwards to fit in. Now move x86's DirectMap lines two columns rightwards to line up. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	Merge branch 'iommu-fixes-2.6.28' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
2008-11-06	AMD IOMMU: fix lazy IO/TLB flushing in unmap path	Joerg Roedel
	Lazy flushing needs to take care of the unmap path too which is not yet implemented and leads to stale IO/TLB entries. This is fixed by this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-11-06	[WATCHDOG] SAM9 watchdog - supported on all SAM9 and CAP9 processors	Andrew Victor
	The SAM9 watchdog driver is usable on the whole family of AT91SAM9 and CAP9 processors. Update the configuration to indicate this and allow the driver to be selected. Signed-off-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-11-06	[WATCHDOG] SAM9 watchdog - update for moved headers	Andrew Victor
	The architecture header files were recently moved from include/asm-arm/mach-at91/ to arch/arm/mach-at91/include/mach/. The SAM9 watchdog driver still includes a header from the old location. Signed-off-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-11-06	x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR	Suresh Siddha
	Impact: fix rare x2apic hang On x86, x2apic mode accesses for sending IPI's don't have serializing semantics. If the IPI receivner refers(in lock-free fashion) to some memory setup by the sender, the need for smp_mb() before sending the IPI becomes critical in x2apic mode. Add the smp_mb() in native_flush_tlb_others() before sending the IPI. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	md: linear: Fix a division by zero bug for very small arrays.	Andre Noll
	We currently oops with a divide error on starting a linear software raid array consisting of at least two very small (< 500K) devices. The bug is caused by the calculation of the hash table size which tries to compute sector_div(sz, base) with "base" being zero due to the small size of the component devices of the array. Fix this by requiring the hash spacing to be at least one which implies that also "base" is non-zero. This bug has existed since about 2.6.14. Cc: stable@kernel.org Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-06	x86: remove VISWS and PARAVIRT around NR_IRQS puzzle	Yinghai Lu
	Impact: fix warning message when PARAVIRT is set in config Remove stale #ifdef components from our IRQ sizing logic. x86/Voyager is the only holdout. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	cpumask: introduce new API, without changing anything	Rusty Russell
	Impact: introduce new APIs We want to deprecate cpumasks on the stack, as we are headed for gynormous numbers of CPUs. Eventually, we want to head towards an undefined 'struct cpumask' so they can never be declared on stack. 1) New cpumask functions which take pointers instead of copies. (cpus_* -> cpumask_*) 2) Several new helpers to reduce requirements for temporary cpumasks (cpumask_first_and, cpumask_next_and, cpumask_any_and) 3) Helpers for declaring cpumasks on or offstack for large NR_CPUS (cpumask_var_t, alloc_cpumask_var and free_cpumask_var) 4) 'struct cpumask' for explicitness and to mark new-style code. 5) Make iterator functions stop at nr_cpu_ids (a runtime constant), not NR_CPUS for time efficiency and for smaller dynamic allocations in future. 6) cpumask_copy() so we can allocate less than a full cpumask eventually (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask' definition eventually. 7) work_on_cpu() helper for doing task on a CPU, rather than saving old cpumask for current thread and manipulating it. 8) smp_call_function_many() which is smp_call_function_mask() except taking a cpumask pointer. Note that this patch simply introduces the new functions and leaves the obsolescent ones in place. This is to simplify the transition patches. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	Block: use round_jiffies_up()	Alan Stern
	This patch (as1159b) changes the timeout routines in the block core to use round_jiffies_up(). There's no point in rounding the timer deadline down, since if it expires too early we will have to restart it. The patch also removes some unnecessary tests when a request is removed from the queue's timer list. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	Add round_jiffies_up and related routines	Alan Stern
	This patch (as1158b) adds round_jiffies_up() and friends. These routines work like the analogous round_jiffies() functions, except that they will never round down. The new routines will be useful for timeouts where we don't care exactly when the timer expires, provided it doesn't expire too soon. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	block: fix __blkdev_get() for removable devices	Tejun Heo
	Commit 0762b8bde9729f10f8e6249809660ff2ec3ad735 moved disk_get_part() in front of recursive get on the whole disk, which caused removable devices to try disk_get_part() before rescanning after a new media is inserted, which might fail legit open attempts or give the old partition. This patch fixes the problem by moving disk_get_part() after __blkdev_get() on the whole disk. This problem was spotted by Borislav Petkov. Signed-off-by: Tejun Heo <tj@kernel.org> Tested-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	generic-ipi: fix the smp_mb() placement	Suresh Siddha
	smp_mb() is needed (to make the memory operations visible globally) before sending the ipi on the sender and the receiver (on Alpha atleast) needs smp_read_barrier_depends() in the handler before reading the call_single_queue list in a lock-free fashion. On x86, x2apic mode register accesses for sending IPI's don't have serializing semantics. So the need for smp_mb() before sending the IPI becomes more critical in x2apic mode. Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that smp_mb() doesn't mean anything on the sender, when the ipi receiver is not doing any thing special (like memory fence) after clearing the CSD_FLAG_WAIT. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	blk: move blk_delete_timer call in end_that_request_last	Mike Anderson
	Move the calling blk_delete_timer to later in end_that_request_last to address an issue where blkdev_dequeue_request may have add a timer for the request. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	block: add timer on blkdev_dequeue_request() not elv_next_request()	Tejun Heo
	Block queue supports two usage models - one where block driver peeks at the front of queue using elv_next_request(), processes it and finishes it and the other where block driver peeks at the front of queue, dequeue the request using blkdev_dequeue_request() and finishes it. The latter is more flexible as it allows the driver to process multiple commands concurrently. These two inconsistent usage models affect the block layer implementation confusing. For some, elv_next_request() is considered the issue point while others consider blkdev_dequeue_request() the issue point. Till now the inconsistency mostly affect only accounting, so it didn't really break anything seriously; however, with block layer timeout, this inconsistency hits hard. Block layer considers elv_next_request() the issue point and adds timer but SCSI layer thinks it was just peeking and when the request can't process the command right away, it's just left there without further processing. This makes the request dangling on the timer list and, when the timer goes off, the request which the SCSI layer and below think is still on the block queue ends up in the EH queue, causing various problems - EH hang (failed count goes over busy count and EH never wakes up), WARN_ON() and oopses as low level driver trying to handle the unknown command, etc. depending on the timing. As SCSI midlayer is the only user of block layer timer at the moment, moving blk_add_timer() to elv_dequeue_request() fixes the problem; however, this two usage models definitely need to be cleaned up in the future. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	bio: define __BIOVEC_PHYS_MERGEABLE	Jeremy Fitzhardinge
	Define __BIOVEC_PHYS_MERGEABLE as the default implementation of BIOVEC_PHYS_MERGEABLE, so that its available for reuse within an arch-specific definition of BIOVEC_PHYS_MERGEABLE. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	block: remove unused ll_new_mergeable()	FUJITA Tomonori
	Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06	x86: mention ACPI in top-level Kconfig menu	Bjorn Helgaas
	Impact: clarify menuconfig text Mention ACPI in the top-level menu to give a clue as to where it lives. This matches what ia64 does. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	md: fix bug in raid10 recovery.	NeilBrown
	Adding a spare to a raid10 doesn't cause recovery to start. This is due to an silly type in commit 6c2fce2ef6b4821c21b5c42c7207cb9cf8c87eda and so is a bug in 2.6.27 and .28-rc. Thanks to Thomas Backlund for bisecting to find this. Cc: Thomas Backlund <tmb@mandriva.org> Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-06	md: revert the recent addition of a call to the BLKRRPART ioctl.	NeilBrown
	It turns out that it is only safe to call blkdev_ioctl when the device is actually open (as ->bd_disk is set to NULL on last close). And it is quite possible for do_md_stop to be called when the device is not open. So discard the call to blkdev_ioctl(BLKRRPART) which was added in commit 934d9c23b4c7e31840a895ba4b7e88d6413c81f3 It is just as easy to call this ioctl from userspace when needed (on mdadm -S) so leave it out of the kernel Signed-off-by: NeilBrown <neilb@suse.de>
2008-11-06	x86: size NR_IRQS on 32-bit systems the same way as 64-bit	Yinghai Lu
	Impact: make NR_IRQS big enough for system with lots of apic/pins If lots of IO_APIC's are there (or can be there), size the same way as 64-bit, depending on MAX_IO_APICS and NR_CPUS. This fixes the boot problem reported by Ben Hutchings on a 32-bit server with 5 IO-APICs and 240 IO-APIC pins. Signed-off-by: Yinghai <yinghai@kernel.org> Tested-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06	x86: don't allow nr_irqs > NR_IRQS	Ben Hutchings
	Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can return a value larger than NR_IRQS. This will lead to probe_irq_on() overrunning the irq_desc array. I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a Supermicro dual Xeon system. NR_IRQS is 224 but probe_nr_irqs() detects 5 IOAPICs and returns 240. Here are the log messages: Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0]) Tue Nov 4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23 Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24]) Tue Nov 4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47 Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48]) Tue Nov 4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71 Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72]) Tue Nov 4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95 Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96]) Tue Nov 4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119 Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) Tue Nov 4 16:53:47 2008 Enabling APIC mode: Flat. Using 5 I/O APICs Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05	[JFFS2] fix race condition in jffs2_lzo_compress()	Geert Uytterhoeven
	deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However, jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the compressed data from lzo_compress_buf. Correct this by moving the mutex unlock after the copy. In addition, document what deflate_mutex actually protects. Cc: stable@kernel.org Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Richard Purdie <rpurdie@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05	net/9p: fix printk format warnings	Randy Dunlap
	Fix printk format warnings in net/9p. Built cleanly on 7 arches. net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	unsigned fid->fid cannot be negative	Roel Kluin
	Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	9p: rdma: remove duplicated #include	Huang Weiyi
	Removed duplicated #include <rdma/ib_verbs.h> in net/9p/trans_rdma.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	p9: Fix leak of waitqueue in request allocation path	Tom Tucker
	If a T or R fcall cannot be allocated, the function returns an error but neglects to free the wait queue that was successfully allocated. If it comes through again a second time this wq will be overwritten with a new allocation and the old allocation will be leaked. Also, if the client is subsequently closed, the close path will attempt to clean up these allocations, so set the req fields to NULL to avoid duplicate free. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	9p: Remove unneeded free of fcall for Flush	Tom Tucker
	T and R fcall are reused until the client is destroyed. There does not need to be a special case for Flush Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	9p: Make all client spin locks IRQ safe	Tom Tucker
	The client lock must be IRQ safe. Some of the lock acquisition paths took regular spin locks. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	9p: rdma: Set trans prior to requesting async connection ops	Tom Tucker
	The RDMA connection manager is fundamentally asynchronous. Since the async callback context is the client pointer, the transport in the client struct needs to be set prior to calling the first async op. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05	IB/mlx4: Set umem field to NULL in mlx4_ib_alloc_fast_reg_mr()	Vladimir Sokolovsky
	Set mr->umem to NULL in mlx4_ib_alloc_fast_reg_mr(). Otherwise ib_dereg_mr() may invoke ib_umem_release() on a random pointer value and get an oops. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-11-05	[SCSI] scsi_error regression: Fix idempotent command handling	Mike Christie
	Drivers want to be able to return DID_TRANSPORT_DISRUPTED and have it do the right thing for commands like tape and passthrouh as far as retries go. The LLDs previously used DID_BUS_BUSY or DID_ERROR which followed the cmd->retries limit, but DID_TRANSPORT_DISRUPTED was skipping that check so it could have caused a problem with tape commands. This patch has DID_TRANSPORT_DISRUPTED check the cmd->retries/cmd->allowed. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: Fix hexdump data in s390dbf traces	Christof Schmitt
	Fix multiple problems found in the hexdump data: - length calculation was wrong, traces were incomplete - FC payloads were dumped in different record than the output function tried to read - minor fixes in output - allow complete RSCN traces (up to 1024 bytes according to spec) Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: fix erp timeout cleanup for port open requests	Martin Petermann
	If an open port fsf request times out (in erp) the corresponding erp_action member of the fsf request need to set to NULL. If the port structure will be removed later-on there will be still a reference in the fsf request to the non existing erp_action otherwise. Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: Wait for port scan to complete when setting adapter online	Christof Schmitt
	Attaching a unit immediately after setting the adapter online should be possible. The problem right now is that the port_scan runs from a workqueue and has not finished when the set_online call returns and the sysfs structures for the ports are not available yet. Fix that by waiting for the port scan to complete. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: Fix cast warning	Christof Schmitt
	Fix leftover from last typecast patch: drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_port_enqueue’: drivers/s390/scsi/zfcp_aux.c:629: warning: format ‘%016llx’ expects type ‘long long unsigned int’, but argument 3 has type ‘u64’ Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: Fix request list handling in error path	Christof Schmitt
	Fix the handling of the request list in the error path: - Use irqsave for the lock as in the good path. - Before removing the request, check if it is still in the list, a call to dismiss_all might have changed the list in between. - zfcp_qdio_send does not change the queue counters on failure, trying revert something is wrong, so remove this. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: fix mempool usage for status_read requests	Christof Schmitt
	When allocating fsf requests without qtcb, store the pointer to the mempool in the fsf requests for later call to mempool_free. This codepath is only used by the status_read requests. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: fix req_list_locking.	Heiko Carstens
	The per adapter req_list_lock must be held with interrupts disabled, otherwise we might end up with nice deadlocks as lockdep tells us (see below). zfcp 0.0.1804: QDIO problem occurred. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.27-rc8-00035-g4a77035-dirty #86 --------------------------------------------------------- swapper/0 just changed the state of lock: (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c but this lock took another, hard-irq-unsafe lock in the past: (&adapter->req_list_lock){-+..} and interrupts could create inverse lock ordering between them. [tons of backtraces, but only the interesting part follows] the second lock's dependencies: -> (&adapter->req_list_lock){-+..} ops: 2280627634176 { initial-use at: [<0000000000071f10>] __lock_acquire+0x504/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0 [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140 [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0 [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318 [<000000000001b0d2>] kernel_thread_starter+0x6/0xc [<000000000001b0cc>] kernel_thread_starter+0x0/0xc in-softirq-W at: [<0000000000072172>] __lock_acquire+0x766/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0 [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0 [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8 [<0000000000047ba4>] tasklet_action+0x100/0x1f4 [<0000000000048b5a>] __do_softirq+0xae/0x154 [<0000000000021e4a>] do_softirq+0xea/0xf0 [<00000000000485de>] irq_exit+0xde/0xe8 [<0000000000268c64>] do_IRQ+0x160/0x1fc [<00000000000261a2>] io_return+0x0/0x8 [<000000000001b8f8>] cpu_idle+0x17c/0x224 hardirq-on-W at: [<0000000000072190>] __lock_acquire+0x784/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d702c>] _spin_lock+0x5c/0x9c [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158 [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124 [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0 [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318 [<000000000001b0d2>] kernel_thread_starter+0x6/0xc [<000000000001b0cc>] kernel_thread_starter+0x0/0xc } ... key at: [<0000000000e356c8>] __key.26629+0x0/0x8 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmit@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] zfcp: Dont clear reference from SCSI device to unit	Christof Schmitt
	It is possible that a remote port has a problem, the SCSI device gets deleted after the rport timeout and then the timeout for pending SCSI commands trigger an abort. For this case, don't delete the reference from the SCSI device to the zfcp unit, so that we can still have the reference to issue an abort request. Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] qla2xxx: Update version number to 8.02.01-k9.	Andrew Vasquez
	Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] qla2xxx: Return a FAILED status when abort mailbox-command fails.	Michael Reed
	Mike Reed noted (https://bugzilla.novell.com/show_bug.cgi?id=421330) that the driver was incorrectly returning a SUCCESS status if the driver's request to the firmware to abort a command failed. By doing so, the mid-layer believed, incorrectly, that the command has completed and has been returned (ultimately clearing scsi_cmnd.request_buffer) yet the driver still has the command. What should correctly happen is a mid-layer escalation (device-reset, etc.) of recovery during which the driver will eventually return the outstanding commands to the mid-layer. Cc: Stable Tree <stable@kernel.org> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] qla2xxx: Do not honour max_vports from firmware for 2G ISPs and below.	Shyam Sundar
	For 23XX ISPs, max_vports may return an invalid value. Do not honour it. Cc: Stable Tree <stable@kernel.org> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] qla2xxx: Use pci_disable_rom() to manipulate PCI config space.	Andrew Vasquez
	http://bugzilla.kernel.org/show_bug.cgi?id=9422 Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] qla2xxx: Correct Atmel flash-part handling.	Lalit Chandivade
	Use correct block size (4K) for erase command 0x20 for Atmel Flash. Use dword addresses for determining sector boundary. Cc: Stable Tree <stable@kernel.org> Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	[SCSI] megaraid: fix mega_internal_command oops	FUJITA Tomonori
	scsi_cmnd->cmnd was changed from a static array to a pointer post 2.6.25. It breaks mega_internal_command(): static int mega_internal_command(adapter_t adapter, megacmd_t mc, mega_passthru pthru) { ... scb = &adapter->int_scb; memset(scb, 0, sizeof(scb_t)); scmd = &adapter->int_scmd; memset(scmd, 0, sizeof(Scsi_Cmnd)); sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL); scmd->device = sdev; scmd->device->host = adapter->host; scmd->host_scribble = (void )scb; scmd->cmnd[0] = MEGA_INTERNAL_CMD; mega_internal_command() uses scsi_cmnd allocated internally so scmd->cmnd is NULL here. This patch adds a static array for cdb to adapter_t and uses it here. This also uses scsi_allocate_command/scsi_free_command, the recommended way to allocate struct scsi_cmnd since the driver might use sense_buffer in struct scsi_cmnd. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reviewed-by: Boaz Harrosh <bharrosh@panasas.com> Tested-by: Pascal Terjan <pterjan@gmail.com> Reported-by: Pascal Terjan <pterjan@gmail.com> Acked-by: "Yang, Bo" <Bo.Yang@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05	sched: re-tune balancing	Ingo Molnar
	Impact: improve wakeup affinity on NUMA systems, tweak SMP systems Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain balancing defaults on NUMA and SMP systems. Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason why we would not want to have wakeup affinity across nodes as well. (we already do this in the standard NUMA template.) lat_ctx on a NUMA box is particularly happy about this change: before: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.60 \| 2 5.70 after: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.65 \| 2 2.07 a 2.75x speedup. pipe-test is similarly happy about it too: \| phoenix:~/sched-tests> ./pipe-test \| 18.26 usecs/loop. \| 14.70 usecs/loop. \| 14.38 usecs/loop. \| 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1 \| 8.63 usecs/loop. \| 8.59 usecs/loop. \| 9.03 usecs/loop. \| 8.94 usecs/loop. \| 8.96 usecs/loop. \| 8.63 usecs/loop. Also: - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings) - enable SD_WAKE_BALANCE on SMP domains Sysbench+postgresql improves all around the board, quite significantly: .28-rc3-11474e2c .28-rc3-11474e2c-tune ------------------------------------------------- 1: 571 688 +17.08% 2: 1236 1206 -2.55% 4: 2381 2642 +9.89% 8: 4958 5164 +3.99% 16: 9580 9574 -0.07% 32: 7128 8118 +12.20% 64: 7342 8266 +11.18% 128: 7342 8064 +8.95% 256: 7519 7884 +4.62% 512: 7350 7731 +4.93% ------------------------------------------------- SUM: 55412 59341 +6.62% So it's a win both for the runup portion, the peak area and the tail. Signed-off-by: Ingo Molnar <mingo@elte.hu>