summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2012-02-29[SCSI] isci: kill iphy->isci_port lookupsDan Williams
This field is a holdover from the OS abstraction conversion. The stable phy to port lookups are done via iphy->ownining_port under scic_lock. After this conversion to use port->lldd_port the only volatile lookup is the initial lookup in isci_port_formed(). After that point any lookup via a successfully notified domain_device is guaranteed to be valid until the domain_device is destroyed. Delete ->start_complete as it is only set once and is set as a consequence of the port going link up, by definition of getting a port formed event the port is "ready". While we are correcting port lookups also move the asd_sas_port table out from under the isci_port. This is to preclude any temptation to use container_of() to convert an asd_sas_port to an isci_port, the association is dynamic and under libsas control. Tested-by: Maciej Trela <maciej.trela@intel.com> [dmilburn@redhat.com: fix i686 compile error] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: don't recover 'gone' devices in sas_ata_hard_reset()Dan Williams
The commands that timeout when a disk is forcibly removed may trigger libata to attempt recovery of the device. If libsas has decided to remove the device don't permit ata to continue to issue resets to its last known phy. The primary motivation for this patch is hotplug testing by writing 0 to /sys/class/sas_phy/phyX/enable. Without this check this test leads to libata issuing a reset and re-enabling the device that wants to be torn down. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: fix sas_find_local_phy(), take phy referencesDan Williams
In the direct-attached case this routine returns the phy on which this device was first discovered. Which is broken if we want to support wide-targets, as this phy reference can become stale even though the port is still active. In the expander-attached case this routine tries to lookup the phy by scanning the attached sas addresses of the parent expander, and BUG_ONs if it can't find it. However since eh and the libsas workqueue run independently we can still be attempting device recovery via eh after libsas has recorded the device as detached. This is even easier to hit now that eh is blocked while device domain rediscovery takes place, and that libata is fed more timed out commands increasing the chances that it will try to recover the ata device. Arrange for dev->phy to always point to a last known good phy, it may be stale after the port is torn down, but it will catch up for wide port reconfigurations, and never be NULL. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: check for 'gone' expanders in smp_execute_task()Dan Williams
No sense in issuing or retrying commands to an expander that has been removed. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: don't mark expanders as gone when a child device is removedDan Williams
Commit 56dd2c06 "[SCSI] libsas: Don't issue commands to devices that have been hot-removed" marked the parent device of an end-device as gone when all the phys to the end device have been deleted. The expander device is still present until its parent is removed. This is a benign change until the smp_execute_task() path is taught to check ->gone. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: poll for ata device readiness after resetDan Williams
Use ata_wait_after_reset() to poll for link recovery after a reset. This combined with sas_ha->eh_mutex prevents expander rediscovery from probing phys in an intermediate state. Local discovery does not have a mechanism to filter link status changes during this timeout, so it remains the responsibility of lldds to prevent premature port teardown. Although once all lldd's support ->lldd_ata_check_ready() that could be used as a gate to local port teardown. The signature fis is re-transmitted when the link comes back so we should be revalidating the ata device class, but that is left to a future patch. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-28qla4xxx: Add missing spaces to error messagesPetr Uzel
Signed-off-by: Petr Uzel <petr.uzel@suse.cz> Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-27PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_deviceYinghai Lu
The old pci_remove_bus_device actually did stop and remove. Make the name reflect that to reduce confusion. This patch is done by sed scripts and changes back some incorrect __pci_remove_bus_device changes. Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/sfc/rx.c Overlapping changes in drivers/net/ethernet/sfc/rx.c, one to change the rx_buf->is_page boolean into a set of u16 flags, and another to adjust how ->ip_summed is initialized. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-25scsi: Use struct scsi_lun in fc/fcp.hAndy Grover
This allows us to use scsilun_to_int without an ugly cast. Fix up places that use scsilun_to_int on fcp->fc_lun accordingly. In fc target, this leaves ft_cmd.lun unused, so remove it. Signed-off-by: Andy Grover <agrover@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Kiran Patil <kiran.patil@intel.com> Cc: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-02-25[SCSI] osd_uld: Bump MAX_OSD_DEVICES from 64 to 1,048,576Boaz Harrosh
It used to be that minors where 8 bit. But now they are actually 20 bit. So the fix is simplicity itself. I've tested with 300 devices and all user-mode utils work just fine. I have also mechanically added 10,000 to the ida (so devices are /dev/osd10000, /dev/osd10001 ...) and was able to mkfs an exofs filesystem and access osds from user-mode. All the open-osd user-mode code uses the same library to access devices through their symbolic names in /dev/osdX so I'd say it's pretty safe. (Well tested) This patch is very important because some of the systems that will be deploying the 3.2 pnfs-objects code are larger than 64 OSDs and will stop to work properly when reaching that number. CC: Stable <stable@vger.kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-24Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 SCSI fixes on 20120224: "This is a set of assorted bug fixes for power management, mpt2sas, ipr, the rdac device handler and quite a big chunk for qla2xxx (plus a use after free of scsi_host in scsi_scan.c). " * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] scsi_dh_rdac: Fix for unbalanced reference count [SCSI] scsi_pm: Fix bug in the SCSI power management handler [SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost' [SCSI] qla2xxx: Update version number to 8.03.07.13-k. [SCSI] qla2xxx: Proper detection of firmware abort error code for ISP82xx. [SCSI] qla2xxx: Remove resetting memory during device initialization for ISP82xx. [SCSI] qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle. [SCSI] qla2xxx: Remove check for null fcport from host reset handler. [SCSI] qla2xxx: Correct out of bounds read of ISP2200 mailbox registers. [SCSI] qla2xxx: Remove errant clearing of MBX_INTERRUPT flag during CT-IOCB processing. [SCSI] qla2xxx: Clear options-flags while issuing stop-firmware mbx command. [SCSI] qla2xxx: Add an "is reset active" helper. [SCSI] qla2xxx: Add check for null fcport references in qla2xxx_queuecommand. [SCSI] qla2xxx: Propagate up abort failures. [SCSI] isci: Fix NULL ptr dereference when no firmware is being loaded [SCSI] ipr: fix eeh recovery for 64-bit adapters [SCSI] mpt2sas: Fix mismatch in mpt2sas_base_hard_reset_handler() mutex lock-unlock
2012-02-23Merge branch 'usb-3.3-rc4' into usb-nextGreg Kroah-Hartman
This is to pull in the xhci changes and the other fixes and device id updates that were done in Linus's tree. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-22[SCSI] scsi_dh_rdac: Fix for unbalanced reference countMoger, Babu
This patch fixes an unbalanced refcount issue. Elevating the lock for both kref_put and also for controller node deletion. Previously, controller deletion was protected but the not the kref_put. This was causing the other thread to pick up the controller structure which was already kref'd zero. This was causing the following WARN_ON and also sometimes panic. WARNING: at lib/kref.c:43 kref_get+0x2d/0x30() (Not tainted) Hardware name: IBM System x3655 -[7985AC1]- Modules linked in: fuse scsi_dh_rdac autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc 8021q garp stp llc ipv6 ib_srp(U) scsi_transport_srp scsi_tgt ib_cm(U) ib_sa(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_region_hash dm_log dm_round_robin dm_multipath uinput bnx2 ses enclosure sg ibmpex ibmaem ipmi_msghandler serio_raw k8temp hwmon amd64_edac_mod edac_core edac_mce_amd shpchp i2c_piix4 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif sata_svw pata_acpi ata_generic pata_serverworks aacraid radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: freq_table] Pid: 13735, comm: srp_daemon Not tainted 2.6.32-71.el6.x86_64 #1 Call Trace: [<ffffffff8106b857>] warn_slowpath_common+0x87/0xc0 [<ffffffff8106b8aa>] warn_slowpath_null+0x1a/0x20 [<ffffffff8125c39d>] kref_get+0x2d/0x30 [<ffffffffa01b4029>] rdac_bus_attach+0x459/0x580 [scsi_dh_rdac] [<ffffffff8135232a>] scsi_dh_handler_attach+0x2a/0x80 [<ffffffff81352c7b>] scsi_dh_notifier+0x9b/0xa0 [<ffffffff814cd7a5>] notifier_call_chain+0x55/0x80 [<ffffffff8109711a>] __blocking_notifier_call_chain+0x5a/0x80 [<ffffffff81097156>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff8132bec5>] device_add+0x515/0x640 [<ffffffff813329e4>] ? attribute_container_device_trigger+0xc4/0xe0 [<ffffffff8134f659>] scsi_sysfs_add_sdev+0x89/0x2c0 [<ffffffff8134d096>] scsi_probe_and_add_lun+0xea6/0xed0 [<ffffffff8134beb2>] ? scsi_alloc_target+0x292/0x2d0 [<ffffffff8134d1e1>] __scsi_scan_target+0x121/0x750 [<ffffffff811df806>] ? sysfs_create_file+0x26/0x30 [<ffffffff8132b759>] ? device_create_file+0x19/0x20 [<ffffffff81332838>] ? attribute_container_add_attrs+0x78/0x90 [<ffffffff814b008c>] ? klist_next+0x4c/0xf0 [<ffffffff81332e30>] ? transport_configure+0x0/0x20 [<ffffffff813329e4>] ? attribute_container_device_trigger+0xc4/0xe0 [<ffffffff8134df40>] scsi_scan_target+0xd0/0xe0 [<ffffffffa02f053a>] srp_create_target+0x75a/0x890 [ib_srp] [<ffffffff8132a130>] dev_attr_store+0x20/0x30 [<ffffffff811df145>] sysfs_write_file+0xe5/0x170 [<ffffffff8116c818>] vfs_write+0xb8/0x1a0 [<ffffffff810d40a2>] ? audit_syscall_entry+0x272/0x2a0 [<ffffffff8116d251>] sys_write+0x51/0x90 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b Signed-off-by: Babu Moger <babu.moger@netapp.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-21asm-generic: architecture independent readq/writeq for 32bit environmentHitoshi Mitake
This provides unified readq()/writeq() helper functions for 32-bit drivers. For some cases, readq/writeq without atomicity is harmful, and order of io access has to be specified explicitly. So in this patch, new two header files which contain non-atomic readq/writeq are added. - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with the order of lower address -> higher address - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with reversed order This allows us to remove some readq()s that were added drivers when the default non-atomic ones were removed in commit dbee8a0affd5 ("x86: remove 32-bit versions of readq()/writeq()") The drivers which need readq/writeq but can do with the non-atomic ones must add the line: #include <asm-generic/io-64-nonatomic-lo-hi.h> /* or hi-lo.h */ But this will be nop in 64-bit environments, and no other #ifdefs are required. So I believe that this patch can solve the problem of 1. driver-specific readq/writeq 2. atomicity and order of io access This patch is tested with building allyesconfig and allmodconfig as ARCH=x86 and ARCH=i386 on top of tip/master. Cc: Kashyap Desai <Kashyap.Desai@lsi.com> Cc: Len Brown <lenb@kernel.org> Cc: Ravi Anand <ravi.anand@qlogic.com> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: James Bottomley <James.Bottomley@parallels.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Roland Dreier <roland@purestorage.com> Cc: James Bottomley <jbottomley@parallels.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-21scsi: Fix typo in pmcraid.hMasanari Iida
Correct spelling "thresold" to "threshold" in drivers/scsi/pmraid.h Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-20bnx2fc: HSI dependent changes for 7.2.xx FWBhanu Prakash Gollapudi
with Tx only section for single cached SGEs. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-19[SCSI] libsas: async ata-ehDan Williams
Once sas_ata_hard_reset() starts honoring the 'deadline' parameter a pathological configuration could take 25 seconds per ata device (serialized) to recover. Run per-port recoveries in parallel. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: add mutex for SMP task executionJeff Skirvin
SAS does not tag SMP requests, and at least one lldd (isci) does not permit more than one in-flight request at a time. [jejb: fix sas_init_dev tab issues while we're at it] Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: Remove redundant phy state notification calls.Jeff Skirvin
In the case of an explicit sas_phy_enable call to disable a phy, the LLDD provides the calls to sas_phy_disconnected and the PHYE_LOSS_OF_SIGNAL event. NOTE: This assumes that the lldd(s) generate the notification, which appears to be the case, but only verfied on isci. Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: sas_phy_enable via transport_sas_phy_resetDan Williams
Execute the link-reset triggered by sas_phy_enable via transport_sas_phy_reset so that it can be managed by libata. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: execute transport link resets with libata-eh via host workqueueDan Williams
Link resets leave ata affiliations intact, so arrange for libsas to make an effort to avoid dropping the device due to a slow-to-recover link. Towards this end carry out reset in the host workqueue so that it can check for ata devices and kick the reset request to libata. Hard resets, in contrast, bypass libata since they are meant for associating an ata device with another initiator in the domain (tears down affiliations). Need to add a new transport_sas_phy_reset() since the current sas_phy_reset() is a utility function to libsas lldds. They are not prepared for it to loop back into eh. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: perform sas-transport resets in shost->workq contextDan Williams
Extend the sas transport class to allow transport users to attach extra data to a sas_phy (->hostdata). Use this area in libsas to move resets to workq context in preparation for scheduling ata device resets through libata-eh. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: use libata-eh-reset for sata rediscovery fis transmit failuresDan Williams
Since sata devices can take several seconds to recover the link on reset the 0.5 seconds that libsas currently waits may not be enough. Instead if we are rediscovering a phy that was previously attached to a sata device let libata handle any resets to encourage the device to transmit the initial fis. Once sas_ata_hard_reset() and lldds learn how to honor 'deadline' libsas should stop encountering phys in an intermediate state, until then this will loop until the fis is transmitted or ->attached_sas_addr gets cleared, but in the more likely initial discovery case we keep existing behavior. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: defer SAS_TASK_NEED_DEV_RESET commands to libataDan Williams
lldds use the SAS_TASK_NEED_DEV_RESET interface to request that eh perform a reset. In the sata device case defer the commands that triggered the reset to libata-eh context so it can perform its pre and post reset management. In the sas_ata_post_internal() case the reset request is falling on deaf ears as the sas_task is immediately destroyed without any reset action. Since it is currently a nop, and likely superfluous given the conversion to new-style libata-eh, just drop the request. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: let libata handle command timeoutsDan Williams
libsas-eh if it successfully aborts an ata command will hide the timeout condition (AC_ERR_TIMEOUT) from libata. The command likely completes with the all-zero task->task_status it started with. Instead, interpret a TMF_RESP_FUNC_COMPLETE as the end of the sas_task but keep the scmd around for libata-eh to handle. Tested-by: Andrzej Jakowski <andrzej.jakowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: fix timeout vs completion raceDan Williams
Until we have told the lldd to forget a task a timed out operation can return from the hardware at any time. Since completion frees the task we need to make sure that no tasks run their normal completion handler once eh has decided to manage the task. Similar to ata_scsi_cmd_error_handler() freeze completions to let eh judge the outcome of the race. Task collector mode is problematic because it presents a situation where a task can be timed out and aborted before the lldd has even seen it. For this case we need to guarantee that a task that an lldd has been told to forget does not get queued after the lldd says "never seen it". With sas_scsi_timed_out we achieve this with the ->task_queue_flush mutex, rather than adding more time. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: prevent double completion of scmds from ehDan Williams
We invoke task->task_done() to free the task in the eh case, but at this point we are prepared for scsi_eh_flush_done_q() to finish off the scmd. Introduce sas_end_task() to capture the final response status from the lldd and free the task. Also take the opportunity to kill this warning. drivers/scsi/libsas/sas_scsi_host.c: In function ‘sas_end_task’: drivers/scsi/libsas/sas_scsi_host.c:102:3: warning: case value ‘2’ not in enumerated type ‘enum exec_status’ [-Wswitch] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: close error handling vs sas_ata_task_done() raceDan Williams
Since sas_ata does not implement ->freeze(), completions for scmds and internal commands can still arrive concurrent with ata_scsi_cmd_error_handler() and sas_ata_post_internal() respectively. By the time either of those is called libata has committed to completing the qc, and the ATA_PFLAG_FROZEN flag tells sas_ata_task_done() it has lost the race. In the sas_ata_post_internal() case we take on the additional responsibility of freeing the sas_task to close the race with sas_ata_task_done() freeing the the task while sas_ata_post_internal() is in the process of invoking ->lldd_abort_task(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: kill invocation of scsi_eh_finish_cmd from sas_ata_task_doneDan Williams
Prior to the conversion to the new-style libata-eh sas_ata_task_done() may have been the last opportunity to clean up the scmd, but now libata-eh explicitly handles this case. It also races against sas-eh. If a lldd completes a task after SAS_TASK_STATE_ABORTED is set it could trigger a spurious decrement of shost->host_failed. Current lldds have the band-aid of checking SAS_TASK_STATE_ABORTED before calling ->task_done(), but better to just let the scmds escalate to libata for race free cleanup. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: use ->set_dmamode to notify lldds of NCQ parametersDan Williams
sas_discover_sata() notifies lldds of sata devices twice. Once to allow the 'identify' to be sent, and a second time to allow aic94xx (the only libsas driver that cares about sata_dev.identify) to setup NCQ parameters before the device becomes known to the midlayer. Replace this double notification and intervening 'identify' with an explicit ->lldd_ata_set_dmamode notification. With this change all ata internal commands are issued by libata, so we no longer need sas_issue_ata_cmd(). The data from the identify command only needs to be cached in one location so ata_device.id replaces domain_device.sata_dev.identify. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: prevent domain rediscovery competing with ata error handlingDan Williams
libata error handling provides for a timeout for link recovery. libsas must not rescan for previously known devices in this interval otherwise it may remove a device that is simply waiting for its link to recover. Let libata-eh make the determination of when the link is stable and prevent libsas (host workqueue) from taking action while this determination is pending. Using a mutex (ha->disco_mutex) to flush and disable revalidation while eh is running requires any discovery action that may block on eh be moved to its own context outside the lock. Probing ATA devices explicitly waits on ata-eh and the cache-flush-io issued during device removal may also pend awaiting eh completion. Essentially any rphy add/remove activity needs to run outside the lock. This adds two new cleanup states for sas_unregister_domain_devices() 'allocated-but-not-probed', and 'flagged-for-destruction'. In the 'allocated-but-not-probed' state dev->rphy points to a rphy that is known to have not been through a sas_rphy_add() event. At domain teardown check if this device is still pending probe and cleanup accordingly. Similarly if a device has already been queued for removal then sas_unregister_domain_devices has nothing to do. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: convert dev->gone to flagsDan Williams
In preparation for adding tracking of another device state "destroy". Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: remove ata_port.lock management duties from llddsDan Williams
Each libsas driver (mvsas, pm8001, and isci) has invented a different method for managing the ap->lock. The lock is held by the ata ->queuecommand() path. mvsas drops it prior to acquiring any internal locks which allows it to hold its internal lock across calls to task->task_done(). This capability is important as it is the only way the driver can flush task->task_done() instances to guarantee that it no longer has any in-flight references to a domain_device at ->lldd_dev_gone() time. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: introduce sas_drain_work()Dan Williams
When an lldd invokes ->notify_port_event() it can trigger a chain of libsas events to: 1/ form the port and find the direct attached device 2/ if the attached device is an expander perform domain discovery A call to flush_workqueue() will only flush the initial port formation work. Currently libsas users need to call scsi_flush_work() up to the max depth of chain (which will grow from 2 to 3 when ata discovery is moved to its own discovery event). Instead of open coding multiple calls switch to use drain_workqueue() to flush sas work. drain_workqueue() does not handle new work submitted during the drain so libsas needs a bit of infrastructure to hold off unchained work submissions while a drain is in flight. A lldd ->notify() event is considered 'unchained' while a sas_discover_event() is 'chained'. As Tejun notes: "For now, I think it would be best to add private wrapper in libsas to support deferring unchained work items while draining." Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: convert ha->state to flagsDan Williams
In preparation for adding new states (SAS_HA_DRAINING, SAS_HA_FROZEN), convert ha->state into a set of flags. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: replace event locks with atomic bitopsDan Williams
The locks only served to make sure the pending event bitmask was updated consistently. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: fix leak of dev->sata_dev.identify_[packet_]deviceDan Williams
These are never freed in the nominal path. A domain_device has a different lifetime than a sas_rphy we need a dev->rphy independent way of identifying sata devices. Reviewed-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: fix domain_device leakDan Williams
Arrange for the deallocation of a struct domain_device object when it no longer has: 1/ any children 2/ references by any scsi_targets 3/ references by a lldd The comment about domain_device lifetime in Documentation/scsi/libsas.txt is stale as it appears mainline never had a version of a struct domain_device that was registered as a kobject. We now manage domain_device reference counts on behalf of external agents. Reviewed-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: kill sas_slave_destroyDan Williams
Per commit 3e4ec344 "libata: kill ATA_FLAG_DISABLED" needing to set ATA_DEV_NONE is a holdover from before libsas converted to the "new-style" ata-eh. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: remove unused ata_task_resp fieldsDan Williams
Commit 1e34c838 "[SCSI] libsas: remove spurious sata control register read/write" removed the routines to fake the presence of the sata control registers, now remove the unused data structure fields to kill any remaining confusion. Acked-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] Handle disk devices which can not process medium access commandsMartin K. Petersen
We have experienced several devices which fail in a fashion we do not currently handle gracefully in SCSI. After a failure these devices will respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.) but any command accessing the storage medium will time out. The following patch adds an callback that can be used by upper level drivers to inspect the results of an error handling command. This in turn has been used to implement additional checking in the SCSI disk driver. If a medium access command fails twice but TEST UNIT READY succeeds both times in the subsequent error handling we will offline the device. The maximum number of failed commands required to take a device offline can be tweaked in sysfs. Also add a new error flag to scsi_debug which allows this scenario to be easily reproduced. [jejb: fix up integer parsing to use kstrtouint] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] mpt2sas: spell "primitive" correctly in function prototypeAndrew Morton
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] virtio-scsi: SCSI driver for QEMU based virtual machinesPaolo Bonzini
The virtio-scsi HBA is the basis of an alternative storage stack for QEMU-based virtual machines (including KVM). Compared to virtio-blk it is more scalable, because it supports many LUNs on a single PCI slot), more powerful (it more easily supports passthrough of host devices to the guest) and more easily extensible (new SCSI features implemented by QEMU should not require updating the driver in the guest). Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] hpsa: add some older controllers to the kdump blacklistTomas Henzl
Some other older controllers also do have problems to perform a kdump. Adding controllers to this list means that the driver will signal this non-ability via a resettable flag correctly. The unsupported list was created after a consultation with HP. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent ↵Mike Snitzer
TARGET_ERROR Permanent target failures are non-retryable and should be classified as TARGET_ERROR; otherwise dm-multipath will retry an IO request that will always fail at the target. A SCSI command that fails with ILLEGAL_REQUEST sense and Additional sense 0x20, 0x21, 0x24 or 0x26 represents a permanent TARGET_ERROR. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] sd: Make sure provisioning mode is reported correctlyMartin K. Petersen
The provisioning_mode parameter in sysfs did not get updated in the SD_LBP_DISABLE case. Make sure the provisioning mode is always set correctly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] Ensure discard failure gets treated as a target problemMartin K. Petersen
The error reported up the stack for a discard failure did not clearly indicate that the command was processed and subsequently failed by the target device. Return -EREMOTEIO so multipathing does not classify this condition as a path failure. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] mpt2sas: add missing allocation checkTomas Henzl
The __get_free_pages can fail, so the return value should be checked. Spotted thanks to Stanislaw. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] qla4xxx: Update driver version to 5.02.00-k14Vikas Chaudhary
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>