git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-10-27	scsi: target: core: Add support for RSOC command	Dmitry Bogdanov
	Add support for REPORT SUPPORTED OPERATION CODES command according to SPC4. Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Link: https://lore.kernel.org/r/20220906103421.22348-2-d.bogdanov@yadro.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Fix a deadlock between PM and the SCSI error handler	Bart Van Assche
	The following deadlock has been observed on multiple test setups: * ufshcd_wl_suspend() is waiting for blk_execute_rq(START STOP UNIT) to complete while ufshcd_wl_suspend() holds host_sem. * The SCSI error handler is activated, changes the host state to SHOST_RECOVERY, ufshcd_eh_host_reset_handler() and ufshcd_err_handler() are called and the latter function tries to obtain host_sem. This is a deadlock because blk_execute_rq() can't execute SCSI commands while the host is in the SHOST_RECOVERY state and because the error handler cannot make progress because host_sem is held by another thread. Fix this deadlock as follows: * Fail attempts to suspend the system while the SCSI error handler is in progress by setting the SCMD_FAIL_IF_RECOVERING flag for START STOP UNIT commands. * If the system is suspending and a START STOP UNIT command times out, handle the SCSI command timeout from inside the context of the SCSI timeout handler instead of activating the SCSI error handler. The runtime power management code is not affected by this deadlock since hba->host_sem is not touched by the runtime power management functions in the UFS driver. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-11-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Introduce the function ufshcd_execute_start_stop()	Bart Van Assche
	Open-code scsi_execute() because a later patch will modify scmd->flags and because scsi_execute() does not support setting scmd->flags. No functionality is changed. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-10-bvanassche@acm.org Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Track system suspend / resume activity	Bart Van Assche
	Add a new boolean variable that tracks whether the system is suspending, suspended or resuming. This information will be used in a later commit to fix a deadlock between the SCSI error handler and the suspend code. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-9-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Try harder to change the power mode	Bart Van Assche
	Instead of only retrying the START STOP UNIT command if a unit attention is reported, repeat it if any SCSI error is reported by the device or if the command timed out. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-8-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Reduce the START STOP UNIT timeout	Bart Van Assche
	Reduce the START STOP UNIT command timeout to one second since on Android devices a kernel panic is triggered if an attempt to suspend the system takes more than 20 seconds. One second should be enough for the START STOP UNIT command since this command completes in less than a millisecond for the UFS devices I have access to. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-7-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode()	Bart Van Assche
	Convert if (ret) { ... } if (!ret) { ... } into if (ret) { ... } else { ... }. Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-6-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: ufs: Remove an outdated comment	Bart Van Assche
	Although the host lock had to be held by ufshcd_clk_scaling_start_busy() callers when that function was introduced, that is no longer the case today. Hence remove the comment that claims that callers of this function must hold the host lock. Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: core: Support failing requests while recovering	Bart Van Assche
	The current behavior for SCSI commands submitted while error recovery is ongoing is to retry command submission after error recovery has finished. See also the scsi_host_in_recovery() check in scsi_host_queue_ready(). Add support for failing SCSI commands while host recovery is in progress. This functionality will be used to fix a deadlock in the UFS driver. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-4-bvanassche@acm.org Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: core: Change the return type of .eh_timed_out()	Bart Van Assche
	Commit 6600593cbd93 ("block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE") made it impossible for .eh_timed_out() implementations to call scsi_done() without causing a crash. Restore support for SCSI timeout handlers to call scsi_done() as follows: * Change all .eh_timed_out() handlers as follows: - Change the return type into enum scsi_timeout_action. - Change BLK_EH_RESET_TIMER into SCSI_EH_RESET_TIMER. - Change BLK_EH_DONE into SCSI_EH_NOT_HANDLED. * In scsi_timeout(), convert the SCSI_EH_* values into BLK_EH_* values. Reviewed-by: Lee Duncan <lduncan@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-3-bvanassche@acm.org Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: core: Fix a race between scsi_done() and scsi_timeout()	Bart Van Assche
	If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 15f73f5b3e59 ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: lpfc: Update lpfc version to 14.2.0.8	Justin Tee
	Update lpfc version to 14.2.0.8 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: lpfc: Create a sysfs entry called lpfc_xcvr_data for transceiver info	Justin Tee
	The DUMP_MEMORY mailbox command is implemented for page A0 and A2 to retrieve transceiver information from firmware. The mailbox command output is then formatted to print raw data values for userspace to parse via sysfs. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: lpfc: Log when congestion management limits are in effect	Justin Tee
	When bandwidth reduces from or recovers back to 100% due to congestion management, log the event. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: lpfc: Fix hard lockup when reading the rx_monitor from debugfs	Justin Tee
	During I/O and simultaneous cat of /sys/kernel/debug/lpfc/fnX/rx_monitor, a hard lockup similar to the call trace below may occur. The spin_lock_bh in lpfc_rx_monitor_report is not protecting from timer interrupts as expected, so change the strength of the spin lock to _irq. Kernel panic - not syncing: Hard LOCKUP CPU: 3 PID: 110402 Comm: cat Kdump: loaded exception RIP: native_queued_spin_lock_slowpath+91 [IRQ stack] native_queued_spin_lock_slowpath at ffffffffb814e30b _raw_spin_lock at ffffffffb89a667a lpfc_rx_monitor_record at ffffffffc0a73a36 [lpfc] lpfc_cmf_timer at ffffffffc0abbc67 [lpfc] __hrtimer_run_queues at ffffffffb8184250 hrtimer_interrupt at ffffffffb8184ab0 smp_apic_timer_interrupt at ffffffffb8a026ba apic_timer_interrupt at ffffffffb8a01c4f [End of IRQ stack] apic_timer_interrupt at ffffffffb8a01c4f lpfc_rx_monitor_report at ffffffffc0a73c80 [lpfc] lpfc_rx_monitor_read at ffffffffc0addde1 [lpfc] full_proxy_read at ffffffffb83e7fc3 vfs_read at ffffffffb833fe71 ksys_read at ffffffffb83402af do_syscall_64 at ffffffffb800430b entry_SYSCALL_64_after_hwframe at ffffffffb8a000ad Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: lpfc: Set sli4_param's cmf option to zero when CMF is turned off	Justin Tee
	Add missed clearing of phba->sli4_hba.pc_sli4_params.cmf when CMF is turned off. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: qedf: Remove set but unused variable 'page'	Jiapeng Chong
	The variable page is not used in the function, so delete it. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2348 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221009060249.40178-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: pm80xx: Remove unused reset_in_progress flag logic	Igor Pylypiv
	The reset_in_progress flag was never set. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20221007230751.309363-1-ipylypiv@google.com Reviewed-by: Andrew Konecki <awkonecki@google.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: mvsas: Use sas_task_find_rq() for tagging	John Garry
	The request associated with a SCSI command coming from the block layer has a unique tag, so use that when possible for getting a slot. Unfortunately we don't support reserved commands in the SCSI midlayer yet. As such, SMP tasks - as an example - will not have a request associated, so in the interim continue to manage those tags for that type of sas_task internally. We reserve an arbitrary 4 tags for these internal tags. Indeed, we already decrement MVS_RSVD_SLOTS by 2 for the shost can_queue when flag MVF_FLAG_SOC is set. This change was made in commit 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes"), but what those 2 slots are used for is not obvious. Also make the tag management functions static, where possible. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-8-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: mvsas: Delete mvs_tag_init()	John Garry
	All mvs_tag_init() does is zero the tag bitmap, but this is already done with the kzalloc() call to alloc the tags, so delete this unneeded function. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-7-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: pm8001: Use sas_task_find_rq() for tagging	John Garry
	The request associated with a SCSI command coming from the block layer has a unique tag, so use that when possible for getting a CCB. Unfortunately we don't support reserved commands in the SCSI midlayer yet, so in the interim continue to manage those tags internally (along with tags for private commands). Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-6-git-send-email-john.garry@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: pm8001: Remove pm8001_tag_init()	Igor Pylypiv
	In commit 5a141315ed7c ("scsi: pm80xx: Increase the number of outstanding I/O supported to 1024") the pm8001_ha->tags allocation was moved into pm8001_init_ccb_tag(). This changed the execution order of allocation. pm8001_tag_init() used to be called after the pm8001_ha->tags allocation and now it is called before the allocation. Before: pm8001_pci_probe() `--> pm8001_pci_alloc() `--> pm8001_alloc() `--> pm8001_ha->tags = kzalloc(...) `--> pm8001_tag_init(pm8001_ha); // OK: tags are allocated After: pm8001_pci_probe() `--> pm8001_pci_alloc() \| `--> pm8001_alloc() \| `--> pm8001_tag_init(pm8001_ha); // NOK: tags are not allocated \| `--> pm8001_init_ccb_tag() `--> pm8001_ha->tags = kzalloc(...) // today it is bitmap_zalloc() Since pm8001_ha->tags_num is zero when pm8001_tag_init() is called it does nothing. Tags memory is allocated with bitmap_zalloc() so there is no need to manually clear each bit with pm8001_tag_free(). Reviewed-by: Changyuan Lyu <changyuanl@google.com> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-5-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: hisi_sas: Put reserved tags in lower region of tagset	John Garry
	To be consistent with blk-mq, put the reserved tags in the lower region of the tagset. Eventually we hope to get rid of all this reserved tag management. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-4-git-send-email-john.garry@huawei.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: hisi_sas: Use sas_task_find_rq()	John Garry
	Use sas_task_find_rq() to lookup the request per task for its driver tag. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-3-git-send-email-john.garry@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: libsas: Add sas_task_find_rq()	John Garry
	blk-mq already provides a unique tag per request. Some libsas LLDDs - like hisi_sas - already use this tag as the unique per-I/O HW tag. Add a common function to provide the request associated with a sas_task for all libsas LLDDs. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1666091763-11023-2-git-send-email-john.garry@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-22	scsi: target: Remove the unused function transport_lba_64_ext()	Jiapeng Chong
	The function transport_lba_64_ext() is defined in the target_core_sbc.c file, but not called elsewhere, so remove this unused function. drivers/target/target_core_sbc.c:276:34: warning: unused function 'transport_lba_64_ext'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2427 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221018081235.124662-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Use sas_phy_match_port_addr() instead of open coding it	Jason Yan
	The SAS address comparison of asd_sas_port and expander phy is open coded. Replace it with sas_phy_match_port_addr(). Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-9-yanaijie@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Use sas_phy_addr_match() instead of open coding it	Jason Yan
	The SAS address comparison of expander phys is open coded. Replace it with sas_phy_addr_match(). Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-8-yanaijie@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Use sas_phy_match_dev_addr() instead of open coding it	Jason Yan
	The SAS address comparison of domain device and expander phy is open coded. Replace it with sas_phy_match_dev_addr(). Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-7-yanaijie@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: hisi_sas: Use sas_find_attathed_phy_id() instead of open coding it	Jason Yan
	The attached phy finding is open coded. Replace it with sas_find_attached_phy_id(). To keep things consistent, the return value of hisi_sas_dev_found() is also changed to -ENODEV after calling sas_find_attathed_phy_id() failed. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-6-yanaijie@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: mvsas: Use sas_find_attached_phy_id() instead of open coding it	Jason Yan
	The attached phy finding is open coded. Replace it with sas_find_attached_phy_id(). To keep things consistent, the return value of mvs_dev_found_notify() is also changed to -ENODEV after calling sas_find_attathed_phy_id() failed. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-5-yanaijie@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: pm8001: Use sas_find_attached_phy_id() instead of open coding it	Jason Yan
	The attached phy id finding is open coded. Replace it with sas_find_attached_phy_id(). To keep things consistent, the return value of pm8001_dev_found_notify() is also changed to -ENODEV after calling sas_find_attathed_phy_id() failed. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-4-yanaijie@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Introduce sas_find_attached_phy_id() helper	Jason Yan
	LLDDs are all implementing their own attached phy ID finding code. Factor it out to libsas. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-3-yanaijie@huawei.com Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Introduce SAS address comparison helpers	Jason Yan
	SAS address comparison is widely used in libsas. However they are all opencoded and to avoid the line spill over 80 columns, are mostly split into multi-lines. Introduce some helpers to prepare for some refactoring. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20220928070130.3657183-2-yanaijie@huawei.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: core: Release SCSI devices synchronously	Bart Van Assche
	All upstream scsi_device_put() calls happen from thread context. Hence simplify scsi_device_put() by always calling the release function synchronously. This commit prepares for constifying the SCSI host template by removing an assignment that clears the module pointer in the SCSI host template. scsi_device_dev_release_usercontext() was introduced in 2006 via commit 65110b216895 ("[SCSI] fix wrong context bugs in SCSI"). Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-9-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: core: Remove the put_device() call from scsi_device_get()	Bart Van Assche
	scsi_device_get() may be called from atomic context, e.g. by shost_for_each_device(). A later commit will allow put_device() to sleep for SCSI devices. Hence remove the put_device() call from scsi_device_get(). According to Rusty Russell's "Module Refcount and Stuff mini-FAQ", calling module_put() from atomic context is allowed since considerable time. See also https://lkml.org/lkml/2002/11/18/330. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-8-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: ufs: Simplify ufshcd_set_dev_pwr_mode()	Bart Van Assche
	Simplify the code for incrementing the SCSI device reference count in ufshcd_set_dev_pwr_mode(). This commit removes one scsi_device_put() call that happens from atomic context. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Avri Altman <avri.altman@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-7-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: core: Rework scsi_single_lun_run()	Bart Van Assche
	Use __starget_for_each_device() instead of open-coding starget_for_each_device(). Run the queues asynchronously instead of synchronously. This commit removes code that calls scsi_device_put() from atomic context. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-6-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: core: Introduce a new list for SCSI proc directory entries	Bart Van Assche
	Instead of using scsi_host_template members to track the SCSI proc directory entries, track these entries in a list. This changes the time needed for looking up the proc dir pointer from O(1) into O(n). This is considered acceptable since the number of SCSI host adapter types per host is usually small (less than ten). This change has been tested by attaching two USB storage devices to a qemu host: $ grep -aH . /proc/scsi/usb-storage/* /proc/scsi/usb-storage/7: Host scsi7: usb-storage /proc/scsi/usb-storage/7: Vendor: QEMU /proc/scsi/usb-storage/7: Product: QEMU USB HARDDRIVE /proc/scsi/usb-storage/7:Serial Number: 1-0000:00:02.1:00.0-6 /proc/scsi/usb-storage/7: Protocol: Transparent SCSI /proc/scsi/usb-storage/7: Transport: Bulk /proc/scsi/usb-storage/7: Quirks: SANE_SENSE /proc/scsi/usb-storage/8: Host scsi8: usb-storage /proc/scsi/usb-storage/8: Vendor: QEMU /proc/scsi/usb-storage/8: Product: QEMU USB HARDDRIVE /proc/scsi/usb-storage/8:Serial Number: 1-0000:00:02.1:00.0-7 /proc/scsi/usb-storage/8: Protocol: Transparent SCSI /proc/scsi/usb-storage/8: Transport: Bulk /proc/scsi/usb-storage/8: Quirks: SANE_SENSE This commit prepares for constifying most SCSI host templates. Reviewed-by: John Garry <john.garry@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: core: Fail host creation if creating the proc directory fails	Bart Van Assche
	Users expect that the contents of /proc/scsi is in sync with the contents of /sys/class/scsi_host. Hence fail host creation if creating the proc directory fails. Suggested-by: John Garry <john.garry@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: esas2r: Introduce scsi_template_proc_dir()	Bart Van Assche
	Prepare for removing the 'proc_dir' and 'present' members from the SCSI host template. This commit does not change any functionality. Reviewed-by: John Garry <john.garry@huawei.com> Cc: Bradley Grove <linuxdrivers@attotech.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-3-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: esas2r: Initialize two host template members implicitly	Bart Van Assche
	Prepare for removing the 'proc_dir' and 'present' members from the SCSI host template by implicitly initializing 'present' and 'emulated' in 'driver_template'. Reviewed-by: John Garry <john.garry@huawei.com> Cc: Bradley Grove <linuxdrivers@attotech.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221015002418.30955-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Update SATA dev FIS in sas_ata_task_done()	John Garry
	In sas_ata_task_done(), for commands which complete with error we set the SATA dev FIS status field with ATA_ERR. In ata_eh_analyze_tf() this would be interpreted as a HSM error. Set ATA_DRDY, which will lead libata to judge as a device error, which is a safer bet. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-9-git-send-email-john.garry@huawei.com Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private	John Garry
	We have no users outside libsas any longer, so make sas_alloc_task(), sas_alloc_slow_task(), and sas_free_task() private. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-8-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors	John Garry
	In commit c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes") the driver had support added to handle NCQ errors but much of what is done in this handling is duplicated from the libata EH. In that named commit we handle in 2x main steps: a. Issue read log ext10 to examine and clear the errors b. Issue SATA_ABORT all command Indeed, in libata EH, we do similar to above: a. ata_do_eh() -> ata_eh_autopsy() -> ata_eh_link_autopsy() -> ata_eh_analyze_ncq_error() -> ata_eh_read_log_10h() b. ata_do_eh() -> ata_eh_recover() which will issue a device soft reset or hard reset Since there is so much duplication, use sas_ata_device_link_abort() which will abort all pending IOs and kick of ATA EH which will do the steps, above. However we will not follow the advisory to send the SATA_ABORT all command after the autopsy in read log ext10. Indeed, in libsas EH, we already send a per-task SATA_ABORT command, and this is prior to the ATA EH kicking in and issuing the read log ext10 in the recovery process. I judge that this is ok as the SATA_ABORT command does not actually send any protocol on the link to abort I/O on the other side, so would not change any state on the disk (for the read log ext10 command). Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-7-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: pm8001: Modify task abort handling for SATA task	John Garry
	When we try to abort a SATA task, the CCB of the task which we are trying to avoid may still complete. In this case, we should not touch the task associated with that CCB as we can race with libsas freeing the last later in sas_eh_handle_sas_errors() -> sas_eh_finish_cmd() for when TASK_IS_ABORTED is returned from sas_scsi_find_task() Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-6-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: hisi_sas: Modify v3 HW SATA disk error state completion processing	Xingui Yang
	When an NCQ error occurs, the controller will abnormally complete the I/Os that are newly delivered to disk, and bit8 in CQ dw3 will be set which indicates that the SATA disk is in error state. The current processing flow is to set ts->stat to SAS_OPEN_REJECT and then sas_ata_task_done() will set FIS stat to ATA_ERR. After analyzing the I/O by ata_eh_analyze_tf(), err_mask will set to AC_ERR_HSM. If media error occurs for four times within 10 minutes and the chip rejects new I/Os for four times, NCQ will be disabled due to excessive errors, which is undesirable. Therefore, use sas_task_abort() to handle abnormally completed I/Os when SATA disk is in error state, as these abnormally completed I/Os are already processed by sas_ata_device_link_abort() and qc->flag are set to ATA_QCFLAG_FAILED. If sas_task_abort() is used, qc->err_mask will not be modified in EH. Unlike the current process flow, it will not increase the count of ECAT_TOUT_HSM and not turn off NCQ. Like other I/Os on the disk that do not have an error but do not return after the NCQ error, they are retried after the EH. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-5-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw	Xingui Yang
	When CQ header dw3 SATA_DISK_ERR is set it means this SATA disk is in error state and the current IPTT is invalid. An invalid IPTT does not correspond to any slot. In this scenario, new I/Os that delivered to disk will be rejected by the controller and all I/Os remaining in the disk should be aborted, which we add here with the sas_ata_device_link_abort() call. In hisi_sas_abort_task() we don't want to issue a soft reset as it may cause info to be lost in the target disk for the ATA EH autopsy. In this case, just release resources - the disk won't return other I/Os normally after NCQ Error, so this is safe. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-4-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()	Xingui Yang
	Each branch currently defines a slot variable independently, and it is neater to move it to the function head. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-3-git-send-email-john.garry@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-10-18	scsi: libsas: Add sas_ata_device_link_abort()	John Garry
	Similar to how AHCI handles NCQ errors in ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort(), add an NCQ error handler for LLDDs to call to initiate a link abort. This will mark all outstanding QCs as failed and kick-off EH. Note: A "force reset" argument is added for drivers which require the ATA error handling to always reset the device. A driver may require this feature for when SATA device per-SCSI cmnd resources are only released during reset for ATA EH. As such, we need an option to force reset to be done, regardless of what any EH autopsy decides. The SATA device FIS fields are set to indicate a device error from ata_eh_analyze_tf(). Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-2-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>