linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-02-05	scsi: lpfc: Remove D_ID swap log message from trace event logger	Justin Tee
	D_ID swaps are common during cable swaps in a SAN. Thus, there's no reason to log the event at a KERN_ERR level with the trace event logger. Change the log level to KERN_INFO and the normal LOG_ELS flag. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240131185112.149731-5-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-05	scsi: lpfc: Use sg_dma_len() API to get struct scatterlist's length	Justin Tee
	The sg_dma_len() API should be used to retrieve a scatterlist's length instead of directly accessing scatterlist->length. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240131185112.149731-4-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-05	scsi: lpfc: Fix possible memory leak in lpfc_rcv_padisc()	Justin Tee
	The call to lpfc_sli4_resume_rpi() in lpfc_rcv_padisc() may return an unsuccessful status. In such cases, the elsiocb is not issued, the completion is not called, and thus the elsiocb resource is leaked. Check return value after calling lpfc_sli4_resume_rpi() and conditionally release the elsiocb resource. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240131185112.149731-3-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-05	scsi: lpfc: Initialize status local variable in lpfc_sli4_repost_sgl_list()	Justin Tee
	A static code analyzer tool indicates that the local variable called status in the lpfc_sli4_repost_sgl_list() routine could be used to print garbage uninitialized values in the routine's log message. Fix by initializing to zero. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240131185112.149731-2-justintee8345@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-05	scsi: lpfc: Use unsigned type for num_sge	Hannes Reinecke
	LUNs going into "failed ready running" state observed on >1T and on even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when DIF is enabled at the host. The kernel logs: Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0 The host lpfc driver is failing to setup scatter/gather list (protection data) for the I/Os. The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes the compiler to remove the most significant bit. Use an unsigned type instead. Signed-off-by: Hannes Reinecke <hare@suse.de> [dwagner: added commit message] Signed-off-by: Daniel Wagner <dwagner@suse.de> Link: https://lore.kernel.org/r/20231220162658.12392-1-dwagner@suse.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-02-05	scsi: core: Move scsi_host_busy() out of host lock if it is for per-command	Ming Lei
	Commit 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") intended to fix a hard lockup issue triggered by EH. The core idea was to move scsi_host_busy() out of the host lock when processing individual commands for EH. However, a suggested style change inadvertently caused scsi_host_busy() to remain under the host lock. Fix this by calling scsi_host_busy() outside the lock. Fixes: 4373534a9850 ("scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler") Cc: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240203024521.2006455-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	Merge patch series "scsi: Allow scsi_execute users to request retries"	Martin K. Petersen
	Mike Christie <michael.christie@oracle.com> says: The following patches were made over Linus's tree which contains a fix for sd which was not in Martin's branches. The patches allow scsi_execute_cmd users to have scsi-ml retry the cmd for it instead of the caller having to parse the error and loop itself. Link: https://lore.kernel.org/r/20240123002220.129141-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Add kunit tests for scsi_check_passthrough()	Mike Christie
	Add some kunit tests for scsi_check_passthrough() so we can easily make sure we are hitting the cases for which it's difficult to replicate in hardware or even scsi_debug. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-20-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sr: Have midlayer retry get_sectorsize() errors	Mike Christie
	This has get_sectorsize() have the SCSI midlayer retry errors instead of driving them itself. There is one behavior change where we no longer retry when scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry for failures like the queue being removed, and for the case where there are no tags/reqs the block layer waits/retries for us. For possible memory allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying will probably not help. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-18-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: ses: Have midlayer retry scsi_execute_cmd() errors	Mike Christie
	This has ses have the SCSI midlayer retry scsi_execute_cmd() errors instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-17-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sd: Have midlayer retry read_capacity_10() errors	Mike Christie
	This has read_capacity_10() have the SCSI midlayer retry errors instead of driving them itself. There are 2 behavior changes with this patch: 1. There is one behavior change where we no longer retry when scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry for failures like the queue being removed, and for the case where there are no tags/reqs since the block layer waits/retries for us. For possible memory allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying will probably not help. 2. For the specific UAs we checked for and retried, we would get READ_CAPACITY_RETRIES_ON_RESET retries plus whatever retries were left from the main loop's retries. Each UA now gets READ_CAPACITY_RETRIES_ON_RESET retries, and the other errors get up to 3 retries. This is most likely ok, because READ_CAPACITY_RETRIES_ON_RESET is already 10 and is not based on anything specific like a spec or device, so the extra 3 we got from the main loop was probably just an accident and is not going to help. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-16-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sd: Have pr commands retry UAs	Mike Christie
	It's common to get a UA when doing PR commands. It could be due to a target restarting, transport level relogin or other PR commands like a release causing it. The upper layers don't get the sense and in some cases have no idea if it's a SCSI device, so this has the sd layer retry. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-15-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Have SCSI midlayer retry scsi_report_lun_scan() errors	Mike Christie
	This has scsi_report_lun_scan() have the SCSI midlayer retry errors instead of driving them itself. There is one behavior change where we no longer retry when scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry for failures like the queue being removed, and for the case where there are no tags/reqs the block layer waits/retries for us. For possible memory allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying will probably not help. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-14-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Have midlayer retry scsi_mode_sense() UAs	Mike Christie
	This has scsi_mode_sense() have the SCSI midlayer retry UAs instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-13-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: ch: Have midlayer retry ch_do_scsi() UAs	Mike Christie
	This has ch_do_scsi() have the SCSI midlayer retry UAs instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-12-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: ch: Remove unit_attention	Mike Christie
	unit_attention is not used so remove it. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-11-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sd: Have midlayer retry sd_sync_cache() errors	Mike Christie
	This has sd_sync_cache() have the SCSI midlayer retry errors instead of driving them itself. There is one behavior change where we no longer retry when scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry for failures like the queue being removed, and for the case where there are no tags/reqs the block layer waits/retries for us. For possible memory allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying will probably not help. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-10-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: spi: Have midlayer retry spi_execute() UAs	Mike Christie
	This has spi_execute() have the SCSI midlayer retry UAs instead of driving them. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-9-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: device_handler: rdac: Have midlayer retry send_mode_select() errors	Mike Christie
	This has rdac have the SCSI midlayer retry errors instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-8-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: device_handler: hp_sw: Have midlayer retry scsi_execute_cmd() errors	Mike Christie
	This has hp_sw have the SCSI midlayer retry scsi_execute_cmd() errors instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-7-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sd: Have midlayer retry sd_spinup_disk() errors	Mike Christie
	This simplifies sd_spinup_disk() so the SCSI midlayer retries errors for it. Note that we retried every UA except Medium Not Present and also if scsi_status_is_good() returned failed which would happen for all check conditions. In this patch we use SCMD_FAILURE_STAT_ANY which will trigger for the same conditions as when scsi_status_is_good() returns false and there is status. This will cover all CCs including UAs so there is no explicit failures array entry for UAs except for Medium Not Present which we don't want to retry. There is one behavior change where we no longer retry when scsi_execute_cmd() returns < 0, but we should be ok. We don't need to retry for failures like the queue being removed, and for the case where there are no tags/reqs the block layer waits/retries for us. For possible memory allocation failures from blk_rq_map_kern() we use GFP_NOIO, so retrying will probably not help. We do not handle the outside loop's retries because we want to sleep between tries and we don't support that yet. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-6-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: sd: Use separate buf for START_STOP in sd_spinup_disk()	Mike Christie
	We currently reuse the cmd buffer for the TUR and START_STOP commands which requires us to reset the buffer when retrying. This has us use separate buffers for the 2 commands so we can make them const and I think it makes it easier to handle for retries but does not add too much extra to the stack use. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-5-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Retry INQUIRY after timeout	Mike Christie
	Description from: Martin Wilck <mwilck@suse.com>: The SCSI mid layer doesn't retry commands after DID_TIME_OUT (see scsi_noretry_cmd()). Packet loss in the fabric can cause spurious timeouts during SCSI device probing, causing device probing to fail. This has been observed in FCoE uplink failover tests, for example. This patch fixes the issue by retrying the INQUIRY. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-4-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Have midlayer retry scsi_probe_lun() errors	Mike Christie
	This has scsi_probe_lun() ask the SCSI midlayer to retry UAs instead of driving them itself. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-3-michael.christie@oracle.com Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: core: Allow passthrough to request midlayer retries	Mike Christie
	For passthrough we don't retry any error which we get a check condition for. This results in a lot of callers driving their own retries for all UAs, specific UAs, NOT_READY, specific sense values or any type of failure. This adds the core code to allow passthrough users to specify what errors they want the SCSI midlayer to retry for them. We can then convert users to drop a lot of their sense parsing and retry handling. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20240123002220.129141-2-michael.christie@oracle.com Reviewed-by: John Garry <john.g.garry@oracle.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: pm8001: Convert snprintf() to sysfs_emit()	Li Zhijian
	Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). > ./drivers/scsi/pm8001/pm8001_ctl.c:883:8-16: WARNING: please use sysfs_emit No functional change intended CC: Jack Wang <jinpu.wang@cloud.ionos.com> CC: James E.J. Bottomley <jejb@linux.ibm.com> CC: Martin K. Petersen <martin.petersen@oracle.com> CC: linux-scsi@vger.kernel.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240116045151.3940401-32-lizhijian@fujitsu.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: isci: Convert snprintf() to sysfs_emit()	Li Zhijian
	Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). > ./drivers/scsi/isci/init.c:140:8-16: WARNING: please use sysfs_emit No functional change intended CC: Artur Paszkiewicz <artur.paszkiewicz@intel.com> CC: James E.J. Bottomley <jejb@linux.ibm.com> CC: Martin K. Petersen <martin.petersen@oracle.com> CC: linux-scsi@vger.kernel.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240116045151.3940401-25-lizhijian@fujitsu.com Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: ibmvscsi_tgt: Convert snprintf() to sysfs_emit()	Li Zhijian
	Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). > ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3619:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3625:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3633:8-16: WARNING: please use sysfs_emit No functional change intended CC: Michael Cyr <mikecyr@linux.ibm.com> CC: James E.J. Bottomley <jejb@linux.ibm.com> CC: Martin K. Petersen <martin.petersen@oracle.com> CC: linux-scsi@vger.kernel.org CC: target-devel@vger.kernel.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240116045151.3940401-24-lizhijian@fujitsu.com Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: ibmvscsi: Convert snprintf() to sysfs_emit()	Li Zhijian
	Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). > ./drivers/scsi/ibmvscsi/ibmvfc.c:3483:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi/ibmvfc.c:3493:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi/ibmvfc.c:3503:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi/ibmvfc.c:3513:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi/ibmvfc.c:3522:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/ibmvscsi/ibmvfc.c:3530:8-16: WARNING: please use sysfs_emit No functional change intended CC: Tyrel Datwyler <tyreld@linux.ibm.com> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Nicholas Piggin <npiggin@gmail.com> CC: Christophe Leroy <christophe.leroy@csgroup.eu> CC: Aneesh Kumar K.V <aneesh.kumar@kernel.org> CC: Naveen N. Rao <naveen.n.rao@linux.ibm.com> CC: James E.J. Bottomley <jejb@linux.ibm.com> CC: Martin K. Petersen <martin.petersen@oracle.com> CC: linux-scsi@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240116045151.3940401-23-lizhijian@fujitsu.com Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: fnic: Convert snprintf() to sysfs_emit()	Li Zhijian
	Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). > ./drivers/scsi/fnic/fnic_attrs.c:17:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/fnic/fnic_attrs.c:23:8-16: WARNING: please use sysfs_emit > ./drivers/scsi/fnic/fnic_attrs.c:31:8-16: WARNING: please use sysfs_emit No functional change intended CC: Satish Kharat <satishkh@cisco.com> CC: Sesidhar Baddela <sebaddel@cisco.com> CC: Karan Tilak Kumar <kartilak@cisco.com> CC: James E.J. Bottomley <jejb@linux.ibm.com> CC: Martin K. Petersen <martin.petersen@oracle.com> CC: linux-scsi@vger.kernel.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20240116045151.3940401-20-lizhijian@fujitsu.com Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: aacraid: aachba: Replace snprintf() with the safer scnprintf() variant	Lee Jones
	There is a general misunderstanding amongst engineers that {v}snprintf() returns the length of the data actually encoded into the destination array. However, as per the C99 standard {v}snprintf() really returns the length of the data that would have been written if there were enough space for it. This misunderstanding has led to buffer-overruns in the past. It's generally considered safer to use the {v}scnprintf() variants in their place (or even sprintf() in simple cases). So let's do that. Link: https://lwn.net/Articles/69419/ Link: https://github.com/KSPP/linux/issues/105 Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com> Cc: PMC-Sierra, Inc <aacraid@pmc-sierra.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20240111131732.1815560-6-lee@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: 53c700: Remove snprintf() from sysfs call-backs and replace with ↵	Lee Jones
	sysfs_emit() Since snprintf() has the documented, but still rather strange trait of returning the length of the data that would have been written to the array if space were available, rather than the arguably more useful length of data actually written, it is usually considered wise to use something else instead in order to avoid confusion. In the case of sysfs call-backs, new wrappers exist that do just that. [mkp: removed unrelated whitespace cleanups] Link: https://lwn.net/Articles/69419/ Link: https://github.com/KSPP/linux/issues/105 Cc: Richard Hirst <rhirst@linuxcare.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20240111131732.1815560-5-lee@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: 3w-xxxx: Remove snprintf() from sysfs call-backs and replace with ↵	Lee Jones
	sysfs_emit() Since snprintf() has the documented, but still rather strange trait of returning the length of the data that would have been written to the array if space were available, rather than the arguably more useful length of data actually written, it is usually considered wise to use something else instead in order to avoid confusion. In the case of sysfs call-backs, new wrappers exist that do just that. Link: https://lwn.net/Articles/69419/ Link: https://github.com/KSPP/linux/issues/105 Cc: Adam Radford <aradford@gmail.com> Cc: Joel Jacobson <linux@3ware.com> Cc: de Melo <acme@conectiva.com.br> Cc: Andre Hedrick <andre@suse.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20240111131732.1815560-4-lee@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: 3w-sas: Remove snprintf() from sysfs call-backs and replace with ↵	Lee Jones
	sysfs_emit() Since snprintf() has the documented, but still rather strange trait of returning the length of the data that would have been written to the array if space were available, rather than the arguably more useful length of data actually written, it is usually considered wise to use something else instead in order to avoid confusion. In the case of sysfs call-backs, new wrappers exist that do just that. Link: https://lwn.net/Articles/69419/ Link: https://github.com/KSPP/linux/issues/105 Cc: Adam Radford <aradford@gmail.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20240111131732.1815560-3-lee@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-29	scsi: 3w-9xxx: Remove snprintf() from sysfs call-backs and replace with ↵	Lee Jones
	sysfs_emit() Since snprintf() has the documented, but still rather strange trait of returning the length of the data that would have been written to the array if space were available, rather than the arguably more useful length of data actually written, it is usually considered wise to use something else instead in order to avoid confusion. In the case of sysfs call-backs, new wrappers exist that do just that. Link: https://lwn.net/Articles/69419/ Link: https://github.com/KSPP/linux/issues/105 Cc: Adam Radford <aradford@gmail.com> Signed-off-by: Lee Jones <lee@kernel.org> Link: https://lore.kernel.org/r/20240111131732.1815560-2-lee@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: mpt3sas: Update driver version to 48.100.00.00	Ranjan Kumar
	Update driver version to 48.100.00.00. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20231228114810.11923-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: mpt3sas: Reload SBR without rebooting HBA	Ranjan Kumar
	Add a new IOCTL command MPT3ENABLEDIAGSBRRELOAD. As a part of firmware update operation, applications use this IOCTL command to set the SBR reload bit in the Host Diagnostic register. This permits HBA firmware to be updated without powercycling the system. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312280909.MZyhxwBL-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202312281141.jDyPezRn-lkp@intel.com/ Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20231228114810.11923-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	Merge patch series "scsi: hisi_sas: Minor fixes and cleanups"	Martin K. Petersen
	chenxiang <chenxiang66@hisilicon.com> says: This series contains some fixes and cleanups including: - Fix a deadlock issue related to automatic debugfs; - Remove redundant checks for automatic debugfs; - Check whether debugfs is enabled before removing or releasing it; - Remove hisi_hba->timer for v3 hw; Link: https://lore.kernel.org/r/1705904747-62186-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: hisi_sas: Remove hisi_hba->timer for v3 hw	Xiang Chen
	hisi_hba->timer is not used for v3 hw but there are two places that some operations related to hisi_hba->timer are called by v3 hw: - Deleting the timer in function hisi_sas_v3_hw() which is only for v3 hw; - Deleting the timer in function hisi_sas_controller_reset_prepare() which is common for v1/v2/v3 hw. We can remove the timer in the first case, but for the second scenario we need to remove it only for v3 hw, so check hw->sht which is NULL only for v3 hw before deleting hisi_hba->timer. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-5-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: hisi_sas: Check whether debugfs is enabled before removing or releasing it	Yihang Li
	hisi_sas debugfs remove should be executed only when debugfs is enabled. Check whether debugfs is enabled and then remove it only if enabled. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: hisi_sas: Remove redundant checks for automatic debugfs dump	Yihang Li
	In commit 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger"), the memory allocation time of the DFX is changed from device initialization to dump occurs, so .debugfs_itct is not a valid address and do not need to check. The parameter hisi_sas_debugfs_enable is enough to check whether automatic debugfs dump is triggered, so remove redunant checks. Fixes: 63f0733d07ce ("scsi: hisi_sas: Allocate DFX memory during dump trigger") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-24	scsi: hisi_sas: Fix a deadlock issue related to automatic dump	Yihang Li
	If we issue a disabling PHY command, the device attached with it will go offline, if a 2 bit ECC error occurs at the same time, a hung task may be found: [ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds. [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208 [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas] [ 4613.691518] Call trace: [ 4613.694678] __switch_to+0xf8/0x17c [ 4613.698872] __schedule+0x660/0xee0 [ 4613.703063] schedule+0xac/0x240 [ 4613.706994] schedule_timeout+0x500/0x610 [ 4613.711705] __down+0x128/0x36c [ 4613.715548] down+0x240/0x2d0 [ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main] [ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas] [ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas] [ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main] [ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main] [ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas] [ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas] [ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas] [ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas] [ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas] [ 4613.784716] process_one_work+0x248/0x950 [ 4613.789424] worker_thread+0x318/0x934 [ 4613.793878] kthread+0x190/0x200 [ 4613.797810] ret_from_fork+0x10/0x18 [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds. [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208 [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main] [ 4613.841491] Call trace: [ 4613.844647] __switch_to+0xf8/0x17c [ 4613.848852] __schedule+0x660/0xee0 [ 4613.853052] schedule+0xac/0x240 [ 4613.856984] schedule_timeout+0x500/0x610 [ 4613.861695] __down+0x128/0x36c [ 4613.865542] down+0x240/0x2d0 [ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main] [ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main] [ 4613.883019] process_one_work+0x248/0x950 [ 4613.887732] worker_thread+0x318/0x934 [ 4613.892204] kthread+0x190/0x200 [ 4613.896118] ret_from_fork+0x10/0x18 [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds. [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208 [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas] [ 4613.939549] Call trace: [ 4613.942702] __switch_to+0xf8/0x17c [ 4613.946892] __schedule+0x660/0xee0 [ 4613.951083] schedule+0xac/0x240 [ 4613.955015] schedule_timeout+0x500/0x610 [ 4613.959725] wait_for_common+0x200/0x610 [ 4613.964349] wait_for_completion+0x3c/0x5c [ 4613.969146] flush_workqueue+0x198/0x790 [ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas] [ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas] [ 4613.985708] process_one_work+0x248/0x950 [ 4613.990420] worker_thread+0x318/0x934 [ 4613.994868] kthread+0x190/0x200 [ 4613.998800] ret_from_fork+0x10/0x18 This is because when the device goes offline, we obtain the hisi_hba semaphore and send the ABORT_DEV command to the device. However, the internal abort timed out due to the 2 bit ECC error and triggers automatic dump. In addition, since the hisi_hba semaphore has been obtained, the dump cannot be executed and the controller cannot be reset. Therefore, the deadlocks occur on the following circular dependencies: hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ... -> hisi_sas_internal_abort_timeout() -> down(). The deadlock is triggered only when the timeout occurs during device goes offline. To fix this issue, use .rst_ha_timeout to distinguish the scenario where a device goes offline from other scenarios. Fixes: 2ff07b5c6fe9 ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: fnic: Clean up some inconsistent indenting	Jiapeng Chong
	No functional modification involved. drivers/scsi/fnic/fnic_scsi.c:1964 fnic_abort_cmd() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7930 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20240118020128.24432-1-jiapeng.chong@linux.alibaba.com Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: mpi3mr: Use ida to manage mrioc ID	Guixin Liu
	To ensure that the same ID is not obtained during concurrent execution of the probe, an ida is used to manage the mrioc's ID. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20231229040331.52518-1-kanie@linux.alibaba.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: ibmvscsi_tgt: Replace deprecated strncpy() with strscpy()	Justin Stitt
	strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We don't need the NUL-padding behavior that strncpy() provides as vscsi is NUL-allocated in ibmvscsis_probe() which proceeds to call ibmvscsis_adapter_info(): \| vscsi = kzalloc(sizeof(vscsi), GFP_KERNEL); ibmvscsis_probe() -> ibmvscsis_handle_crq() -> ibmvscsis_parse_command() -> ibmvscsis_mad() -> ibmvscsis_process_mad() -> ibmvscsis_adapter_info() Following the same idea, `partition_name` is defiend as: \| static char partition_name[PARTITION_NAMELEN] = "UNKNOWN"; ... which is NUL-padded already, meaning strscpy() is the best option. Considering the above, a suitable replacement is strscpy() [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. However, for cap->name and info let's use strscpy_pad() as they are allocated via dma_alloc_coherent(): \| cap = dma_alloc_coherent(&vscsi->dma_dev->dev, olen, &token, \| GFP_ATOMIC); & \| info = dma_alloc_coherent(&vscsi->dma_dev->dev, sizeof(info), &token, \| GFP_ATOMIC); Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20231212-strncpy-drivers-scsi-ibmvscsi_tgt-ibmvscsi_tgt-c-v2-1-bdb9a7cd96c8@google.com Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: megaraid: Remove redundant assignment to variable 'retval'	Colin Ian King
	The variable 'retval' is being assigned a value that is not being read afterwards. The assignment is redundant and can be removed. Cleans up clang scan warning: Although the value stored to 'retval' is used in the enclosing expression, the value is never actually read from 'retval' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20240118121441.2533620-1-colin.i.king@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: storvsc: Fix ring buffer size calculation	Michael Kelley
	Current code uses the specified ring buffer size (either the default of 128 Kbytes or a module parameter specified value) to encompass the one page ring buffer header plus the actual ring itself. When the page size is 4K, carving off one page for the header isn't significant. But when the page size is 64K on ARM64, only half of the default 128 Kbytes is left for the actual ring. While this doesn't break anything, the smaller ring size could be a performance bottleneck. Fix this by applying the VMBUS_RING_SIZE macro to the specified ring buffer size. This macro adds a page for the header, and rounds up the size to a page boundary, using the page size for which the kernel is built. Use this new size for subsequent ring buffer calculations. For example, on ARM64 with 64K page size and the default ring size, this results in the actual ring being 128 Kbytes, which is intended. Cc: stable@vger.kernel.org # 5.15.x Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240122170956.496436-1-mhklinux@outlook.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: core: Move scsi_host_busy() out of host lock for waking up EH handler	Ming Lei
	Inside scsi_eh_wakeup(), scsi_host_busy() is called & checked with host lock every time for deciding if error handler kthread needs to be waken up. This can be too heavy in case of recovery, such as: - N hardware queues - queue depth is M for each hardware queue - each scsi_host_busy() iterates over (N * M) tag/requests If recovery is triggered in case that all requests are in-flight, each scsi_eh_wakeup() is strictly serialized, when scsi_eh_wakeup() is called for the last in-flight request, scsi_host_busy() has been run for (N * M - 1) times, and request has been iterated for (NM - 1) (N * M) times. If both N and M are big enough, hard lockup can be triggered on acquiring host lock, and it is observed on mpi3mr(128 hw queues, queue depth 8169). Fix the issue by calling scsi_host_busy() outside the host lock. We don't need the host lock for getting busy count because host the lock never covers that. [mkp: Drop unnecessary 'busy' variables pointed out by Bart] Cc: Ewan Milne <emilne@redhat.com> Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240112070000.4161982-1-ming.lei@redhat.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Tested-by: Sathya Prakash Veerichetty <safhya.prakash@broadcom.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: core: Safe warning about bad dev info string	Petr Mladek
	Both "model" and "strflags" are passed to "%s" even when one or both are NULL. It is safe because vsprintf() would detect the NULL pointer and print "(null)". But it is a kernel-specific feature and compiler warns about it: <warning> In file included from include/linux/kernel.h:19, from arch/x86/include/asm/percpu.h:27, from arch/x86/include/asm/current.h:6, from include/linux/sched.h:12, from include/linux/blkdev.h:5, from drivers/scsi/scsi_devinfo.c:3: drivers/scsi/scsi_devinfo.c: In function 'scsi_dev_info_list_add_str': >> include/linux/printk.h:434:44: warning: '%s' directive argument is null [-Wformat-overflow=] 434 \| #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__) \| ^ include/linux/printk.h:430:3: note: in definition of macro 'printk_index_wrap' 430 \| _p_func(_fmt, ##__VA_ARGS__); \ \| ^~~~~~~ drivers/scsi/scsi_devinfo.c:551:4: note: in expansion of macro 'printk' 551 \| printk(KERN_ERR "%s: bad dev info string '%s' '%s'" \| ^~~~~~ drivers/scsi/scsi_devinfo.c:552:14: note: format string is defined here 552 \| " '%s'\n", __func__, vendor, model, \| ^~ </warning> Do not rely on the kernel specific behavior and print the message a safe way. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401112002.AOjwMNM0-lkp@intel.com/ Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240111162419.12406-1-pmladek@suse.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Chris Down <chris@chrisdown.name> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-01-23	scsi: core: Move autosuspend timer delay to Scsi_Host	Peter Wang
	The runtime suspend timer delay is a const value in scsi_host_template which a host driver cannot modify at runtime. Move the delay to Scsi_Host to allow a driver to update it. Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20240109124015.31359-2-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>