summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-02scsi: documentation: scsi_eh: updates for EH changesRandy Dunlap
SCSI_SOFTIRQ and scsi_softirq() are no longer used. Change to block layer equivalents. scsi_setup_cmd_retry() has been deleted. Remove references to it. SCSI_EH_CANCEL_CMD has been deleted. Remove references to it. scsi_eh_abort_cmds() has been deleted. Remove references to it. [mkp: fixed START STOP UNIT] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241219214928.1170302-1-rdunlap@infradead.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: ufs: qcom: Power down the controller/device during system suspend for ↵Manivannan Sadhasivam
SM8550/SM8650 SoCs SM8550 and SM8650 SoCs doesn't support UFS PHY retention. So once these SoCs reaches the low power state (CX power collapse) during system suspend, all the PHY hardware state gets lost. This leads to the UFS resume failure: ufshcd-qcom 1d84000.ufs: ufshcd_uic_hibern8_exit: hibern8 exit failed. ret = 5 ufshcd-qcom 1d84000.ufs: __ufshcd_wl_resume: hibern8 exit failed 5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: 5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x84 returns 5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error 5 With the default system suspend level of UFS_PM_LVL_3, the power domain for UFS PHY needs to be kept always ON to retain the state. But this would prevent these SoCs from reaching the CX power collapse state, leading to poor power saving during system suspend. So to fix this issue without affecting the power saving, set 'ufs_qcom_drvdata::no_phy_retention' to true which sets 'hba->spm_lvl' to UFS_PM_LVL_5 to allow both the controller and device (in turn the PHY) to be powered down during system suspend for these SoCs by default. Cc: stable@vger.kernel.org # 6.3 Fixes: 35cf1aaab169 ("arm64: dts: qcom: sm8550: Add UFS host controller and phy nodes") Fixes: 10e024671295 ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes") Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-4-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: ufs: qcom: Allow passing platform specific OF dataManivannan Sadhasivam
In order to allow platform specific flags and configurations, introduce the platform specific OF data and move the existing quirk UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 and SM8650 SoCs. Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-3-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: ufs: core: Honor runtime/system PM levels if set by host controller ↵Manivannan Sadhasivam
drivers Otherwise, the default levels will override the levels set by the host controller drivers. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-2-63c4b95a70b9@linaro.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: ufs: qcom: Power off the PHY if it was already powered on in ↵Manivannan Sadhasivam
ufs_qcom_power_up_sequence() PHY might already be powered on during ufs_qcom_power_up_sequence() in a couple of cases: 1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk 2. Resuming from spm_lvl = 5 suspend In those cases, it is necessary to call phy_power_off() and phy_exit() in ufs_qcom_power_up_sequence() function to power off the PHY before calling phy_init() and phy_power_on(). Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is not handled. So to satisfy both cases, call phy_power_off() and phy_exit() if the phy_count is non-zero. And with this change, the reinit_notify() callback is no longer needed. This fixes the below UFS resume failure with spm_lvl = 5: ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5 Cc: stable@vger.kernel.org # 6.3 Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device") Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: qedi: Use kthread_create_on_cpu()Frederic Weisbecker
Use the proper API instead of open coding it. However it looks like qedi_percpu_io_thread() kthread could be replaced by the use of a high prio workqueue instead. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20241211154035.75565-6-frederic@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: bnx2i: Use kthread_create_on_cpu()Frederic Weisbecker
Use the proper API instead of open coding it. However it looks like bnx2i_percpu_io_thread() kthread could be replaced by the use of a high prio workqueue instead. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20241211154035.75565-5-frederic@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: bnx2fc: Use kthread_create_on_cpu()Frederic Weisbecker
Use the proper API instead of open coding it. However it looks like bnx2fc_percpu_io_thread() kthread could be replaced by the use of a high prio workqueue instead. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20241211154035.75565-4-frederic@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-02scsi: MAINTAINERS: Remove myself as isci driver maintainerArtur Paszkiewicz
I'm leaving Intel and I could not find a new maintainer for this so mark it as orphan. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Link: https://lore.kernel.org/r/20241210113028.13810-1-artur.paszkiewicz@intel.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: Constify struct pci_device_idChristophe JAILLET
'struct pci_device_id' is not modified in these drivers. Constifying this structure moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 70237 9137 320 79694 1374e drivers/scsi/3w-9xxx.o After: ===== text data bss dec hex filename 70461 8913 320 79694 1374e drivers/scsi/3w-9xxx.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/fc61b1946488c1ea8f7a17a06cf40fbd05dcc6de.1733590049.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: storvsc: Don't assume cpu_possible_mask is denseMichael Kelley
Current code allocates the stor_chns array with size num_possible_cpus(). This code assumes cpu_possible_mask is dense, which is not true in the general case per [1]. If cpu_possible_mask is sparse, the array might be indexed by a value beyond the size of the array. However, the configurations that Hyper-V provides to guest VMs on x86 and ARM64 hardware, in combination with how architecture specific code assigns Linux CPU numbers, *does* always produce a dense cpu_possible_mask. So the dense assumption is not currently causing failures. But for robustness against future changes in how cpu_possible_mask is populated, update the code to no longer assume dense. The correct approach is to allocate and initialize the array using size "nr_cpu_ids". While this leaves unused array entries corresponding to holes in cpu_possible_mask, the holes are assumed to be minimal and hence the amount of memory wasted by unused entries is minimal. [1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/ Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20241003035333.49261-5-mhklinux@outlook.com Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: MAINTAINERS: Update zfcp entrySteffen Maier
Nihar takes over the zfcp maintainer work. Update the MAINTAINERS entry accordingly. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Nihar Panda <niharp@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20241205141932.1227039-4-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: zfcp: Clarify zfcp_port refcount ownership during "link" testSteffen Maier
Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20241205141932.1227039-3-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: zfcp: Correct kdoc parameter description for sending ELS and CTFedor Loshakov
Since commit 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els requests") there are no more such structures as zfcp_send_els and zfcp_send_ct. Instead there is now one common fsf structure to hold zfcp data for ct and els requests. Fix parameter description for zfcp_fsf_send_ct() and zfcp_fsf_send_els() accordingly. Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20241205141932.1227039-2-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: Eliminate scsi_register() and scsi_unregister() usage & docsRandy Dunlap
scsi_mid_low_api.rst refers to scsi_register() and scsi_unregister() but these functions don't exist. They have been replaced by more meaningful names. Update one driver (megaraid_mbox.c) that uses "scsi_unregister" in a warning message. Update scsi_mid_low_api.rst to eliminate references to scsi_register() and scsi_unregister(). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241205041839.164404-1-rdunlap@infradead.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: Chandrakanth patil <chandrakanth.patil@broadcom.com> Cc: megaraidlinux.pdl@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: docs: Remove init_this_scsi_driver()Randy Dunlap
Finish removing mention of init_this_scsi_driver() that was removed ages ago. Fixes: 83c9f08e6c6a ("scsi: remove the old scsi_module.c initialization model") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241205031307.130441-1-rdunlap@infradead.org Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: ufs: core: Update compl_time_stamp_local_clock after completing a cqeliuderong
lrbp->compl_time_stamp_local_clock is set to zero after sending a sqe but it is not updated after completing a cqe. Thus the printed information in ufshcd_print_tr() will always be zero. Update lrbp->cmpl_time_stamp_local_clock after completing a cqe. Log sample: ufshcd-qcom 1d84000.ufshc: UPIU[8] - issue time 8750227249 us ufshcd-qcom 1d84000.ufshc: UPIU[8] - complete time 0 us Fixes: c30d8d010b5e ("scsi: ufs: core: Prepare for completion in MCQ") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: liuderong <liuderong@oppo.com> Link: https://lore.kernel.org/r/1733470182-220841-1-git-send-email-liuderong@oppo.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-09scsi: ufs: core: Do not hold any lock in ufshcd_hba_stop()Avri Altman
This change is motivated by Bart's suggestion in [1], which enables to further reduce the SCSI host lock usage in the UFS driver. The reason why it makes sense, because although the legacy interrupt is disabled by some but not all ufshcd_hba_stop() callers, it is safe to nest disable_irq() calls as it checks the irq depth. [1] https://lore.kernel.org/linux-scsi/c58e4fce-0a74-4469-ad16-f1edbd670728@acm.org/ Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241128072542.219170-1-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04Merge patch series "Replace the "slave_*" function names"Martin K. Petersen
Bart Van Assche <bvanassche@acm.org> says: Hi Martin, The text "slave_" in multiple function names does not make it clear what the purpose of these functions is. Hence this patch series that renames all SCSI functions that have the word "slave" in their function name. Please consider this patch series for the next merge window. Thanks, Bart. Link: https://lore.kernel.org/r/20241022180839.2712439-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: core: Update API documentationBart Van Assche
Since the .slave_alloc(), .slave_destroy() and .slave_configure() methods have been renamed in struct scsi_host_template, also rename these in the API documentation. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-6-bvanassche@acm.org Reviewed-by: Damien Le Maol <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: core: Remove the .slave_configure() methodBart Van Assche
Now that all SCSI drivers have been converted from .slave_configure() to .sdev_configure(), remove support for .slave_configure() from the SCSI core. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: Convert SCSI drivers to .sdev_configure()Bart Van Assche
The only difference between the .sdev_configure() and .slave_configure() methods is that the former accepts an additional 'limits' argument. Convert all SCSI drivers that define a .slave_configure() method to .sdev_configure(). This patch prepares for removing the .slave_configure() method. No functionality has been changed. Acked-by: Geoff Levand <geoff@infradead.org> # for ps3rom Acked-by: Khalid Aziz <khalid@gonehiking.org> # for the BusLogic driver Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: Rename .device_configure() into .sdev_configure()Bart Van Assche
Improve naming consistency with the .sdev_prep() and .sdev_destroy() methods by renaming .device_configure() into .sdev_configure(). Cc: Christoph Hellwig <hch@lst.de> Acked-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-3-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: Rename .slave_alloc() and .slave_destroy()Bart Van Assche
Rename .slave_alloc() into .sdev_init() and .slave_destroy() into .sdev_destroy(). The new names make it clear that these are actions on SCSI devices. Make this change in the SCSI core, SCSI drivers and also in the ATA drivers. No functionality has been changed. This patch has been created as follows: * Change the text "slave_alloc" into "sdev_init" in all source files except those in drivers/net/ and Documentation/. * Change the text "slave_destroy" into "sdev_destroy" in all source files except those in drivers/net/ and Documentation/. * Rename lpfc_no_slave() into lpfc_no_sdev(). * Manually adjust whitespace where necessary to restore vertical alignment (dc395x driver and include/linux/libata.h). Acked-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: pm80xx: Improve debugging for aborted commandsVishakha Channapattan
Improves the debugging capabilities of the driver by adding more context to debug messages: 1. Introduce a new function to show pending commands. 2. Include the tag number in NCQ EH path debug messages. 3. Add logging for ata_tag along with pm80xx tag to map I/Os aborted with ATA logs. Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20241126225546.975441-1-salomondush@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: pm80xx: Increase reserved tags from 8 to 128Igor Pylypiv
Increase the number of reserved tags to prevent command processing failures when the driver is under stress. 8 reserved tags are quickly getting all used up leading to errors when command completions are delayed. The driver needs ~512 ccbs/tags for maximum I/O utilization: 16 (max disks) * 32 (max SATA queue depth) = ~512 ccbs/tags. By reserving 128 tags the driver will still have plenty of tags/ccbs left: 1024 (max ccbs) - 128 (reserved slot) = 896 tags/ccbs left. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20241126224923.973528-1-salomondush@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: pm80xx: Use dynamic tag numbers for PHY start and stopJolly Shah
Other commands were not aware if tag 0x01 was in use or not which meant multiple commands could share the same tag number. Prevent tag 0x01 from being used by multiple commands at the same time. Signed-off-by: Jolly Shah <jollys@google.com> Signed-off-by: Terrence Adams <tadamsjr@google.com> Link: https://lore.kernel.org/r/20241125213343.3272478-1-tadamsjr@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: pm80xx: Do not use libsas port IDIgor Pylypiv
libsas port IDs can differ from the controller's port IDs. Using libsas port ID to index pm8001_ha->port array is a bug. Remove sas_find_local_port_id(). We can use pm8001_ha->phy[phy_id].port to get the port ID. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Terrence Adams <tadamsjr@google.com> Link: https://lore.kernel.org/r/20241121194915.3039073-1-tadamsjr@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: scsi_debug: Fix hrtimer support for ndelayJohn Garry
Since commit 771f712ba5b0 ("scsi: scsi_debug: Fix cmd duration calculation"), ns_from_boot value is only evaluated in schedule_resp() for polled requests. However, ns_from_boot is also required for hrtimer support for when ndelay is less than INCLUSIVE_TIMING_MAX_NS, so fix up the logic to decide when to evaluate ns_from_boot. Fixes: 771f712ba5b0 ("scsi: scsi_debug: Fix cmd duration calculation") Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241202130045.2335194-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN ↵Cathy Avery
as an error This partially reverts commit 812fe6420a6e ("scsi: storvsc: Handle additional SRB status values"). HyperV does not support MAINTENANCE_IN resulting in FC passthrough returning the SRB_STATUS_DATA_OVERRUN value. Now that SRB_STATUS_DATA_OVERRUN is treated as an error, multipath ALUA paths go into a faulty state as multipath ALUA submits RTPG commands via MAINTENANCE_IN. [ 3.215560] hv_storvsc 1d69d403-9692-4460-89f9-a8cbcc0f94f3: tag#230 cmd 0xa3 status: scsi 0x0 srb 0x12 hv 0xc0000001 [ 3.215572] scsi 1:0:0:32: alua: rtpg failed, result 458752 Make MAINTENANCE_IN return success to avoid the error path as is currently done with INQUIRY and MODE_SENSE. Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Cathy Avery <cavery@redhat.com> Link: https://lore.kernel.org/r/20241127181324.3318443-1-cavery@redhat.com Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: Add missing post notify for power mode changePeter Wang
When the power mode change is successful but the power mode hasn't actually changed, the post notification was missed. Similar to the approach with hibernate/clock scale/hce enable, having pre/post notifications in the same function will make it easier to maintain. Additionally, supplement the description of power parameters for the pwr_change_notify callback. Fixes: 7eb584db73be ("ufs: refactor configuring power mode") Cc: stable@vger.kernel.org #6.11.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241122024943.30589-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: sg: Fix slab-use-after-free read in sg_release()Suraj Sonawane
Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN: BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30 kernel/locking/lockdep.c:5838 __mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912 sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407 In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is called before releasing the open_rel_lock mutex. The kref_put() call may decrement the reference count of sfp to zero, triggering its cleanup through sg_remove_sfp(). This cleanup includes scheduling deferred work via sg_remove_sfp_usercontext(), which ultimately frees sfp. After kref_put(), sg_release() continues to unlock open_rel_lock and may reference sfp or sdp. If sfp has already been freed, this results in a slab-use-after-free error. Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the open_rel_lock mutex. This ensures: - No references to sfp or sdp occur after the reference count is decremented. - Cleanup functions such as sg_remove_sfp() and sg_remove_sfp_usercontext() can safely execute without impacting the mutex handling in sg_release(). The fix has been tested and validated by syzbot. This patch closes the bug reported at the following syzkaller link and ensures proper sequencing of resource cleanup and mutex operations, eliminating the risk of use-after-free errors in sg_release(). Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling") Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: sysfs: Prevent div by zeroGwendal Grignou
Prevent a division by 0 when monitoring is not enabled. Fixes: 1d8613a23f3c ("scsi: ufs: core: Introduce HBA performance monitor sysfs nodes") Cc: stable@vger.kernel.org Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Link: https://lore.kernel.org/r/20241120062522.917157-1-gwendal@chromium.org Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Update version to 10.02.09.400-kNilesh Javali
Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Supported speed displayed incorrectly for VPortsAnil Gurumurthy
The fc_function_template for vports was missing the .show_host_supported_speeds. The base port had the same. Add .show_host_supported_speeds to the vport template as well. Cc: stable@vger.kernel.org Fixes: 2c3dfe3f6ad8 ("[SCSI] qla2xxx: add support for NPIV") Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Fix NVMe and NPIV connect issueQuinn Tran
NVMe controller fails to send connect command due to failure to locate hw context buffer for NVMe queue 0 (blk_mq_hw_ctx, hctx_idx=0). The cause of the issue is NPIV host did not initialize the vha->irq_offset field. This field is given to blk-mq (blk_mq_pci_map_queues) to help locate the beginning of IO Queues which in turn help locate NVMe queue 0. Initialize this field to allow NVMe to work properly with NPIV host. kernel: nvme nvme5: Connect command failed, errno: -18 kernel: nvme nvme5: qid 0: secure concatenation is not supported kernel: nvme nvme5: NVME-FC{5}: create_assoc failed, assoc_id 2e9100 ret 401 kernel: nvme nvme5: NVME-FC{5}: reset: Reconnect attempt failed (401) kernel: nvme nvme5: NVME-FC{5}: Reconnect attempt in 2 seconds Cc: stable@vger.kernel.org Fixes: f0783d43dde4 ("scsi: qla2xxx: Use correct number of vectors for online CPUs") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Remove check req_sg_cnt should be equal to rsp_sg_cntSaurav Kashyap
Firmware supports multiple sg_cnt for request and response for CT commands, so remove the redundant check. A check is there where sg_cnt for request and response should be same. This is not required as driver and FW have code to handle multiple and different sg_cnt on request and response. Cc: stable@vger.kernel.org Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Move FCE Trace buffer allocation to user controlQuinn Tran
Currently FCE Tracing is enabled to log additional ELS events. Instead, user will enable or disable this feature through debugfs. Modify existing DFS knob to allow user to enable or disable this feature. echo [1 | 0] > /sys/kernel/debug/qla2xxx/qla2xxx_??/fce cat /sys/kernel/debug/qla2xxx/qla2xxx_??/fce Cc: stable@vger.kernel.org Fixes: df613b96077c ("[SCSI] qla2xxx: Add Fibre Channel Event (FCE) tracing support.") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: csiostor: Fix typo doesnt->doesn'tPrateek Singh Rathore
Signed-off-by: Prateek Singh Rathore <prateek.singh.rathore@gmail.com> Link: https://lore.kernel.org/r/20241123113038.11188-1-prateek.singh.rathore@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04Merge patch series "Untie the host lock entanglement - part 2"Martin K. Petersen
Avri Altman <avri.altman@wdc.com> says: Here is the 2nd part in the sequel, watering down the scsi host lock usage in the ufs driver. This work is motivated by a comment made by Bart [1], of the abuse of the scsi host lock in the ufs driver. Its Precursor [2] removed the host lock around some of the host register accesses. This part replaces the scsi host lock by dedicated locks serializing access to the clock gating and clock scaling members. Changes compared to v4: - split patch 1 into 2 parts (Bart) - use scoped_guard() for the host_lock as well (Bart) - remove irrelevant comment and use lockdep_assert_held instead (Bart) - improve @lock documentation (Bart) Changes compared to v3: - Keep the host lock when checking ufshcd_state (Bean) Changes compared to v2: - Use clang-format to fix formating (Bart) - Use guard() in ufshcd_clkgate_enable_store (Bart) - Elaborate commit log (Bart) Changes compared to v1: - use the guard() & scoped_guard() macros (Bart) - re-order struct ufs_clk_scaling and struct ufs_clk_gating (Bart) [1] https://lore.kernel.org/linux-scsi/0b031b8f-c07c-42ef-af93-7336439d3c37@acm.org/ [2] https://lore.kernel.org/linux-scsi/20241024075033.562562-1-avri.altman@wdc.com/ Link: https://lore.kernel.org/r/20241124070808.194860-1-avri.altman@wdc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: Introduce a new clock_scaling lockAvri Altman
Introduce a new clock scaling lock to serialize access to some of the clock scaling members instead of the host_lock. here also, simplify the code with the guard() macro and co. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241124070808.194860-5-avri.altman@wdc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: Introduce a new clock_gating lockAvri Altman
Introduce a new clock gating lock to serialize access to some of the clock gating members instead of the host_lock. While at it, simplify the code with the guard() macro and co for automatic cleanup of the new lock. There are some explicit spin_lock_irqsave()/spin_unlock_irqrestore() snaking instances I left behind because I couldn't make heads or tails of it. Additionally, move the trace_ufshcd_clk_gating() call from inside the region protected by the lock as it doesn't needs protection. Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241124070808.194860-4-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: Prepare to introduce a new clock_gating lockAvri Altman
Remove hba->clk_gating.active_reqs check from ufshcd_is_ufs_dev_busy() function to separate clock gating logic from general device busy checks. Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241124070808.194860-3-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: ufs: core: Introduce ufshcd_has_pending_tasks()Avri Altman
Prepare to remove hba->clk_gating.active_reqs check from ufshcd_is_ufs_dev_busy(). Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241124070808.194860-2-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: bsg: Replace zero-length array with flexible array memberThorsten Blum
Replace the deprecated zero-length array with a modern flexible array member in the struct iscsi_bsg_host_vendor_reply. Link: https://github.com/KSPP/linux/issues/78 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20241110223323.42772-2-thorsten.blum@linux.dev Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: fnic: Use vcalloc() instead of vmalloc() and memset(0)Thorsten Blum
Use vcalloc() instead of vmalloc() followed by memset(0) to simplify the functions fnic_trace_buf_init() and fnic_fc_trace_init(). Compile-tested only. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20241107104300.1252-1-thorsten.blum@linux.dev Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Fix use after free on unloadQuinn Tran
System crash is observed with stack trace warning of use after free. There are 2 signals to tell dpc_thread to terminate (UNLOADING flag and kthread_stop). On setting the UNLOADING flag when dpc_thread happens to run at the time and sees the flag, this causes dpc_thread to exit and clean up itself. When kthread_stop is called for final cleanup, this causes use after free. Remove UNLOADING signal to terminate dpc_thread. Use the kthread_stop as the main signal to exit dpc_thread. [596663.812935] kernel BUG at mm/slub.c:294! [596663.812950] invalid opcode: 0000 [#1] SMP PTI [596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x86_64 #1 [596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 [596663.812974] RIP: 0010:__slab_free+0x17d/0x360 ... [596663.813008] Call Trace: [596663.813022] ? __dentry_kill+0x121/0x170 [596663.813030] ? _cond_resched+0x15/0x30 [596663.813034] ? _cond_resched+0x15/0x30 [596663.813039] ? wait_for_completion+0x35/0x190 [596663.813048] ? try_to_wake_up+0x63/0x540 [596663.813055] free_task+0x5a/0x60 [596663.813061] kthread_stop+0xf3/0x100 [596663.813103] qla2x00_remove_one+0x284/0x440 [qla2xxx] Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Fix abort in bsg timeoutQuinn Tran
Current abort of bsg on timeout prematurely clears the outstanding_cmds[]. Abort does not allow FW to return the IOCB/SRB. In addition, bsg_job_done() is not called to return the BSG (i.e. leak). Abort the outstanding bsg/SRB and wait for the completion. The completion IOCB will wake up the bsg_timeout thread. If abort is not successful, then driver will forcibly call bsg_job_done() and free the srb. Err Inject: - qaucli -z - assign CT Passthru IOCB's NportHandle with another initiator nport handle to trigger timeout. Remote port will drop CT request. - bsg_job_done is properly called as part of cleanup kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f0838 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-507c:7: Abort command issued - hdl=4b, type=5 kernel: qla2xxx [0000:21:00.1]-5040:7: ELS-CT pass-through-ct pass-through error hdl=4b comp_status-status=0x5 error subcode 1=0x0 error subcode 2=0xaf882e80. kernel: qla2xxx [0000:21:00.1]-7009:7: qla2x00_bsg_job_done: sp hdl 4b, result=70000 bsg ptr ffff9971a42f0838 kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=4b rval=0 kernel: qla2xxx [0000:21:00.1]-708a:7: bsg abort success. bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=0x4b kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f43b8 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-7012:7: qla_bsg_found : 2206 : Error Inject 2. kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f43b8 sp=ffff99762c304440 handle=5e rval=5 kernel: qla2xxx [0000:21:00.1]-704f:7: bsg abort fail. bsg=ffff9971a42f43b8 sp=ffff99762c304440 rval=5. kernel: qla2xxx [0000:21:00.1]-7051:7: qla_bsg_found bsg_job_done : bsg ffff9971a42f43b8 result 0xfffffffa sp ffff99762c304440. Cc: stable@vger.kernel.org Fixes: c449b4198701 ("scsi: qla2xxx: Use QP lock to search for bsg") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Update driver version to 8.12.0.3.50Ranjan Kumar
Update driver version to 8.12.0.3.50. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-6-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Handling of fault code for insufficient powerRanjan Kumar
Before retrying initialization, check and abort if the fault code indicates insufficient power. Also mark the controller as unrecoverable instead of issuing reset in the watch dog timer if the fault code indicates insufficient power. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>