linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-11-08	scsi: ufs: qcom-ufs: dt-bindings: Document the SM8650 UFS Controller	Neil Armstrong
	Document the UFS Controller on the SM8650 Platform. Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231030-topic-sm8650-upstream-bindings-ufs-v3-1-a96364463fd5@linaro.org Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: sd: Fix sshdr use in sd_suspend_common()	Mike Christie
	If scsi_execute_cmd() returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. sd_sync_cache() will only access the sshdr if it's been setup because it calls scsi_status_is_check_condition() before accessing it. However, the sd_sync_cache() caller, sd_suspend_common(), does not check. sd_suspend_common() is only checking for ILLEGAL_REQUEST which it's using to determine if the command is supported. If it's not it just ignores the error. So to fix its sshdr use this patch just moves that check to sd_sync_cache() where it converts ILLEGAL_REQUEST to success/0. sd_suspend_common() was ignoring that error and sd_shutdown() doesn't check for errors so there will be no behavior changes. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231106231304.5694-2-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: scsi_debug: Delete some bogus error checking	Dan Carpenter
	Smatch complains that "dentry" is never initialized. These days everyone initializes all their stack variables to zero so this means that it will trigger a warning every time this function is run. Really, debugfs functions are not supposed to be checked for errors in normal code. For example, if we updated this code to check the correct variable then it would print a warning if CONFIG_DEBUGFS was disabled. We don't want that. Just delete the check. Fixes: f084fe52c640 ("scsi: scsi_debug: Add debugfs interface to fail target reset") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/c602c9ad-5e35-4e18-a47f-87ed956a9ec2@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: scsi_debug: Fix some bugs in sdebug_error_write()	Dan Carpenter
	There are two bug in this code: 1) If count is zero, then it will lead to a NULL dereference. The kmalloc() will successfully allocate zero bytes and the test for "if (buf[0] == '-')" will read beyond the end of the zero size buffer and Oops. 2) The code does not ensure that the user's string is properly NUL terminated which could lead to a read overflow. Fixes: a9996d722b11 ("scsi: scsi_debug: Add interface to manage error injection for a single device") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/7733643d-e102-4581-8d29-769472011c97@moroto.mountain Reviewed-by: Wenchao Hao <haowenchao2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: ufs: core: Fix racing issue between ufshcd_mcq_abort() and ISR	Peter Wang
	If command timeout happens and cq complete IRQ is raised at the same time, ufshcd_mcq_abort clears lprb->cmd and a NULL pointer deref happens in the ISR. Error log: ufshcd_abort: Device abort task at tag 18 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 pc : [0xffffffe27ef867ac] scsi_dma_unmap+0xc/0x44 lr : [0xffffffe27f1b898c] ufshcd_release_scsi_cmd+0x24/0x114 Fixes: f1304d442077 ("scsi: ufs: mcq: Added ufshcd_mcq_abort()") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20231106075117.8995-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: ufs: core: Expand MCQ queue slot to DeviceQueueDepth + 1	Naomi Chu
	The UFSHCI 4.0 specification mandates that there should always be at least one empty slot in each queue for distinguishing between full and empty states. Enlarge 'hwq->max_entries' to 'DeviceQueueDepth + 1' to allow UFSHCI 4.0 controllers to fully utilize MCQ queue slots. Fixes: 4682abfae2eb ("scsi: ufs: core: mcq: Allocate memory for MCQ mode") Signed-off-by: Naomi Chu <naomi.chu@mediatek.com> Link: https://lore.kernel.org/r/20231102052426.12006-2-naomi.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Chun-Hung <chun-hung.wu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08	scsi: qla2xxx: Fix system crash due to bad pointer access	Quinn Tran
	User experiences system crash when running AER error injection. The perturbation causes the abort-all-I/O path to trigger. The driver assumes all I/O on this path is FCP only. If there is both NVMe & FCP traffic, a system crash happens. Add additional check to see if I/O is FCP or not before access. PID: 999019 TASK: ff35d769f24722c0 CPU: 53 COMMAND: "kworker/53:1" 0 [ff3f78b964847b58] machine_kexec at ffffffffae86973d 1 [ff3f78b964847ba8] __crash_kexec at ffffffffae9be29d 2 [ff3f78b964847c70] crash_kexec at ffffffffae9bf528 3 [ff3f78b964847c78] oops_end at ffffffffae8282ab 4 [ff3f78b964847c98] exc_page_fault at ffffffffaf2da502 5 [ff3f78b964847cc0] asm_exc_page_fault at ffffffffaf400b62 [exception RIP: qla2x00_abort_srb+444] RIP: ffffffffc07b5f8c RSP: ff3f78b964847d78 RFLAGS: 00010046 RAX: 0000000000000282 RBX: ff35d74a0195a200 RCX: ff35d76886fd03a0 RDX: 0000000000000001 RSI: ffffffffc07c5ec8 RDI: ff35d74a0195a200 RBP: ff35d76913d22080 R8: ff35d7694d103200 R9: ff35d7694d103200 R10: 0000000100000000 R11: ffffffffb05d6630 R12: 0000000000010000 R13: ff3f78b964847df8 R14: ff35d768d8754000 R15: ff35d768877248e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 6 [ff3f78b964847d70] qla2x00_abort_srb at ffffffffc07b5f84 [qla2xxx] 7 [ff3f78b964847de0] __qla2x00_abort_all_cmds at ffffffffc07b6238 [qla2xxx] 8 [ff3f78b964847e38] qla2x00_abort_all_cmds at ffffffffc07ba635 [qla2xxx] 9 [ff3f78b964847e58] qla2x00_terminate_rport_io at ffffffffc08145eb [qla2xxx] 10 [ff3f78b964847e70] fc_terminate_rport_io at ffffffffc045987e [scsi_transport_fc] 11 [ff3f78b964847e88] process_one_work at ffffffffae914f15 12 [ff3f78b964847ed0] worker_thread at ffffffffae9154c0 13 [ff3f78b964847f10] kthread at ffffffffae91c456 14 [ff3f78b964847f50] ret_from_fork at ffffffffae8036ef Cc: stable@vger.kernel.org Fixes: f45bca8c5052 ("scsi: qla2xxx: Fix double scsi_done for abort path") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20231030064912.37912-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: ufs: core: Leave space for '\0' in utf8 desc string	Daniel Mentz
	utf16s_to_utf8s does not NULL terminate the output string. For us to be able to add a NULL character when utf16s_to_utf8s returns, we need to make sure that there is space for such NULL character at the end of the output buffer. We can achieve this by passing an output buffer size to utf16s_to_utf8s that is one character less than what we allocated. Other call sites of utf16s_to_utf8s appear to be using the same technique where they artificially reduce the buffer size by one to leave space for a NULL character or line feed character. Fixes: 4b828fe156a6 ("scsi: ufs: revamp string descriptor reading") Reviewed-by: Mars Cheng <marscheng@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Yen-lin Lai <yenlinlai@google.com> Signed-off-by: Daniel Mentz <danielmentz@google.com> Link: https://lore.kernel.org/r/20231017182026.2141163-1-danielmentz@google.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: ufs: core: Conversion to bool not necessary	Bragatheswaran Manickavel
	A logical evaluation already results in bool. There is no need for using a ternary operator based evaluation and bool conversion of the outcome. Issue identified using boolconv.cocci Coccinelle semantic patch. Signed-off-by: Bragatheswaran Manickavel <bragathemanick0908@gmail.com> Link: https://lore.kernel.org/r/20231024183401.48888-1-bragathemanick0908@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: ufs: core: Fix race between force complete and ISR	Alice Chao
	While error handler force complete command (Thread A) and completion IRQ raising (Thread B) of the same command, it may cause race condition. Below is racing step (from 1 to 6): ufshcd_mcq_compl_pending_transfer (Thread A) 1 if (cmd && !test_bit(SCMD_STATE_COMPLETE, &cmd->state)) { 5 spin_lock_irqsave(&hwq->cq_lock, flags); // wait lock release set_host_byte(cmd, DID_REQUEUE); 6 ufshcd_release_scsi_cmd(hba, lrbp); // access null pointer scsi_done(cmd); spin_unlock_irqrestore(&hwq->cq_lock, flags); } ufshcd_mcq_poll_cqe_lock (Thread B) 2 spin_lock_irqsave(&hwq->cq_lock, flags); ufshcd_mcq_poll_cqe_nolock() ufshcd_compl_one_cqe() 3 ufshcd_release_scsi_cmd() // lrbp->cmd = NULL; 4 spin_unlock_irqrestore(&hwq->cq_lock, flags); Signed-off-by: Alice Chao <alice.chao@mediatek.com> Link: https://lore.kernel.org/r/20231024084324.12197-1-alice.chao@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: megaraid: Fix up debug message in megaraid_abort_and_reset()	Hannes Reinecke
	Found by Smatch. Fixes: 5bcd3bfbda02 ("scsi: megaraid: Pass in NULL scb for host reset") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231023073021.21954-1-hare@suse.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: aic79xx: Fix up NULL command in ahd_done()	Hannes Reinecke
	Found by smatch. Fixes: c67e63800446 ("scsi: aic79xx: Do not reference SCSI command when resetting device") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231023073014.21438-1-hare@suse.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: message: fusion: Initialize return value in mptfc_bus_reset()	Hannes Reinecke
	Detected by smatch. Fixes: 17865dc2eccc ("scsi: message: fusion: Open-code mptfc_block_error_handler() for bus reset") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231023073005.20766-1-hare@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: mpt3sas: Fix loop logic	Ranjan Kumar
	The retry loop continues to iterate until the count reaches 30, even after receiving the correct value. Exit loop when a non-zero value is read. Fixes: 4ca10f3e3174 ("scsi: mpt3sas: Perform additional retries if doorbell read returns 0") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20231020105849.6350-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: snic: Remove useless code in snic_dr_clean_pending_req()	Su Hui
	Return error code directly to save space and be more clear. Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20231020023326.43898-1-suhui@nfschina.com Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: core: Add comment to target_destroy in scsi_host_template	Wenchao Hao
	Add comment to indicate that the callback function target_destroy in the scsi_host_template must not sleep. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231018113746.1940197-3-haowenchao2@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: core: Clean up scsi_dev_queue_ready()	Wenchao Hao
	This is just a cleanup for scsi_dev_queue_ready() to avoid a redundant goto and if statement. No functional change. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231018113746.1940197-2-haowenchao2@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: pmcraid: Add missing scsi_device_put() in ↵	Hannes Reinecke
	pmcraid_eh_target_reset_handler() When breaking out of an shost_for_each_device() loop one needs to do an explicit scsi_device_put(). Fixes: c2a14ab3b9b3 ("scsi: pmcraid: Select device in pmcraid_eh_target_reset_handler()") Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231023072957.20191-1-hare@suse.de Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: target: core: Fix kernel-doc comment	Yang Li
	Fix kernel-doc comment to silence the warnings: drivers/target/target_core_transport.c:1930: warning: Excess function parameter 'cmd' description in 'target_submit' drivers/target/target_core_transport.c:1930: warning: Function parameter or member 'se_cmd' not described in 'target_submit' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6844 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20231017030913.89973-1-yang.lee@linux.alibaba.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24	scsi: pmcraid: Fix kernel-doc comment	Yang Li
	Fix kernel-doc comment to silence the warnings: drivers/scsi/pmcraid.c:2697: warning: Excess function parameter 'scsi_cmd' description in 'pmcraid_reset_device' drivers/scsi/pmcraid.c:2697: warning: Function parameter or member 'scsi_dev' not described in 'pmcraid_reset_device' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6843 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20231017025853.67562-1-yang.lee@linux.alibaba.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: core: Handle depopulation and restoration in progress	Douglas Gilbert
	The default handling of the NOT READY sense key is to wait for the device to become ready. The "wait" is assumed to be relatively short. However there is a sub-class of NOT READY that have the "... in progress" phrase in their additional sense code and these can take much longer. Following on from commit 505aa4b6a883 ("scsi: sd: Defer spinning up drive while SANITIZE is in progress") we now have element depopulation and restoration that can take a long time. For example, over 24 hours for a 20 TB, 7200 rpm hard disk to depopulate 1 of its 20 elements. Add handling of ASC/ASCQ: 0x4,0x24 (depopulation in progress) and ASC/ASCQ: 0x4,0x25 (depopulation restoration in progress) to sd.c . The scsi_lib.c has incomplete handling of these two messages, so complete it. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20231015050650.131145-1-dgilbert@interlog.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: ufs: core: Add support for parsing OPP	Manivannan Sadhasivam
	OPP framework can be used to scale the clocks along with other entities such as regulators, performance state etc... So let's add support for parsing OPP from devicetree. OPP support in devicetree is added through the "operating-points-v2" property which accepts the OPP table defining clock frequency, regulator voltage, power domain performance state etc... Since the UFS controller requires multiple clocks to be controlled for proper working, devm_pm_opp_set_config() has been used which supports scaling multiple clocks through custom ufshcd_opp_config_clks() callback. It should be noted that the OPP support is not compatible with the old "freq-table-hz" property. So only one can be used at a time even though the UFS core supports both. Co-developed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20231012172129.65172-4-manivannan.sadhasivam@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: ufs: core: Add OPP support for scaling clocks and regulators	Manivannan Sadhasivam
	UFS core is only scaling the clocks during devfreq scaling and initialization. But for an optimum power saving, regulators should also be scaled along with the clocks. So let's use the OPP framework which supports scaling clocks, regulators, and performance state using OPP table defined in devicetree. For accomodating the OPP support, the existing APIs (ufshcd_scale_clks, ufshcd_is_devfreq_scaling_required and ufshcd_devfreq_scale) are modified to accept "freq" as an argument which in turn used by the OPP helpers. The OPP support is added along with the old freq-table based clock scaling so that the existing platforms work as expected. Co-developed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20231012172129.65172-3-manivannan.sadhasivam@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: ufs: dt-bindings: common: Add OPP table	Krzysztof Kozlowski
	Except scaling UFS and bus clocks, it's necessary to scale also the voltages of regulators or power domain performance state levels. Adding Operating Performance Points table allows to adjust power domain performance state, depending on the UFS clock speed. OPPv2 deprecates previous property limited to clock scaling: freq-table-hz. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20231012172129.65172-2-manivannan.sadhasivam@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	Merge patch series "scsi: scsi_debug: Add error injection for single device"	Martin K. Petersen
	Wenchao Hao <haowenchao2@huawei.com> says: The original error injection mechanism was based on scsi_host which could not inject fault for a single SCSI device. This patchset provides the ability to inject errors for a single SCSI device. Now we support inject timeout errors, queuecommand errors, and hostbyte, driverbyte, statusbyte, and sense data for specific SCSI Command. Two new error injection is defined to make abort command or reset LUN failed. Besides error injection for single device, this patchset add a new interface to make reset target failed for each scsi_target. The first two patch add a debugfs interface to add and inquiry single device's error injection info; the third patch defined how to remove an injection which has been added. The following 5 patches use the injection info and generate the related error type. The last two just add a new interface to make reset target failed and control scsi_device's allow_restart flag. Link: https://lore.kernel.org/r/20231010092051.608007-1-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Add param to control sdev's allow_restart	Wenchao Hao
	Add new module param "allow_restart" to control scsi_device's allow_restart flag. This flag determines if EH is triggered after a command completes with sense_key 0x6, ASC 0x4 and ASCQ 0x2. EH would be triggered if allow_restart=1 in this condition. The new param can be used with the error injection capability to test how commands completing with sense_key 0x6, ASC 0x4 and ASCQ 0x2 are handled. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-11-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Add debugfs interface to fail target reset	Wenchao Hao
	The interface is found at /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t> identifies the target to inject errors on. It's a simple bool type interface which would make this target's reset fail if set to 'Y'. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-10-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Add new error injection type: Reset LUN failed	Wenchao Hao
	Add error injection type 4 to make scsi_debug_device_reset() return FAILED. Fail abort command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x4 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "4 -10 0x12" > ${error} will make the device return FAILED when trying to reset LUN with inquiry command 10 times. error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "4 -10 0xff" > ${error} will make the device return FAILED when trying to reset LUN 10 times. Usually we do not care about what command it is when trying to perform reset LUN, so 0xff could be applied. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-9-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Add new error injection type: Abort Failed	Wenchao Hao
	Add error injection type 3 to make scsi_debug_abort() return FAILED. Fail abort command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x3 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "3 -10 0x12" > ${error} will make the device return FAILED when aborting inquiry command 10 times. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-8-haowenchao2@huawei.com Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Set command result and sense data if error is injected	Wenchao Hao
	If a fail command error is injected, set the command's status and sense data then finish this SCSI command. Set SCSI command's status and sense data format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x2 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error Count \| \| \| \| 0: the rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| 4 \| x8 \| Host byte in scsi_cmd::status \| \| \| \| [scsi_cmd::status has 32 bits holding these 3 bytes] \| +--------+------+-------------------------------------------------------+ \| 5 \| x8 \| Driver byte in scsi_cmd::status \| +--------+------+-------------------------------------------------------+ \| 6 \| x8 \| SCSI Status byte in scsi_cmd::status \| +--------+------+-------------------------------------------------------+ \| 7 \| x8 \| SCSI Sense Key in scsi_cmnd \| +--------+------+-------------------------------------------------------+ \| 8 \| x8 \| SCSI ASC in scsi_cmnd \| +--------+------+-------------------------------------------------------+ \| 9 \| x8 \| SCSI ASCQ in scsi_cmnd \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >${error} will make device's read command return with media error with additional sense of "Unrecovered read error" (UNC): Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-7-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Return failed value if error is injected	Wenchao Hao
	If a fail queuecommand error is injected, return the failed value defined in the rule from queuecommand. Make queuecommand return format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x1 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| 4 \| x32 \| The queuecommand() return value we want \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "1 1 0x12 0x1055" > ${error} will make each INQUIRY command sent to that device return 0x1055 (SCSI_MLQUEUE_HOST_BUSY). Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-6-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Time out command if the error is injected	Wenchao Hao
	If a timeout error is injected, return 0 from scsi_debug_queuecommand to make the command time out. Time out SCSI command format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error type, fixed to 0x0 \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ Examples: error=/sys/kernel/debug/scsi_debug/0:0:0:1/error echo "0 -10 0x12" > ${error} will make the device's inquiry command time out 10 times. echo "0 1 0x12" > ${error} will make the device's inquiry time out each time it is invoked on this device. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Define grammar to remove added error injection	Wenchao Hao
	The grammar to remove error injection is a line with fixed 3 columns separated by spaces. First column is fixed to "-". It tells this is a removal operation. Second column is the error code to match. Third column is the scsi command to match. For example the following command would remove timeout injection of inquiry command: echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-4-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Add interface to manage error injection for a single device	Wenchao Hao
	This new facility uses the debugfs pseudo file system which is typically mounted under the /sys/kernel/debug directory and requires root permissions to access. The interface file is found at /sys/kernel/debug/scsi_debug/<h:c:t:l>/error where <h:c:t:l> identifies the device (logical unit (LU)) to inject errors on. For the following description the ${error} environment variable is assumed to be set to/sys/kernel/debug/scsi_debug/1:0:0:0/error where 1:0:0:0 is a pseudo device (LU) owned by the scsi_debug driver. Rules are written to ${error} in the normal sysfs fashion (e.g. 'echo "0 -2 0x12" > ${error}'). More than one rule can be active on a device at a time and inactive rules (i.e. those whose error count is 0) remain in the rule listing. The existing rules can be read with 'cat ${error}' with oneline output for each rule. The interface format is line-by-line, each line is an error injection rule. Each rule contains integers separated by spaces, the first three columns correspond to "Error code", "Error count" and "SCSI command", other columns depend on Error code. General rule format: +--------+------+-------------------------------------------------------+ \| Column \| Type \| Description \| +--------+------+-------------------------------------------------------+ \| 1 \| u8 \| Error code \| \| \| \| 0: timeout SCSI command \| \| \| \| 1: fail queuecommand, make queuecommand return \| \| \| \| given value \| \| \| \| 2: fail command, finish command with SCSI status, \| \| \| \| sense key and ASC/ASCQ values \| \| \| \| 3: make abort commands for specific command fail \| \| \| \| 4: make reset lun for specific command fail \| +--------+------+-------------------------------------------------------+ \| 2 \| s32 \| Error count \| \| \| \| 0: this rule will be ignored \| \| \| \| positive: the rule will always take effect \| \| \| \| negative: the rule takes effect n times where -n is \| \| \| \| the value given. Ignored after n times \| +--------+------+-------------------------------------------------------+ \| 3 \| x8 \| SCSI command opcode, 0xff for all commands \| +--------+------+-------------------------------------------------------+ \| ... \| xxx \| Error type specific fields \| +--------+------+-------------------------------------------------------+ Notes: - When multiple error inject rules are added for the same SCSI command, the one with smaller error code will take effect (and the others will be ignored). - If the same error (i.e. same Error code and SCSI command) is added, the older one will be overwritten.. - Currently, the basic types are (u8/u16/u32/u64/s8/s16/s32/s64) and the hexadecimal types (x8/x16/x32/x64). - Where a hexadecimal value is expected (e.g. Column 3: SCSI command opcode) the "0x" prefix is optional on the value (e.g. the INQUIRY opcode can be given as '0x12' or '12'). - When the Error count is negative, reading ${error} will show that value incrementing, stopping when it gets to 0. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-3-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16	scsi: scsi_debug: Create scsi_debug directory in the debugfs filesystem	Wenchao Hao
	Create directory scsi_debug in the root of the debugfs filesystem. Prepare to add interface for manage error injection. Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20231010092051.608007-2-haowenchao2@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	Merge patch series "lpfc: Update lpfc to revision 14.2.0.15"	Martin K. Petersen
	Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.15 This patch set contains error handling fixes, ELS bug fixes, and logging improvements. The patches were cut against Martin's 6.7/scsi-queue tree. Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Update lpfc version to 14.2.0.15	Justin Tee
	Update lpfc version to 14.2.0.15. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Introduce LOG_NODE_VERBOSE messaging flag	Justin Tee
	The preexisting LOG_NODE message flag frequently spams a subset of the same log messages during normal FC driver operations. When analyzing driver logs, this sometimes leads to difficulty in troubleshooting. Because LOG_IP log message flag is unused, convert it to a new LOG_NODE_VERBOSE flag. The LOG_NODE_VERBOSE shall specifically be used for diagnosing issues that require precise ndlp tracking detail. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Validate ELS LS_ACC completion payload	Justin Tee
	A WCQE success completion status does not guarantee valid LS_ACC receipt for ELS commands. So, introduce a small helper routine that validates ELS LS_ACC frames in ELS cmpl routines. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Reject received PRLIs with only initiator fcn role for NPIV ports	Justin Tee
	Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI requests. For an NPIV port, there is no point to PRLI_ACC if the received PRLI request has the initiator function bit set and the target function bit unset. Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT in such cases. NPIV ports are expected to send PRLI_ACC only if the Target function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Treat IOERR_SLI_DOWN I/O completion status the same as pci offline	Justin Tee
	During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is set by the driver for all outstanding I/Os. In such hardware error attention cases, we can treat the situation exactly the same as pci_channel_offline. Thus, add IOERR_SLI_DOWN status to the same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: lpfc: Remove unnecessary zero return code assignment in ↵	Justin Tee
	lpfc_sli4_hba_setup In order to enter the !rc if statement block in question, rc had to have been zero to begin with. Thus, the rc = 0 assignment is unnecessary and can be removed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	Merge patch series "megaraid_sas: Driver version update to 07.727.03.00-rc1"	Martin K. Petersen
	Chandrakanth patil <chandrakanth.patil@broadcom.com> says: This set of patches includes critical fixes, and updates to the maintainer list. Link: https://lore.kernel.org/r/20231003110021.168862-1-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: megaraid_sas: Revision of Maintainer List	Chandrakanth patil
	Given my active involvement in megaraid_sas development, I am including myself in the maintainers list. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-5-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: megaraid_sas: Driver version update to 07.727.03.00-rc1	Chandrakanth patil
	Driver version update. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-4-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: megaraid_sas: Log message when controller reset is requested but not ↵	Chandrakanth patil
	issued The driver now includes the print message 'IO is completed, no reset is required' when a reset is requested but not issued. This message is displayed only when pending SCSI IO is completed before issuing the reset. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-3-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: megaraid_sas: Increase register read retry rount from 3 to 30 for ↵	Chandrakanth patil
	selected registers In BMC environments with concurrent access to multiple registers, certain registers occasionally yield a value of 0 even after 3 retries due to hardware errata. As a fix, we have extended the retry count from 3 to 30. The same errata applies to the mpt3sas driver, and a similar patch has been accepted. Please find more details in the mpt3sas patch reference link. Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Fixes: 272652fcbf1a ("scsi: megaraid_sas: add retry logic in megasas_readl") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-2-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	Merge patch series "scsi: sshdr and retry fixes"	Martin K. Petersen
	Mike Christie <michael.christie@oracle.com> says: The following patches were made over Linus tree (Martin's 6.7 branch was missing some changes to sd.c). They only contain the sshdr and rdac retry fixes from the "Allow scsi_execute users to control retries" patchset. The patches in this set are reviewed and tested but the changes to how we do retries will take a little longer and require more testing, so I broke up the series to make them easier to review. Link: https://lore.kernel.org/r/20231004210013.5601-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: sr: Fix sshdr use in sr_get_events	Mike Christie
	If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-13-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13	scsi: sd: Fix sshdr use in cache_type_store	Mike Christie
	If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-12-michael.christie@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>