summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-22scsi: ata: libata: Set read/write commands CDL indexDamien Le Moal
For devices supporting the command duration limits feature, translate the dld field of read and write operation to set the command duration limit index field of the command task file when the duration limit feature is enabled. The function ata_set_tf_cdl() is introduced to do this. For unqueued (non NCQ) read and write operations, this function sets the command duration limit index set as the lower 3 bits of the feature field. For queued NCQ read/write commands, the index is set as the lower 3 bits of the auxiliary field. The flag ATA_QCFLAG_HAS_CDL is introduced to indicate that a command taskfile has a non zero cdl field. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-19-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata: Add ATA feature control sub-page translationDamien Le Moal
Add support for the ATA feature control sub-page of the control mode page to enable/disable the command duration limits feature using the cdl_ctrl field of the ATA feature control sub-page. Both mode sense and mode select translation are supported. For mode sense, the ata device flag ATA_DFLAG_CDL_ENABLED is used to cache the status of the command duration limits feature. Enabling this feature is done using a SET FEATURES command with a cdl action set to 1 when the page cdl_ctrl field value is 0x2 (T2A and T2B pages supported). If this field is 0, CDL is disabled using the SET FEATURES command with a cdl action set to 0. Since a device CDL and NCQ priority features should not be used simultaneously, ata_mselect_control_ata_feature() returns an error when attempting to enable CDL with the device priority feature enabled. Conversely, the function ata_ncq_prio_enable_store() used to enable the use of the device NCQ priority feature through sysfs is modified to return an error if the device CDL feature is enabled. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-18-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata-scsi: Add support for CDL pages mode senseDamien Le Moal
Modify ata_scsiop_mode_sense() and ata_msense_control() to support mode sense access to the T2A and T2B sub-pages of the control mode page. ata_msense_control() is modified to support sub-pages. The T2A sub-page is generated using the read descriptors of the command duration limits log page 18h. The T2B sub-page is generated using the write descriptors of the same log page. With the addition of these sub-pages, getting all sub-pages of the control mode page is also supported by increasing the value of ATA_SCSI_RBUF_SIZE from 576B up to 2048B to ensure that all sub-pages fit in the fill buffer. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-17-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata-scsi: Handle CDL bits in ata_scsiop_maint_in()Damien Le Moal
For a scsi MAINTENANCE_IN/MI_REPORT_SUPPORTED_OPERATION_CODES operation, add the translation of the rwcdlp and cdlp bits for the READ 16 and WRITE 16 commands. If the ATA device does not support command duration limits, these bits are always 0. If the ATA device supports command duration limits, the rwcdlp bit is set to 1 for READ 16 and WRITE 16 and the cdlp bits are set to 0x1 for READ 16 and 0x2 for WRITE 16. These correspond to the T2A mode page containing the read descriptors and to the T2B mode page containing the write descriptors, as defined in SAT-5. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-16-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata: Detect support for command duration limitsDamien Le Moal
Use the supported capabilities identify device data log page to detect if a device supports the command duration limits feature. For devices supporting this feature, set the device flag ATA_DFLAG_CDL. To support SCSI-ATA translation, retrieve the command duration limits log page 18h and cache this page content using the cdl array added to the ata_device data structure. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-15-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata: Change ata_eh_request_sense() to not set CHECK_CONDITIONNiklas Cassel
Currently, ata_eh_request_sense() unconditionally sets the scsicmd->result to SAM_STAT_CHECK_CONDITION. For Command Duration Limits policy 0xD: The device shall complete the command without error (SAM_STAT_GOOD) with the additional sense code set to DATA CURRENTLY UNAVAILABLE. It is perfectly fine to have sense data for a command that returned completion without error. In order to support for CDL policy 0xD, we have to remove this assumption that having sense data means that the command failed (SAM_STAT_CHECK_CONDITION). Change ata_eh_request_sense() to not set SAM_STAT_CHECK_CONDITION, and instead move the setting of SAM_STAT_CHECK_CONDITION to the single caller that wants SAM_STAT_CHECK_CONDITION set, that way ata_eh_request_sense() can be reused in a follow-up patch that adds support for CDL policy 0xD. The only caller of ata_eh_request_sense() is protected by: if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)), so we can remove this duplicated check from ata_eh_request_sense() itself. Additionally, ata_eh_request_sense() is only called from ata_eh_analyze_tf(), which is only called when iteratating the QCs using ata_qc_for_each_raw(), which does not include the internal tag, so cmd can never be NULL (all non-internal commands have qc->scsicmd set), so remove the !cmd check as well. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-14-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: ata: libata-scsi: Remove unnecessary !cmd checksNiklas Cassel
There is no need to check if !cmd as this can only happen for ATA internal commands which uses the ATA internal tag (32). Most users of ata_scsi_set_sense() are from _xlat functions that translate a scsicmd to an ATA command. These obviously have a qc->scsicmd. ata_scsi_qc_complete() can also call ata_scsi_set_sense() via ata_gen_passthru_sense() / ata_gen_ata_sense(), called via ata_scsi_qc_complete(). This callback is only called for translated commands, so it also has a qc->scsicmd. ata_eh_analyze_ncq_error(): the NCQ error log can only contain a 0-31 value, so it will never be able to get the ATA internal tag (32). ata_eh_request_sense(): only called by ata_eh_analyze_tf(), which is only called when iteratating the QCs using ata_qc_for_each_raw(), which does not include the internal tag. Since there is no existing call site where cmd can be NULL, remove the !cmd check from ata_scsi_set_sense() and ata_scsi_set_sense_information(). Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-13-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: sd: Handle read/write CDL timeout failuresNiklas Cassel
Commands using a duration limit descriptor that has limit policies set to a value other than 0x0 may be failed by the device if one of the limits are exceeded. For such commands, since the failure is the result of the user duration limit configuration and workload, the commands should not be retried and terminated immediately. Furthermore, to allow the user to differentiate these "soft" failures from hard errors due to hardware problem, a different error code than EIO should be returned. There are 2 cases to consider: (1) The failure is due to a limit policy failing the command with a check condition sense key, that is, any limit policy other than 0xD. For this case, scsi_check_sense() is modified to detect failures with the ABORTED COMMAND sense key and the COMMAND TIMEOUT BEFORE PROCESSING or COMMAND TIMEOUT DURING PROCESSING or COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR RECOVERY additional sense code. For these failures, a SUCCESS disposition is returned so that scsi_finish_command() is called to terminate the command. (2) The failure is due to a limit policy set to 0xD, which result in the command being terminated with a GOOD status, COMPLETED sense key, and DATA CURRENTLY UNAVAILABLE additional sense code. To handle this case, the scsi_check_sense() is modified to return a SUCCESS disposition so that scsi_finish_command() is called to terminate the command. In addition, scsi_decide_disposition() has to be modified to see if a command being terminated with GOOD status has sense data. This is as defined in SCSI Primary Commands - 6 (SPC-6), so all according to spec, even if GOOD status commands were not checked before. If scsi_check_sense() detects sense data representing a duration limit, scsi_check_sense() will set the newly introduced SCSI ML byte SCSIML_STAT_DL_TIMEOUT. This SCSI ML byte is checked in scsi_noretry_cmd(), so that a command that failed because of a CDL timeout cannot be retried. The SCSI ML byte is also checked in scsi_result_to_blk_status() to complete the command request with the BLK_STS_DURATION_LIMIT status, which result in the user seeing ETIME errors for the failed commands. Co-developed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-12-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: sd: Set read/write command CDL indexDamien Le Moal
Introduce the command duration limits helper function sd_cdl_dld() to set the DLD bits of READ/WRITE 16 and READ/WRITE 32 commands to indicate to the device the command duration limit descriptor to apply to the commands. When command duration limits are enabled, sd_cdl_dld() obtains the index of the descriptor to apply to the command using the hints field of the request IO priority value (hints IOPRIO_HINT_DEV_DURATION_LIMIT_1 to IOPRIO_HINT_DEV_DURATION_LIMIT_7). If command duration limits is disabled (which is the default), the limit index "0" is always used to indicate "no limit" for a command. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-11-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Allow enabling and disabling command duration limitsDamien Le Moal
Add the sysfs scsi_device attribute cdl_enable to allow a user to enable or disable a device command duration limits feature. CDL is disabled by default. This feature must be explicitly enabled by a user by setting the cdl_enable attribute to 1. The new function scsi_cdl_enable() does not do anything beside setting the cdl_enable field of struct scsi_device in the case of a (real) SCSI device (e.g. a SAS HDD). For ATA devices, the command duration limits feature needs to be enabled/disabled using the ATA feature sub-page of the control mode page. To do so, the scsi_cdl_enable() function checks if this mode page is supported using scsi_mode_sense(). If it is, scsi_mode_select() is used to enable and disable CDL. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-10-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Detect support for command duration limitsDamien Le Moal
Introduce the function scsi_cdl_check() to detect if a device supports command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32 and WRITE 32 commands are checked using the function scsi_report_opcode() to probe the rwcdlp and cdlp bits as they indicate the mode page defining the command duration limits descriptors that apply to the command being tested. If any of these commands support CDL, the field cdl_supported of struct scsi_device is set to 1 to indicate that the device supports CDL. Support for CDL for a device is advertizes through sysfs using the new cdl_supported device attribute. This attribute value is 1 for a device supporting CDL and 0 otherwise. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-9-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Support Service Action in scsi_report_opcode()Damien Le Moal
The REPORT_SUPPORTED_OPERATION_CODES command allows checking for support of commands that have the same opcode but different service actions, such as READ 32 and WRITE 32. However, the current implementation of scsi_report_opcode() only allows checking an operation code without a service action differentiation. Add the "sa" argument to scsi_report_opcode() to allow passing a service action. If a non-zero service action is specified, the reporting options field value is set to 3 to have the service action field taken into account by the device. If no service action field is specified (zero), the reporting options field is set to 1 as before. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-8-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Support retrieving sub-pages of mode pagesDamien Le Moal
Allow scsi_mode_sense() to retrieve sub-pages of mode pages by adding the subpage argument. Change all the current caller sites to specify the subpage 0. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-7-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Rename and move get_scsi_ml_byte()Niklas Cassel
SCSI has two different getters: - get_XXX_byte() (in scsi_cmnd.h) which takes a struct scsi_cmnd *, and - XXX_byte() (in scsi.h) which takes a scmd->result. The proper name for get_scsi_ml_byte() should thus be without the get_ prefix, as it takes a scmd->result. Rename the function to rectify this. (This change was suggested by Mike Christie.) Additionally, move get_scsi_ml_byte() to scsi_priv.h since both scsi_lib.c and scsi_error.c will need to use this helper in a follow-up patch. Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-6-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: core: Allow libata to complete successful commands via EHNiklas Cassel
In SCSI, we get the sense data as part of the completion, for ATA however, we need to fetch the sense data as an extra step. For an aborted ATA command the sense data is fetched via libata's ->eh_strategy_handler(). For Command Duration Limits policy 0xD: The device shall complete the command without error with the additional sense code set to DATA CURRENTLY UNAVAILABLE. In order to handle this policy in libata, we intend to send a successful command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the sense data for the good command. This is similar to how we handle an aborted ATA command, just that we need to read the Successful NCQ Commands log instead of the NCQ Command Error log. When we get a SATA completion with successful commands, ATA_SENSE will be set, indicating that some commands in the completion have sense data. The sense_valid bitmask in the Sense Data for Successful NCQ Commands log will inform exactly which commands that had sense data, which might be a subset of all the commands that was completed in the same completion. (Yet all will have ATA_SENSE set, since the status is per completion.) The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will not set the scmd->result to DID_TIME_OUT for these commands. However, the successful commands that did not have sense data, must not get their result marked as DID_TIME_OUT by SCSI EH. Add a new flag SCMD_FORCE_EH_SUCCESS, which tells SCSI EH to not mark a command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. This will be used by libata in a subsequent commit. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-5-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: block: Introduce BLK_STS_DURATION_LIMITDamien Le Moal
Introduce the new block I/O status BLK_STS_DURATION_LIMIT for LLDDs to report command that failed due to a command duration limit being exceeded. This new status is mapped to the ETIME error code to allow users to differentiate "soft" duration limit failures from other more serious hardware related errors. If we compare BLK_STS_DURATION_LIMIT with BLK_STS_TIMEOUT: -BLK_STS_DURATION_LIMIT means that the drive gave a reply indicating that the command duration limit was exceeded before the command could be completed. This I/O status is mapped to ETIME for user space. -BLK_STS_TIMEOUT means that the drive never gave a reply at all. This I/O status is mapped to ETIMEDOUT for user space. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-4-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: block: Introduce ioprio hintsDamien Le Moal
I/O priorities currently only use 6-bits of the 16-bits ioprio value: the 3-upper bits are used to define up to 8 priority classes (4 of which are valid) and the 3 lower bits of the value are used to define a priority level for the real-time and best-effort class. The remaining 10-bits between the I/O priority class and level are unused, and in fact, cannot be used by the user as doing so would either result in the value being completely ignored, or in an error returned by ioprio_check_cap(). Use these 10-bits of an ioprio value to allow a user to specify I/O hints. An I/O hint is defined as a 10-bitsvalue, allowing up to 1023 different hints to be specified, with the value 0 being reserved as the "no hint" case. An I/O hint can apply to any I/O that specifies a valid priority class other than NONE, regardless of the I/O priority level specified. To do so, the macros IOPRIO_PRIO_HINT() and IOPRIO_PRIO_VALUE_HINT() are introduced in include/uapi/linux/ioprio.h to respectively allow a user to get and set a hint in an ioprio value. To support the ATA and SCSI command duration limits feature, 7 hints are defined: IOPRIO_HINT_DEV_DURATION_LIMIT_1 to IOPRIO_HINT_DEV_DURATION_LIMIT_7, allowing a user to specify which command duration limit descriptor should be applied to the commands serving an I/O. Specifying these hints has for now no effect whatsoever if the target block devices do not support the command duration limits feature. However, in the future, block I/O schedulers can be modified to optimize I/O issuing order based on these hints, even for devices that do not support the command duration limits feature. Given that the 7 duration limits hints defined have no effect on any block layer component, the actual definition of the duration limits implied by these hints remains at the device level. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-3-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22scsi: block: ioprio: Clean up interface definitionDamien Le Moal
The I/O priority user interface defines the 16-bits ioprio values as the combination of the upper 3-bits for an I/O priority class and the lower 13-bits as priority data. However, the kernel only uses the lower 3-bits of the priority data to define priority levels for the RT and BE priority classes. The data part of an ioprio value is completely ignored for the IDLE and NONE classes. This is enforced by checks done in ioprio_check_cap(), which is called for all paths that allow defining an I/O priority for I/Os: the per-context ioprio_set() system call, aio interface and io_uring interface. Clarify this fact in the uapi ioprio.h header file and introduce the IOPRIO_PRIO_LEVEL_MASK and IOPRIO_PRIO_LEVEL() macros for users to define and get priority levels in an ioprio value. The coarser macro IOPRIO_PRIO_DATA() is retained for backward compatibility with old applications already using it. There is no functional change introduced with this. In-kernel users of the IOPRIO_PRIO_DATA() macro which are explicitly handling I/O priority data as a priority level are modified to use the new IOPRIO_PRIO_LEVEL() macro without any functional change. Since f2fs is the only user of this macro not explicitly using that value as a priority level, it is left unchanged. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-2-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-22Merge patch series "Use block pr_ops in LIO"Martin K. Petersen
Mike Christie <michael.christie@oracle.com> says: The patches in this thread allow us to use the block pr_ops with LIO's target_core_iblock module to support cluster applications in VMs. They were built over Linus's tree. They also apply over linux-next and Martin's tree and Jens's trees. Currently, to use windows clustering or linux clustering (pacemaker + cluster labs scsi fence agents) in VMs with LIO and vhost-scsi, you have to use tcmu or pscsi or use a cluster aware FS/framework for the LIO pr file. Setting up a cluster FS/framework is pain and waste when your real backend device is already a distributed device, and pscsi and tcmu are nice for specific use cases, but iblock gives you the best performance and allows you to use stacked devices like dm-multipath. So these patches allow iblock to work like pscsi/tcmu where they can pass a PR command to the backend module. And then iblock will use the pr_ops to pass the PR command to the real devices similar to what we do for unmap today. The patches are separated in the following groups: Patch 1 - 2: - Add block layer callouts for reading reservations and rename reservation error code. Patch 3 - 5: - SCSI support for new callouts. Patch 6: - DM support for new callouts. Patch 7 - 13: - NVMe support for new callouts. Patch 14 - 18: - LIO support for new callouts. This patchset has been tested with the libiscsi PGR ops and with window's failover cluster verification test. Note that for scsi backend devices we need this patchset: https://lore.kernel.org/linux-scsi/20230123221046.125483-1-michael.christie@oracle.com/T/#m4834a643ffb5bac2529d65d40906d3cfbdd9b1b7 to handle UAs. To reduce the size of this patchset that's being done separately to make reviewing easier. And to make merging easier this patchset and the one above do not have any conflicts so can be merged in different trees. Link: https://lore.kernel.org/r/20230407200551.12660-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: dc395x: Documentation: Reword original driver attributionBagas Sanjaya
The Linux kernel isn't in 2.6.x anymore, but rather the major version has advanced much (currently 6.x). Reword the attribution. Also, replace 404'ed 2.4 driver link with web.archive.org snapshot [1]. Link: https://web.archive.org/web/20140129181343/http://www.garloff.de/kurt/linux/dc395/ [1] Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230510093933.19985-4-bagasdotme@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: dc395x: Documentation: Replace non-functional twibble.org listBagas Sanjaya
Sync mailing list address in the documentation to follow MAINTAINERS. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230510093933.19985-3-bagasdotme@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: MAINTAINERS: Drop DC395x list and siteBagas Sanjaya
Emails to DC395x list bounce (550 error) and visiting the site returns 404 page. Drop both twibble.org links. The driver should now be covered by linux-scsi list. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230510093933.19985-2-bagasdotme@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: MAINTAINERS: Add a libsas entryJason Yan
John has been reviewing libsas patches for years. And I have been contributing to libsas for years and I am interested in reviewing and testing libsas patches too. So add a libsas entry and add John and me as reviewer. Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20230516110131.388634-1-yanaijie@huawei.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Acked-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: qla2xxx: Replace all non-returning strlcpy() with strscpy()Azeem Shaikh
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025404.2843867-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: qla4xxx: Replace all non-returning strlcpy() with strscpy()Azeem Shaikh
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025355.2835898-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: target: Replace all non-returning strlcpy() with strscpy()Azeem Shaikh
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025322.2804923-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: bfa: Replace all non-returning strlcpy() with strscpy()Azeem Shaikh
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516013345.723623-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16Merge patch series "scsi: hisi_sas: Some misc changes"Martin K. Petersen
Xiang Chen <chenxiang66@hisilicon.com> says: This series contains some fixes including: - Configure initial value of some registers according to HBA model - Change DMA setup lock timeout from 100ms to 2.5s - Fix warnings detected by sparse Link: https://lore.kernel.org/r/1684118481-95908-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: hisi_sas: Fix warnings detected by sparseXingui Yang
This patch fixes the following warning: drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:2168:43: sparse: sparse: restricted __le32 degrades to integer Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304161254.NztCVZIO-lkp@intel.com/ Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: hisi_sas: Change DMA setup lock timeout to 2.5sXingui Yang
DMA setup lock timeout protection is added when DMA setup frames are received. It's a function outside the protocol and used to prevent SATA disk I/Os from being delivered for a long time. The default value is 100ms, it's too strict and easily triggered timeout when the disk is overloaded or faulty. Based on the average I/O latency of 300 disks, we adjust the value to 2.5s. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: hisi_sas: Configure initial value of some registers according to HBA modelYihang Li
For SAS HBAs of 920 and previous version, we use init_reg_v3_hw() to set some registers which are related to HW boards. For SAS HBAs of 920B and later version, those HW registers are set through firmware. And different HBA models are distinguished through pci_dev->revision. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: megaraid_sas: Convert union megasas_sgl to flex-arraysKees Cook
In the ongoing effort to replace all fake flexible arrays with true flexible arrays, replace the sge32, sge64, and sge_skinny members of union megasas_sgl with true flexible arrays. No binary differences are seen after this change; sizes were already being manually calculated using the member struct sizes directly. Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: megaraidlinux.pdl@broadcom.com Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230511220957.never.919-kees@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-16scsi: ufs: hwmon: Constify pointers to hwmon_channel_infoKrzysztof Kozlowski
Statically allocated array of pointers to hwmon_channel_info can be made const for safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230511175204.281038-1-krzysztof.kozlowski@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08Merge patch series "smartpqi updates"Martin K. Petersen
Don Brace <don.brace@microchip.com> says: These patches are based on Martin Petersen's 6.4/scsi-queue tree https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git 6.4/scsi-queue This set of changes consists of: * Map entire BAR 0. The driver was mapping up to and including the controller registers, but not all of BAR 0. * Add PCI IDs to support new controllers. * Clean up some code by removing unnecessary NULL checks. This cleanup is a result of a Coverity report. * Correct a rare memory leak whenever pqi_sas_port_add_rhpy() returns an error. This was Suggested by: Yang Yingliang <yangyingliang@huawei.com> * Remove atomic operations on variable raid_bypass_cnt. Accuracy is not required for driver operation. Change type from atomic_t to unsigned int. * Correct a rare drive hot-plug removal issue where we get a NULL io_request. We added a check for this condition. * Turn on NCQ priority for AIO requests to disks comprising RAID devices. * Correct byte aligned writew() operations on some ARM servers. Changed the writew() to two writeb() operations. * Change how the driver checks for a sanitize operation in progress. We were using TEST UNIT READY. We removed the TEST UNIT READY code and are now using the controller's firmware information in order to avoid issues caused by drives failing to complete TEST UNIT READY. * Some customers have been requesting that we add the NUMA node to /sys/block/sd<scsi device>/device like the nvme driver does. * Update the copyright information to match the current year. * Bump the driver version to 2.1.22-040. Link: https://lore.kernel.org/r/20230428153712.297638-1-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08Merge patch series "scsi: pm80xx: Enhanced debug logs for HW events"Martin K. Petersen
Pranav Prasad <pranavpp@google.com> says: This patch series enhances debug logs for pm80xx HW events, and provides a minor fix in the case of a hard reset. The log enhancement involves changing the log severity level to enable logging for HW events which consequently help debug disk discovery issues. 1. Changed log severity level from MSG to EVENT for HW events. Enhanced the HW event logs by adding the phyid. 2. Enabled INIT logging. 3. Log portid along with the PHY_UP event. 4. Print phyid and portid sent as part of device registration request. 5. Log port state during HW events. 6. Update phy_state and phy_attached to correct values after a hard reset. Link: https://lore.kernel.org/r/20230418190101.696345-1-pranavpp@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08Merge patch series "scsi: libsas: remove empty branches and code simplification"Martin K. Petersen
Jason Yan <yanaijie@huawei.com> says: Three patches to remove two empty branches and a little code simplification. Link: https://lore.kernel.org/r/20230421093744.1583609-1-yanaijie@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08Merge patch series "qla2xxx driver update"Martin K. Petersen
Nilesh Javali <njavali@marvell.com> says: Please apply the qla2xxx driver enhancement and bug fixes to the scsi tree at your earliest convenience. Link: https://lore.kernel.org/r/20230428075339.32551-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08Merge patch series "lpfc: Update lpfc to revision 14.2.0.12"Martin K. Petersen
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.12 This patch set contains fixes flagged by code analyzer tools, introduces a new CQE status to handle DMA errors, and replaces the usage of blk interrupts with threaded interrupts. The patches were cut against Martin's 6.4/scsi-queue tree. Link: https://lore.kernel.org/r/20230417191558.83100-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: ufs: ufs-mediatek: Delete some dead codeDan Carpenter
There is already a test for "if (val == state)" earlier so it's not possible here. Delete the dead code. Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/68fce64f-4970-45f1-807e-6c0eecdfcdc2@kili.mountain Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: qedf: Fix NULL dereference in error handlingJinhong Zhu
Smatch reported: drivers/scsi/qedf/qedf_main.c:3056 qedf_alloc_global_queues() warn: missing unwind goto? At this point in the function, nothing has been allocated so we can return directly. In particular the "qedf->global_queues" have not been allocated so calling qedf_free_global_queues() will lead to a NULL dereference when we check if (!gl[i]) and "gl" is NULL. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Signed-off-by: Jinhong Zhu <jinhongzhu@hust.edu.cn> Link: https://lore.kernel.org/r/20230502140022.2852-1-jinhongzhu@hust.edu.cn Reviewed-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: ufs: core: Change the module parameter macro of use_mcq_modeKeoseong Park
mcq_mode_ops uses only param_{set,get}_bool(). Therefore, convert module_param_cb() to module_param() and remove the mcq_mode_ops. Signed-off-by: Keoseong Park <keosung.park@samsung.com> Link: https://lore.kernel.org/r/20230427094420epcms2p1043333a3e0c0cf58e66164e0b83b3b02@epcms2p1 Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: mpi3mr: Use -ENOMEM instead of -1 in mpi3mr_expander_add()Harshit Mogalapalli
smatch warnings: drivers/scsi/mpi3mr/mpi3mr_transport.c:1449 mpi3mr_expander_add() warn: returning -1 instead of -ENOMEM is sloppy No functional change. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202303202027.ZeDQE5Ug-lkp@intel.com/ Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20230419064256.2532069-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Update version to 2.1.22-040Don Brace
Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-13-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Update copyright to 2023Don Brace
Update copyright to current year. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-12-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Add sysfs entry for NUMA node in /sys/block/sdX/deviceDon Brace
Although NUMA node is a PCIe device level attribute, it was requested the NUMA node be added for each exposed device similar to NVMe disks. Example for NVMe: /sys/block/nvme1c1n1/device/numa_node Example for smartpqi: /sys/block/sdh/device/numa_node cat /sys/block/sdh/device/numa_node 0 Reviewed-by: David Strahan <david.strahan@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-11-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Stop sending driver-initiated TURsKevin Barnett
Stop sending driver-initiated TURs to physical devices during driver load/rescan. Note: This does not affect SML initiated TURs. Some Linux kernels can cause lengthy delays in OS boot if the kernel detects that a drive is being sanitized/erased. We were using TURs to detect if a sanitize/erase was in progress. Some devices do not return the TUR in a timely manner, causing driver load/rescan stalls. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-10-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Fix byte aligned writew for ARM serversDon Brace
Correct OOPs on ARM servers during driver init. The driver attempts to update FW with max_feature_supported value using a writew() kernel call using a byte aligned address. This fails on some ARM systems. Change the writew() to two writeb() calls to update this value. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-9-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Add support for RAID NCQ priorityGilbert Wu
Enable NCQ priority feature for the RAID path when AIO path is disabled. Move function pqi_is_io_high_priority() up to avoid adding a prototype. Remove unused argument ctrl_info. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-8-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Validate block layer host tagMurthy Bhat
Prevent OS crashes when a drive is hot removed during I/O stress test. The I/O request pointer can be invalid if block layer provides incorrect multi-queue host tag. This can lead to invalid I/O request pointer dereference. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-7-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-05-08scsi: smartpqi: Remove contention for raid_bypass_cntMike McGowen
Reduce CPU contention when incrementing variable raid_bypass_cnt. Remove the atomic operations for this variable by changing the atomic to an unsigned int and replace atomic operations with standard operations. The value is only checked that it is increasing and accuracy is not required. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-6-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>