path: root/include/linux/blkdev.h
Age | Commit message | Author
2021-10-19 | block: change plugging to use a singly linked list | Jens Axboe
Use a singly linked list for the blk_plug. This saves 8 bytes in the blk_plug struct, and makes for faster list manipulations than doubly linked lists. As we don't use the doubly linked lists for anything, singly linked is just fine. This yields a bump in default (merging enabled) performance from 7.0 to 7.1M IOPS, and ~7.5M IOPS with merging disabled. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
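The 8-byte saving mentioned above can be illustrated with a small user-space sketch. The struct names and the `rq_count` field layout here are hypothetical stand-ins, not the kernel's actual `struct blk_plug`; the point is only that a doubly linked list head holds two pointers while a singly linked one holds one.

```c
#include <stddef.h>

struct request;                      /* opaque in this sketch */

/* Two pointers, in the style of the kernel's struct list_head. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* Hypothetical plug with a doubly linked request list. */
struct plug_doubly {
    struct list_head_sketch mq_list;
    unsigned short rq_count;
};

/* Hypothetical plug with a singly linked request list. */
struct plug_singly {
    struct request *mq_list;
    unsigned short rq_count;
};

/* Bytes saved by dropping the prev pointer: one pointer's worth. */
static size_t plug_bytes_saved(void)
{
    return sizeof(struct plug_doubly) - sizeof(struct plug_singly);
}
```

On a 64-bit build the difference is one pointer, which matches the 8 bytes quoted in the commit message.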
2021-10-18 | block: move the SECTOR_SIZE related definitions to blk_types.h | Christoph Hellwig
Ensure these are always available for inlines in the various block layer headers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211018101130.1838532-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: add a struct io_comp_batch argument to fops->iopoll() | Jens Axboe
struct io_comp_batch contains a list head and a completion handler, which will allow completions to more efficiently complete batches of IO. For now, no functional changes in this patch, we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: provide helpers for rq_list manipulation | Jens Axboe
Instead of open-coding the list additions, traversal, and removal, provide a basic set of helpers. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
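A minimal user-space sketch of what such helpers might look like for a singly linked request list follows. The `sketch_` names and the `rq_next` link field are assumptions for illustration; the kernel's real helpers are macros and inlines with their own names and signatures.

```c
#include <stddef.h>

/* Hypothetical request with an intrusive "next" link. */
struct sketch_rq {
    int tag;
    struct sketch_rq *rq_next;
};

/* Addition: push a request onto the head of the list. */
static void sketch_rq_list_add(struct sketch_rq **list, struct sketch_rq *rq)
{
    rq->rq_next = *list;
    *list = rq;
}

/* Removal: pop the head of the list, or return NULL if it is empty. */
static struct sketch_rq *sketch_rq_list_pop(struct sketch_rq **list)
{
    struct sketch_rq *rq = *list;

    if (rq) {
        *list = rq->rq_next;
        rq->rq_next = NULL;
    }
    return rq;
}

/* Traversal: walk the list and count its entries. */
static size_t sketch_rq_list_count(const struct sketch_rq *list)
{
    size_t n = 0;

    for (; list; list = list->rq_next)
        n++;
    return n;
}
```

Callers then stop touching the link field directly, so the list representation can change in one place.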
2021-10-18 | block: cache request queue in bdev | Pavel Begunkov
There are tons of places where we need to get a request_queue only having bdev, which turns into bdev->bd_disk->queue. There are probably a hundred of such places considering inline helpers, and enough of them are in hot paths. Cache queue pointer in struct block_device and make use of it in bdev_get_queue(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a3bfaecdd28956f03629d0ca5c63ebc096e1c809.1634219547.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
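The idea can be sketched in user space with mock structures. All type names below are hypothetical, and the cached field name `bd_queue` is an assumption; only the `bdev->bd_disk->queue` chain comes from the commit message.

```c
/* Mock types standing in for request_queue, gendisk and block_device. */
struct sketch_queue { int id; };
struct sketch_disk  { struct sketch_queue *queue; };
struct sketch_bdev {
    struct sketch_disk  *bd_disk;
    struct sketch_queue *bd_queue;   /* assumed name for the cached pointer */
};

/* Fill in the cache once, when the bdev is bound to its disk. */
static void sketch_bdev_init(struct sketch_bdev *bdev, struct sketch_disk *disk)
{
    bdev->bd_disk = disk;
    bdev->bd_queue = disk->queue;
}

/* Before: two dependent loads on every lookup. */
static struct sketch_queue *sketch_get_queue_uncached(const struct sketch_bdev *bdev)
{
    return bdev->bd_disk->queue;
}

/* After: a single load from the bdev itself. */
static struct sketch_queue *sketch_get_queue_cached(const struct sketch_bdev *bdev)
{
    return bdev->bd_queue;
}
```

The cached variant trades one extra pointer per block device for one fewer dereference in hot paths.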
2021-10-18 | block: switch polling to be bio based | Christoph Hellwig
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can be made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio makes it trivial to support polling BIOs remapped by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | io_uring: don't sleep when polling for I/O | Christoph Hellwig
There is no point in sleeping for the expected I/O completion timeout in the io_uring async polling model as we never poll for a specific I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: replace the spin argument to blk_iopoll with a flags argument | Christoph Hellwig
Switch the boolean spin argument to blk_poll to passing a set of flags instead. This will allow to control polling behavior in a more fine grained way. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-10-hch@lst.de [axboe: adapt to changed io_uring iopoll] Signed-off-by: Jens Axboe <axboe@kernel.dk>
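As a rough illustration of the bool-to-flags conversion, the sketch below models a polling loop in user space. The flag name `SKETCH_POLL_ONESHOT`, the callback type, and the `max_passes` bound are all hypothetical; the real interface and flag names live in the kernel headers.

```c
/* Hypothetical flag: poll once instead of spinning (the old spin == false). */
enum { SKETCH_POLL_ONESHOT = 1u << 0 };

/* One polling pass; returns the number of completions it found. */
typedef int (*sketch_poll_once_fn)(void *ctx);

/* Poll until something completes, unless the caller asked for one shot. */
static int sketch_poll(sketch_poll_once_fn poll_once, void *ctx,
                       unsigned int flags, int max_passes)
{
    for (int pass = 0; pass < max_passes; pass++) {
        int found = poll_once(ctx);

        if (found > 0)
            return found;
        if (flags & SKETCH_POLL_ONESHOT)
            break;
    }
    return 0;
}

/* Example callback: reports a completion once the counter reaches zero. */
static int sketch_countdown_poll(void *ctx)
{
    int *left = ctx;

    if (*left > 0)
        (*left)--;
    return *left == 0 ? 1 : 0;
}
```

A flags word leaves room for further behaviors to be added later without changing the function signature again, which a lone boolean cannot do.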
2021-10-18 | block: pre-allocate requests if plug is started and is a batch | Jens Axboe
The caller typically has a good (or even exact) idea of how many requests it needs to submit. We can make the request/tag allocation a lot more efficient if we just allocate N requests/tags upfront when we queue the first bio from the batch. Provide a new plug start helper that allows the caller to specify how many IOs are expected. This sets plug->nr_ios, and we can use that for smarter request allocation. The plug provides a holding spot for requests, and request allocation will check it before calling into the normal request allocation path. When blk_finish_plug() is called, check if there are unused requests and free them. This should not happen in normal operations. The exception is if we get merging, then we may be left with requests that need freeing when done. This raises the per-core performance on my setup from ~5.8M to ~6.1M IOPS. Signed-off-by: Jens Axboe <axboe@kernel.dk>
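The allocation scheme described above can be modelled in user space roughly as follows. Everything here is a hypothetical sketch: `malloc` stands in for request/tag allocation, and the `sketch_` helpers are not the kernel's functions. Only the `nr_ios` field name comes from the commit message.

```c
#include <stdlib.h>

struct plug_rq {
    struct plug_rq *next;
};

struct sketch_plug {
    struct plug_rq *cached;    /* preallocated requests not yet handed out */
    unsigned short nr_ios;     /* how many IOs the caller expects to submit */
};

static void sketch_start_plug_nr_ios(struct sketch_plug *plug, unsigned short nr_ios)
{
    plug->cached = NULL;
    plug->nr_ios = nr_ios;
}

static struct plug_rq *sketch_alloc_rq(struct sketch_plug *plug)
{
    struct plug_rq *rq;

    /* First allocation of the batch: grab all expected requests upfront. */
    while (plug->nr_ios) {
        rq = malloc(sizeof(*rq));
        if (!rq)
            break;
        rq->next = plug->cached;
        plug->cached = rq;
        plug->nr_ios--;
    }

    /* Check the plug's holding spot before the normal allocation path. */
    rq = plug->cached;
    if (rq)
        plug->cached = rq->next;
    else
        rq = malloc(sizeof(*rq));
    if (rq)
        rq->next = NULL;
    return rq;
}

/* Free whatever the batch did not use; returns how many were freed. */
static unsigned int sketch_finish_plug(struct sketch_plug *plug)
{
    unsigned int freed = 0;

    while (plug->cached) {
        struct plug_rq *rq = plug->cached;

        plug->cached = rq->next;
        free(rq);
        freed++;
    }
    return freed;
}
```

If the caller overestimates the batch (or merging reduces the number of requests needed), the leftover entries are released when the plug is finished.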
2021-10-18 | block: bump max plugged deferred size from 16 to 32 | Jens Axboe
Particularly for NVMe with efficient deferred submission for many requests, there are nice benefits to be seen by bumping the default max plug count from 16 to 32. This is especially true for virtualized setups, where the submit part is more expensive. But can be noticed even on native hardware. Reduce the multiple queue factor from 4 to 2, since we're changing the default size. While changing it, move the defines into the block layer private header. These aren't values that anyone outside of the block layer uses, or should use. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | blk-mq: Change shared sbitmap naming to shared tags | John Garry
Now that shared sbitmap support really means shared tags, rename symbols to match that. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1633429419-228500-15-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | blk-mq: Use shared tags for shared sbitmap support | John Garry
Currently we use separate sbitmap pairs and active_queues atomic_t for shared sbitmap support. However, a full set of static requests is used per HW queue, which is quite wasteful, considering that the total number of requests usable at any given time across all HW queues is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set or request queue, and not per HW queue. So replace the sbitmap pairs and active_queues atomic_t with shared tags per tagset and request queue, which will hold a set of shared static rqs. Since there is now no valid HW queue index to be passed to the blk_mq_ops .init and .exit_request callbacks, pass an invalid index token. This changes the semantics of the APIs, such that the callback would need to validate the HW queue index before using it. Currently no user of shared sbitmap actually uses the HW queue index (as would be expected). Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
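The memory argument is simple arithmetic, sketched below. The function names and the example numbers in the usage note are illustrative assumptions, not measurements from the commit.

```c
#include <stddef.h>

/* Static requests allocated once per HW queue. */
static size_t sketch_static_rq_bytes_per_hctx(size_t nr_hw_queues, size_t depth,
                                              size_t rq_size)
{
    return nr_hw_queues * depth * rq_size;
}

/* Static requests allocated once per tag set or request queue. */
static size_t sketch_static_rq_bytes_shared(size_t depth, size_t rq_size)
{
    return depth * rq_size;
}
```

For example, with 16 HW queues, a shared depth of 256, and a hypothetical 512-byte request, the per-queue layout needs 16 times the memory of the shared layout, even though at most 256 requests can be in flight either way.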
2021-10-18 | block: move struct request to blk-mq.h | Christoph Hellwig
struct request is only used by blk-mq drivers, so move it and all related declarations to blk-mq.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: move integrity handling out of <linux/blkdev.h> | Christoph Hellwig
Split the integrity/metadata handling definitions out into a new header. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: move a few merge helpers out of <linux/blkdev.h> | Christoph Hellwig
These are block-layer internal helpers, so move them to block/blk.h and block/blk-merge.c. Also update a comment a bit to use better grammar. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: drop unused includes in <linux/blkdev.h> | Christoph Hellwig
Drop various includes not actually used in blkdev.h itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: move elevator.h to block/ | Christoph Hellwig
Except for the features passed to blk_queue_required_elevator_features, elevator.h is only needed internally to the block layer. Move the ELEVATOR_F_* definitions to blkdev.h, and then move elevator.h to block/, dropping all the spurious includes outside of that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: remove the struct blk_queue_ctx forward declaration | Christoph Hellwig
This type doesn't exist at all, so no need to forward declare it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: remove the cmd_size field from struct request_queue | Christoph Hellwig
Entirely unused. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 | block: remove the unused blk_queue_state enum | Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-02 | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi | Linus Torvalds
Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (ufs, qla2xxx, target, smartpqi, lpfc, mpt3sas). The core change causing the most churn was replacing the command request field request with a macro, allowing us to offset map to it and remove the redundant field; the same was also done for the tag field. The most impactful change is the final removal of scsi_ioctl, which has been deprecated for over a decade" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (293 commits) scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1 scsi: ufs: ufs-exynos: Fix static checker warning scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Use the proper SCSI midlayer interfaces for PI scsi: lpfc: Copyright updates for 14.0.0.1 patches scsi: lpfc: Update lpfc version to 14.0.0.1 scsi: lpfc: Add bsg support for retrieving adapter cmf data scsi: lpfc: Add cmf_info sysfs entry scsi: lpfc: Add debugfs support for cm framework buffers scsi: lpfc: Add support for maintaining the cm statistics buffer scsi: lpfc: Add rx monitoring statistics scsi: lpfc: Add support for the CM framework scsi: lpfc: Add cmfsync WQE support scsi: lpfc: Add support for cm enablement buffer scsi: lpfc: Add cm statistics buffer support scsi: lpfc: Add EDC ELS support scsi: lpfc: Expand FPIN and RDF receive logging scsi: lpfc: Add MIB feature enablement support scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware scsi: fc: Add EDC ELS definition ...
2021-08-24 | block: Add alternative_gpt_sector() operation | Dmitry Osipenko
Add alternative_gpt_sector() block device operation which specifies alternative location of a GPT entry. This allows us to support Android devices that have GPT entry at a non-standard location and can't be repartitioned easily. Reviewed-by: Christoph Hellwig <hch@lst.de> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210820004536.15791-2-digetx@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-23 | block: add an explicit ->disk backpointer to the request_queue | Christoph Hellwig
Replace the magic lookup through the kobject tree with an explicit backpointer, given that the device model links are set up and torn down at times when I/O is still possible, leading to potential NULL or invalid pointer dereferences. Fixes: edb0872f44ec ("block: move the bdi from the request_queue to the gendisk") Reported-by: syzbot <syzbot+aa0801b6b32dca9dda82@syzkaller.appspotmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/20210816134624.GA24234@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-11 | block: move some macros to blkdev.h | Guoqing Jiang
Move them (PAGE_SECTORS_SHIFT, PAGE_SECTORS and SECTOR_MASK) to the generic header file to remove redundancy. Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn> Link: https://lore.kernel.org/r/20210721025315.1729118-1-guoqing.jiang@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
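For orientation, the three macros relate page size to 512-byte sectors. The definitions below are a sketch of how they are commonly written, under the assumption of 4 KiB pages (`SKETCH_PAGE_SHIFT` is a stand-in for the architecture's `PAGE_SHIFT`); the helper function is hypothetical.

```c
#define SECTOR_SHIFT        9     /* 512-byte sectors */
#define SKETCH_PAGE_SHIFT   12    /* assumption: 4 KiB pages */

#define PAGE_SECTORS_SHIFT  (SKETCH_PAGE_SHIFT - SECTOR_SHIFT)
#define PAGE_SECTORS        (1 << PAGE_SECTORS_SHIFT)
#define SECTOR_MASK         (PAGE_SECTORS - 1)

/* Index of a sector within the page that contains it. */
static unsigned long long sketch_sector_in_page(unsigned long long sector)
{
    return sector & SECTOR_MASK;
}
```

Under these assumptions a page holds 8 sectors, so sector 13 is the sixth sector (index 5) of its page.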
2021-08-09 | block: return ELEVATOR_DISCARD_MERGE if possible | Ming Lei
When merging one bio into a request, if they are discard IO and the queue supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE because both the block core and the related drivers (nvme, virtio-blk) don't handle mixed discard IO merges (traditional IO merge together with discard merge) well. Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation, so both blk-mq and drivers just need to handle multi-range discard. Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Fixes: 2705dfb20947 ("block: fix discard request merge") Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 | block: move the bdi from the request_queue to the gendisk | Christoph Hellwig
The backing device information only makes sense for file system I/O, and thus belongs into the gendisk and not the lower level request_queue structure. Move it there. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 | block: add a queue_has_disk helper | Christoph Hellwig
Add a helper to check if a gendisk is associated with a request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-09 | block: pass a gendisk to blk_queue_update_readahead | Christoph Hellwig
.. and rename the function to disk_update_readahead. This is in preparation for moving the BDI from the request_queue to the gendisk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 | block: remove bdput | Christoph Hellwig
Now that we've stopped using inode references for anything meaningful in the block layer, get rid of the helper to put it and just open code the call to iput on the block_device inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Link: https://lore.kernel.org/r/20210722075402.983367-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 | block: remove bdgrab | Christoph Hellwig
All callers are gone, and no one should grab a pure inode reference to a block device anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210722075402.983367-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-30 | scsi: block: Remove the remaining SG_IO-related fields from struct request_queue | Christoph Hellwig
Move the sg_timeout and sg_reserved_size fields into the bsg_device and scsi_device structures as they have nothing to do with generic block I/O. Note that these values are now separate for bsg vs. SCSI device node access, but that just matches how /dev/sg vs the other nodes has always behaved. Link: https://lore.kernel.org/r/20210729064845.1044147-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-30 | scsi: block: Remove BLK_SCSI_MAX_CMDS | Christoph Hellwig
This was used for the table based SCSI passthrough permission checking that is gone now. Link: https://lore.kernel.org/r/20210729064845.1044147-3-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-30 | scsi: bsg: Simplify device registration | Christoph Hellwig
Use the per-device cdev_device_interface to store the bsg data in the char device inode, and thus remove the need to embed the bsg_class_device structure in the request_queue. Link: https://lore.kernel.org/r/20210729064845.1044147-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-28 | scsi: scsi_ioctl: Move the "block layer" SCSI ioctl handling to drivers/scsi | Christoph Hellwig
Merge the ioctl handling in block/scsi_ioctl.c into its only caller in drivers/scsi/scsi_ioctl.c. Link: https://lore.kernel.org/r/20210724072033.1284840-19-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-28 | scsi: scsi_ioctl: Simplify SCSI passthrough permission checking | Christoph Hellwig
Remove the separate command filter structure and just use a switch statement (which also caught two duplicate commands), return a bool and give the function a sensible name. Link: https://lore.kernel.org/r/20210724072033.1284840-18-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
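A switch-based permission check of this kind might look roughly like the sketch below. The function name, parameters, and the tiny opcode subset are hypothetical; the real function covers many more commands and takes different arguments.

```c
#include <stdbool.h>

/* A small, illustrative subset of SCSI opcodes. */
enum {
    SKETCH_TEST_UNIT_READY = 0x00,
    SKETCH_INQUIRY         = 0x12,
    SKETCH_READ_10         = 0x28,
    SKETCH_WRITE_10        = 0x2a,
};

static bool sketch_cmd_allowed(unsigned char opcode, bool opened_for_write,
                               bool is_admin)
{
    /* Privileged callers may send anything. */
    if (is_admin)
        return true;

    switch (opcode) {
    /* Commands that are safe on a read-only open. */
    case SKETCH_TEST_UNIT_READY:
    case SKETCH_INQUIRY:
    case SKETCH_READ_10:
        return true;
    /* Commands that additionally require the device to be open for writing. */
    case SKETCH_WRITE_10:
        return opened_for_write;
    default:
        return false;
    }
}
```

One side benefit of a switch over a lookup table is that C compilers reject duplicate `case` labels, which is likely how the duplicates mentioned in the commit message surfaced.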
2021-07-28 | scsi: bsg: Move bsg_scsi_ops to drivers/scsi/ | Christoph Hellwig
Move the SCSI-specific bsg code into the SCSI midlayer instead of in the common bsg code. This keeps just the common bsg code in block/ and also allows building it as a module. Link: https://lore.kernel.org/r/20210724072033.1284840-15-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-28 | scsi: block: Add a queue_max_bytes() helper | Christoph Hellwig
Return the max_sectors value in bytes. Lifted from scsi_ioctl.c. Link: https://lore.kernel.org/r/20210724072033.1284840-13-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
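Converting a sector count to bytes is a shift by the sector shift (9 for 512-byte sectors). The sketch below takes the sector count directly rather than a request_queue, and the clamp that keeps the result within a signed int is an assumption added here to avoid overflow, not something stated in the commit message.

```c
#include <limits.h>

#define SKETCH_SECTOR_SHIFT 9   /* 512-byte sectors */

/* Hypothetical stand-in: max_sectors expressed in bytes, clamped to fit an int. */
static unsigned int sketch_queue_max_bytes(unsigned int max_sectors)
{
    unsigned int cap = (unsigned int)INT_MAX >> SKETCH_SECTOR_SHIFT;

    if (max_sectors > cap)
        max_sectors = cap;
    return max_sectors << SKETCH_SECTOR_SHIFT;
}
```

For instance, a limit of 8 sectors corresponds to 4096 bytes.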
2021-07-28 | scsi: scsi_ioctl: Remove scsi_verify_blk_ioctl() | Christoph Hellwig
Manually verify that the device is not a partition and the caller has admin privileges at the beginning of the sr ioctl method and open code the trivial check for sd as well. Link: https://lore.kernel.org/r/20210724072033.1284840-11-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-28 | scsi: scsi_ioctl: Remove scsi_cmd_blk_ioctl() | Christoph Hellwig
Open code scsi_cmd_blk_ioctl() in its two callers. Link: https://lore.kernel.org/r/20210724072033.1284840-10-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-24 | Merge tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request (Christoph): - tracing fix (Keith Busch) - fix multipath head refcounting (Hannes Reinecke) - Write Zeroes vs PI fix (me) - drop a bogus WARN_ON (Zhihao Cheng) - Increase max blk-cgroup policy size, now that mq-deadline uses it too (Oleksandr) * tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block: nvme: set the PRACT bit when using Write Zeroes with T10 PI nvme: fix nvme_setup_command metadata trace event nvme: fix refcounting imbalance when all paths are down nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING block: increase BLKCG_MAX_POLS
2021-07-17 | block: increase BLKCG_MAX_POLS | Oleksandr Natalenko
After mq-deadline learned to deal with cgroups, the BLKCG_MAX_POLS value became too small for all the elevators to be registered properly. The following issue is seen: ``` calling bfq_init+0x0/0x8b @ 1 blkcg_policy_register: BLKCG_MAX_POLS too small initcall bfq_init+0x0/0x8b returned -28 after 507 usecs ``` which renders BFQ non-functional. Increase BLKCG_MAX_POLS to allow enough space for everyone. Fixes: 08a9ad8bf607 ("block/mq-deadline: Add cgroup support") Link: https://lore.kernel.org/lkml/8988303.mDXGIdCtx8@natalenko.name/ Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Link: https://lore.kernel.org/r/20210717123328.945810-1-oleksandr@natalenko.name Signed-off-by: Jens Axboe <axboe@kernel.dk>
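The failure mode in the log above (registration returning -28, which is `-ENOSPC` on Linux) is what a fixed-size slot table produces once it is full. A minimal user-space sketch, with a hypothetical table size and function name:

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_POLS 2   /* deliberately small for illustration */

static const char *sketch_policies[SKETCH_MAX_POLS];

/* Claim the first free slot; fail with -ENOSPC when the table is full. */
static int sketch_policy_register(const char *name)
{
    for (int i = 0; i < SKETCH_MAX_POLS; i++) {
        if (!sketch_policies[i]) {
            sketch_policies[i] = name;
            return i;
        }
    }
    return -ENOSPC;
}
```

Whichever policy registers last loses when there are more policies than slots, which is consistent with BFQ being the one that failed in the report; raising the table size removes the contention.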
2021-07-09 | Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull more block updates from Jens Axboe: "A combination of changes that ended up depending on both the driver and core branch (and/or the IDE removal), and a few late arriving fixes. In detail: - Fix io ticks wrap-around issue (Chunguang) - nvme-tcp sock locking fix (Maurizio) - s390-dasd fixes (Kees, Christoph) - blk_execute_rq polling support (Keith) - blk-cgroup RCU iteration fix (Yu) - nbd backend ID addition (Prasanna) - Partition deletion fix (Yufen) - Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph) - Removal of now dead block request types due to IDE removal (Christoph) - Loop probing and control device cleanups (Christoph) - Device uevent fix (Christoph) - Misc cleanups/fixes (Tetsuo, Christoph)" * tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits) blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs block: fix the problem of io_ticks becoming smaller nvme-tcp: can't set sk_user_data without write_lock loop: remove unused variable in loop_set_status() block: remove the bdgrab in blk_drop_partitions block: grab a device refcount in disk_uevent s390/dasd: Avoid field over-reading memcpy() dasd: unexport dasd_set_target_state block: check disk exist before trying to add partition ubd: remove dead code in ubd_setup_common nvme: use return value from blk_execute_rq() block: return errors from blk_execute_rq() nvme: use blk_execute_rq() for passthrough commands block: support polling through blk_execute_rq block: remove REQ_OP_SCSI_{IN,OUT} block: mark blk_mq_init_queue_data static loop: rewrite loop_exit using idr_for_each_entry loop: split loop_lookup loop: don't allow deleting an unspecified loop device loop: move loop_ctl_mutex locking into loop_add ...
2021-06-30 | Merge tag 'for-5.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm | Linus Torvalds
Pull device mapper updates from Mike Snitzer: - Various DM persistent-data library improvements and fixes that benefit both the DM thinp and cache targets. - A few small DM kcopyd efficiency improvements. - Significant zoned related block core, DM core and DM zoned target changes that culminate with adding zoned append emulation (which is required to properly fix DM crypt's zoned support). - Various DM writecache target changes that improve efficiency. Adds an optional "metadata_only" feature that only promotes bios flagged with REQ_META. But the most significant improvement is writecache's ability to pause writeback, for a configurable time, if/when the working set is larger than the cache (and the cache is full) -- this ensures performance is no worse than the slower origin device. * tag 'for-5.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (35 commits) dm writecache: make writeback pause configurable dm writecache: pause writeback if cache full and origin being written directly dm io tracker: factor out IO tracker dm btree remove: assign new_root only when removal succeeds dm zone: fix dm_revalidate_zones() memory allocation dm ps io affinity: remove redundant continue statement dm writecache: add optional "metadata_only" parameter dm writecache: add "cleaner" and "max_age" to Documentation dm writecache: write at least 4k when committing dm writecache: flush origin device when writing and cache is full dm writecache: have ssd writeback wait if the kcopyd workqueue is busy dm writecache: use list_move instead of list_del/list_add in writecache_writeback() dm writecache: commit just one block, not a full page dm writecache: remove unused gfp_t argument from wc_add_block() dm crypt: Fix zoned block device support dm: introduce zone append emulation dm: rearrange core declarations for extended use from dm-zone.c block: introduce BIO_ZONE_WRITE_LOCKED bio flag block: introduce bio zone helpers block: improve handling of all zones reset operation ...
2021-06-30 | block: return errors from blk_execute_rq() | Keith Busch
The synchronous blk_execute_rq() had not provided a way for its callers to know if its request was successful or not. Return the blk_status_t result of the request. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-4-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-30 | block: remove REQ_OP_SCSI_{IN,OUT} | Christoph Hellwig
With the legacy IDE driver gone drivers now use either REQ_OP_DRV_* or REQ_OP_SCSI_*, so unify the two concepts of passthrough requests into a single one. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-04 | block: introduce bio zone helpers | Damien Le Moal
Introduce the helper functions bio_zone_no() and bio_zone_is_seq(). Both are the BIO counterparts of the request helpers blk_rq_zone_no() and blk_rq_zone_is_seq(), respectively returning the number of the target zone of a bio and true if the BIO target zone is sequential. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
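What such helpers compute can be sketched in user space as follows. This assumes a power-of-two zone size and represents conventional zones with a 64-bit bitmap; both the representation and the `sketch_` names are illustrative assumptions, and the real helpers take a bio and consult the queue's zone information.

```c
#include <stdbool.h>
#include <stdint.h>

/* Integer log2 for a power-of-two value. */
static unsigned int sketch_ilog2(uint64_t v)
{
    unsigned int r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/* Zone number containing a sector; zone_sectors must be a power of two. */
static unsigned int sketch_zone_no(uint64_t sector, uint64_t zone_sectors)
{
    return (unsigned int)(sector >> sketch_ilog2(zone_sectors));
}

/*
 * A zone is sequential if it is not marked conventional. conv_zones is a
 * hypothetical bitmap covering zones 0..63, so zno must be below 64.
 */
static bool sketch_zone_is_seq(uint64_t conv_zones, unsigned int zno)
{
    return !((conv_zones >> zno) & 1);
}
```

With 524288-sector zones (256 MiB at 512 bytes per sector), a bio starting at sector 3 * 524288 + 5 targets zone 3.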
2021-06-01 | block: unexport blk_alloc_queue | Christoph Hellwig
blk_alloc_queue is just an internal helper now, unexport it and remove it from the public header. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24 | blk-mq: Use request queue-wide tags for tagset-wide sbitmap | John Garry
The tags used for an IO scheduler are currently per hctx. As such, when q->nr_hw_queues grows, so does the request queue total IO scheduler tag depth. This may cause problems for SCSI MQ HBAs whose total driver depth is fixed. Ming and Yanhui report higher CPU usage and lower throughput in scenarios where the fixed total driver tag depth is appreciably lower than the total scheduler tag depth: https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@huawei.com/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b In that scenario, since the scheduler tag is acquired first, much contention is introduced because a driver tag may not be available after we have got the sched tag. Improve this scenario by introducing request queue-wide tags for when a tagset-wide sbitmap is used. The static sched requests are still allocated per hctx, as requests are initialised per hctx, as in blk_mq_init_request(..., hctx_idx, ...) -> set->ops->init_request(.., hctx_idx, ...). For simplicity of resizing the request queue sbitmap when updating the request queue depth, just init at the max possible size, so we don't need to deal with possibly swapping out a new sbitmap for the old one if we need to grow. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1620907258-30910-3-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-15 | Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block fixes from Jens Axboe: - Fix for shared tag set exit (Bart) - Correct ioctl range for zoned ioctls (Damien) - Removed dead/unused function (Lin) - Fix perf regression for shared tags (Ming) - Fix out-of-bounds issue with kyber and preemption (Omar) - BFQ merge fix (Paolo) - Two error handling fixes for nbd (Sun) - Fix weight update in blk-iocost (Tejun) - NVMe pull request (Christoph): - correct the check for using the inline bio in nvmet (Chaitanya Kulkarni) - demote unsupported command warnings (Chaitanya Kulkarni) - fix corruption due to double initializing ANA state (me, Hou Pu) - reset ns->file when open fails (Daniel Wagner) - fix a NULL deref when SEND is completed with error in nvmet-rdma (Michal Kalderon) - Fix kernel-doc warning (Bart) * tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block: block/partitions/efi.c: Fix the efi_partition() kernel-doc header blk-mq: Swap two calls in blk_mq_exit_queue() blk-mq: plug request for shared sbitmap nvmet: use new ana_log_size instead the old one nvmet: seset ns->file when open fails nbd: share nbd_put and return by goto put_nbd nbd: Fix NULL pointer in flush_workqueue blkdev.h: remove unused codes blk_account_rq block, bfq: avoid circular stable merges blk-iocost: fix weight updates of inner active iocgs nvmet: demote fabrics cmd parse err msg to debug nvmet: use helper to remove the duplicate code nvmet: demote discovery cmd parse err msg to debug nvmet-rdma: Fix NULL deref when SEND is completed with error nvmet: fix inline bio check for passthru nvmet: fix inline bio check for bdev-ns nvme-multipath: fix double initialization of ANA state kyber: fix out of bounds access when preempted block: uapi: fix comment about block device ioctl
2021-05-12 | blkdev.h: remove unused codes blk_account_rq | Lin Feng
The last users of blk_account_rq went away with commit a1ce35fa49852db ("block: remove dead elevator code"). It now has no callers, so it can be safely removed. Signed-off-by: Lin Feng <linf@wangsu.com> Link: https://lore.kernel.org/r/20210512100124.173769-1-linf@wangsu.com Signed-off-by: Jens Axboe <axboe@kernel.dk>