summaryrefslogtreecommitdiff
path: root/drivers/nvme/host/pci.c
AgeCommit message (Collapse)Author
2022-12-06nvme-pci: cleanup nvme_suspend_queueChristoph Hellwig
Remove the unused returne value, pass a dev + qid instead of the queue as that is better for the callers as well as the function itself, and remove the entirely pointless kerneldoc comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2022-12-06nvme-pci: remove nvme_pci_disableChristoph Hellwig
nvme_pci_disable has a single caller, fold it into that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Eric Curtin <ecurtin@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2022-12-06nvme-pci: remove nvme_disable_admin_queueChristoph Hellwig
nvme_disable_admin_queue has only a single caller, and just calls two other funtions, so remove it to clean up the remove path a little more. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Eric Curtin <ecurtin@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2022-12-06nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrlChristoph Hellwig
Many of the callers decide which one to use based on a bool argument and there is at least some code to be shared, so merge these two. Also move a comment specific to a single callsite to that callsite. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hector Martin <marcan@marcan.st>
2022-12-06nvme: introduce nvme_start_requestSagi Grimberg
In preparation for nvme-multipath IO stats accounting, we want the accounting to happen in a centralized place. The request completion is already centralized, but we need a common helper to request I/O start. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de>
2022-12-06nvme: use kstrtobool() instead of strtobool()Christophe JAILLET
strtobool() is the same as kstrtobool(). However, the latter is more used within the kernel. In order to remove strtobool() and slightly simplify kstrtox.h, switch to the other function name. While at it, include the corresponding header file (<linux/kstrtox.h>) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-30nvme-pci: clear the prp2 field when not usedLei Rao
If the prp2 field is not filled in nvme_setup_prp_simple(), the prp2 field is garbage data. According to nvme spec, the prp2 is reserved if the data transfer does not cross a memory page boundary, so clear it to zero if it is not used. Signed-off-by: Lei Rao <lei.rao@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-18nvme: rename the queue quiescing helpersChristoph Hellwig
Naming the nvme helpers that wrap the block quiesce functionality _start/_stop is rather confusing. Switch to using the quiesce naming used by the block layer instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-11-16nvme-pci: don't unbind the driver on reset failureChristoph Hellwig
Unbind a device driver when a reset fails is very unusual behavior. Just shut the controller down and leave it in dead state if we fail to reset it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-11-16nvme-pci: split the initial probe from the rest pathChristoph Hellwig
nvme_reset_work is a little fragile as it needs to handle both resetting a live controller and initializing one during probe. Split out the initial probe and open code it in nvme_probe and leave nvme_reset_work to just do the live controller reset. This fixes a recently introduced bug where nvme_dev_disable causes a NULL pointer dereferences in blk_mq_quiesce_tagset because the tagset pointer is not set when the reset state is entered directly from the new state. The separate probe code can skip the reset state and probe directly and fixes this. To make sure the system isn't single threaded on enabling nvme controllers, set the PROBE_PREFER_ASYNCHRONOUS flag in the device_driver structure so that the driver core probes in parallel. Fixes: 98d81f0df70c ("nvme: use blk_mq_[un]quiesce_tagset") Reported-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-16nvme-pci: move the HMPRE check into nvme_setup_host_memChristoph Hellwig
Check that a HMB is wanted into the allocation helper instead of the caller. This makes life simpler for an upcoming second caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-11-16nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000Tiago Dias Ferreira
Added a quirk to fix the Netac NV7000 SSD reporting duplicate NGUIDs. Cc: <stable@vger.kernel.org> Signed-off-by: Tiago Dias Ferreira <tiagodfer@gmail.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-15nvme-pci: simplify nvme_dbbuf_dma_allocChristoph Hellwig
Move the OACS check and the error checking into nvme_dbbuf_dma_alloc so that an upcoming second caller doesn't have to duplicate this boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-11-15nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enableChristoph Hellwig
nvme_pci_configure_admin_queue is called right after nvme_pci_enable, and it's work is undone by nvme_dev_disable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: set constant paramters in nvme_pci_alloc_ctrlChristoph Hellwig
Move setting of low-level constant parameters from nvme_reset_work to nvme_pci_alloc_ctrl. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: factor out a nvme_pci_alloc_dev helperChristoph Hellwig
Add a helper that allocates the nvme_dev structure up to the point where we can call nvme_init_ctrl. This pairs with the free_ctrl method and can thus be used to cleanup the teardown path and make it more symmetric. Note that this now calls nvme_init_ctrl a lot earlier during probing, which also means the per-controller character device shows up earlier. Due to the controller state no commnds can be send on it, but it might make sense to delay the cdev registration until nvme_init_ctrl_finish. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: factor the iod mempool creation into a helperChristoph Hellwig
Add a helper to create the iod mempool. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: move more teardown work to nvme_removeChristoph Hellwig
nvme_dbbuf_dma_free frees dma coherent memory, so it must not be called after ->remove has returned. Fortunately there is no way to use it after shutdown as no more I/O is possible so it can be moved. Similarly the iod_mempool can't be used for a device kept alive after shutdown, so move it next to freeing the PRP pools. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: put the admin queue in nvme_dev_remove_adminChristoph Hellwig
Once the controller is shutdown no one can access the admin queue. Tear it down in nvme_dev_remove_admin, which matches the flow in the other drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme: simplify transport specific device attribute handlingChristoph Hellwig
Allow the transport driver to override the attribute groups for the control device, so that the PCIe driver doesn't manually have to add a group after device creation and keep track of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme: move OPAL setup from PCIe to coreChristoph Hellwig
Nothing about the TCG Opal support is PCIe transport specific, so move it to the core code. For this nvme_init_ctrl_finish grows a new was_suspended argument that allows the transport driver to tell the OPAL code if the controller came out of a suspend cycle. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Tested-by Gerd Bayer <gbayer@linxu.ibm.com>
2022-11-15nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron NitroBean Huo
Added a quirk to fix Micron Nitro NVMe reporting duplicate NGUIDs. Cc: <stable@vger.kernel.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-09nvme: quiet user passthrough command errorsKeith Busch
The driver is spamming the kernel logs for entirely harmless errors from user space submitting unsupported commands. Just silence the errors. The application has direct access to command status, so there's no need to log these. And since every passthrough command now uses the quiet flag, move the setting to the common initializer. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Tested-by: Alan Adamson <alan.adamson@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-02nvme-pci: don't unquiesce the I/O queues in nvme_remove_dead_ctrlChristoph Hellwig
nvme_remove_dead_ctrl schedules nvme_remove to be called, which will call nvme_dev_disable and unquiesce the I/O queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-02nvme: split nvme_kill_queuesChristoph Hellwig
nvme_kill_queues does two things: 1) mark the gendisk of all namespaces dead 2) unquiesce all I/O queues These used to be be intertwined due to block layer issues, but aren't any more. So move the unquiscing of the I/O queues into the callers, and rename the rest of the function to the now more descriptive nvme_mark_namespaces_dead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-02nvme-pci: refactor the tagset handling in nvme_reset_workChristoph Hellwig
The code to create, update or delete a tagset and namespaces in nvme_reset_work is a bit convoluted. Refactor it with a two high-level conditionals for first probe vs reset and I/O queues vs no I/O queues to make the code flow more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-3-hch@lst.de [axboe: fix whitespace issue] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-25nvme-pci: remove an extra queue referenceChristoph Hellwig
Now that blk_mq_destroy_queue does not release the queue reference, there is no need for a second admin queue reference to be held by the nvme_dev. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-25blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queueChristoph Hellwig
The fact that blk_mq_destroy_queue also drops a queue reference leads to various places having to grab an extra reference. Move the call to blk_put_queue into the callers to allow removing the extra references. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-2-hch@lst.de [axboe: fix fabrics_q vs admin_q conflict in nvme core.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-19nvme-pci: disable write zeroes on various Kingston SSDXander Li
Kingston SSDs do support NVMe Write_Zeroes cmd but take long time to process. The firmware version is locked by these SSDs, we can not expect firmware improvement, so disable Write_Zeroes cmd. Signed-off-by: Xander Li <xander_li@kingston.com.tw> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-10-12nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDsXi Ruoyao
ZHITAI TiPro5000 SSDs has the same APST sleep problem as its cousin, TiPro7000. The quirk for TiPro7000 has been added in commit 6b961bce50e4 ("nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs"), use the same quirk for TiPro5000. The ASPT data from "nvme id-ctrl /dev/nvme1": vid : 0x1e49 ssvid : 0x1e49 sn : ZTA21T0KA2227304LM mn : ZHITAI TiPlus5000 1TB fr : ZTA09139 [...] ps 0 : mp:6.50W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:- ps 1 : mp:5.80W operational enlat:0 exlat:0 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:- active_power:- ps 2 : mp:3.60W operational enlat:0 exlat:0 rrt:2 rrl:2 rwt:2 rwl:2 idle_power:- active_power:- ps 3 : mp:0.0500W non-operational enlat:5000 exlat:10000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:- ps 4 : mp:0.0025W non-operational enlat:8000 exlat:45000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:- Reported-and-tested-by: Chang Feng <flukehn@gmail.com> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-10-12nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760Abhijit
Add a quirk to fix Lexar NM760 SSD drives reporting duplicate nsids. Signed-off-by: Abhijit <abhijit@abhijittomar.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-10-07Merge tag 'for-6.1/passthrough-2022-10-04' of git://git.kernel.dk/linuxLinus Torvalds
Pull passthrough updates from Jens Axboe: "With these changes, passthrough NVMe support over io_uring now performs at the same level as block device O_DIRECT, and in many cases 6-8% better. This contains: - Add support for fixed buffers for passthrough (Anuj, Kanchan) - Enable batched allocations and freeing on passthrough, similarly to what we support on the normal storage path (me) - Fix from Geert fixing an issue with !CONFIG_IO_URING" * tag 'for-6.1/passthrough-2022-10-04' of git://git.kernel.dk/linux: io_uring: Add missing inline to io_uring_cmd_import_fixed() dummy nvme: wire up fixed buffer support for nvme passthrough nvme: pass ubuffer as an integer block: extend functionality to map bvec iterator block: factor out blk_rq_map_bio_alloc helper block: rename bio_map_put to blk_mq_map_bio_put nvme: refactor nvme_alloc_request nvme: refactor nvme_add_user_metadata nvme: Use blk_rq_map_user_io helper scsi: Use blk_rq_map_user_io helper block: add blk_rq_map_user_io io_uring: introduce fixed buffer support for io_uring_cmd io_uring: add io_uring_cmd_import_fixed nvme: enable batched completions of passthrough IO nvme: split out metadata vs non metadata end_io uring_cmd completions block: allow end_io based requests in the completion batch handling block: change request end_io handler to pass back a return value block: enable batched allocation for blk_mq_alloc_request() block: kill deprecated BUG_ON() in the flush handling
2022-10-07Merge tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - NVMe pull requests via Christoph: - handle number of queue changes in the TCP and RDMA drivers (Daniel Wagner) - allow changing the number of queues in nvmet (Daniel Wagner) - also consider host_iface when checking ip options (Daniel Wagner) - don't map pages which can't come from HIGHMEM (Fabio M. De Francesco) - avoid unnecessary flush bios in nvmet (Guixin Liu) - shrink and better pack the nvme_iod structure (Keith Busch) - add comment for unaligned "fake" nqn (Linjun Bao) - print actual source IP address through sysfs "address" attr (Martin Belanger) - various cleanups (Jackie Liu, Wolfram Sang, Genjian Zhang) - handle effects after freeing the request (Keith Busch) - copy firmware_rev on each init (Keith Busch) - restrict management ioctls to admin (Keith Busch) - ensure subsystem reset is single threaded (Keith Busch) - report the actual number of tagset maps in nvme-pci (Keith Busch) - small fabrics authentication fixups (Christoph Hellwig) - add common code for tagset allocation and freeing (Christoph Hellwig) - stop using the request_queue in nvmet (Christoph Hellwig) - set min_align_mask before calculating max_hw_sectors (Rishabh Bhatnagar) - send a rediscover uevent when a persistent discovery controller reconnects (Sagi Grimberg) - misc nvmet-tcp fixes (Varun Prakash, zhenwei pi) - MD pull request via Song: - Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan. - Raid10 performance optimization, by Yu Kuai. - sbitmap wakeup hang fixes (Hugh, Keith, Jan, Yu) - IO scheduler switching quisce fix (Keith) - s390/dasd block driver updates (Stefan) - support for recovery for the ublk driver (ZiyangZhang) - rnbd drivers fixes and updates (Guoqing, Santosh, ye, Christoph) - blk-mq and null_blk map fixes (Bart) - various bcache fixes (Coly, Jilin, Jules) - nbd signal hang fix (Shigeru) - block writeback throttling fix (Yu) - optimize the passthrough mapping handling (me) - prepare block cgroups to being gendisk based (Christoph) - get rid of an old PSI hack in the block layer, moving it to the callers instead where it belongs (Christoph) - blk-throttle fixes and cleanups (Yu) - misc fixes and cleanups (Liu Shixin, Liu Song, Miaohe, Pankaj, Ping-Xiang, Wolfram, Saurabh, Li Jinlin, Li Lei, Lin, Li zeming, Miaohe, Bart, Coly, Gaosheng * tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux: (162 commits) sbitmap: fix lockup while swapping block: add rationale for not using blk_mq_plug() when applicable block: adapt blk_mq_plug() to not plug for writes that require a zone lock s390/dasd: use blk_mq_alloc_disk blk-cgroup: don't update the blkg lookup hint in blkg_conf_prep nvmet: don't look at the request_queue in nvmet_bdev_set_limits nvmet: don't look at the request_queue in nvmet_bdev_zone_mgmt_emulate_all blk-mq: use quiesced elevator switch when reinitializing queues block: replace blk_queue_nowait with bdev_nowait nvme: remove nvme_ctrl_init_connect_q nvme-loop: use the tagset alloc/free helpers nvme-loop: store the generic nvme_ctrl in set->driver_data nvme-loop: initialize sqsize later nvme-fc: use the tagset alloc/free helpers nvme-fc: store the generic nvme_ctrl in set->driver_data nvme-fc: keep ctrl->sqsize in sync with opts->queue_size nvme-rdma: use the tagset alloc/free helpers nvme-rdma: store the generic nvme_ctrl in set->driver_data nvme-tcp: use the tagset alloc/free helpers nvme-tcp: store the generic nvme_ctrl in set->driver_data ...
2022-09-30Merge tag 'block-6.0-2022-09-29' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "A single NVMe pull request via Christoph with a few fixes that should go into the 6.0 release: - Fix IOC_PR_CLEAR and IOC_PR_RELEASE ioctls for nvme devices (Michael Kelley) - Disable Write Zeroes on Phison E3C/E4C (Tina Hsu)" * tag 'block-6.0-2022-09-29' of git://git.kernel.dk/linux: nvme-pci: disable Write Zeroes on Phison E3C/E4C nvme: Fix IOC_PR_CLEAR and IOC_PR_RELEASE ioctls for nvme devices
2022-09-30block: change request end_io handler to pass back a return valueJens Axboe
Everything is just converted to returning RQ_END_IO_NONE, and there should be no functional changes with this patch. In preparation for allowing the end_io handler to pass ownership back to the block layer, rather than retain ownership of the request. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-30Merge branch 'for-6.1/block' into for-6.1/passthroughJens Axboe
* for-6.1/block: (162 commits) sbitmap: fix lockup while swapping block: add rationale for not using blk_mq_plug() when applicable block: adapt blk_mq_plug() to not plug for writes that require a zone lock s390/dasd: use blk_mq_alloc_disk blk-cgroup: don't update the blkg lookup hint in blkg_conf_prep nvmet: don't look at the request_queue in nvmet_bdev_set_limits nvmet: don't look at the request_queue in nvmet_bdev_zone_mgmt_emulate_all blk-mq: use quiesced elevator switch when reinitializing queues block: replace blk_queue_nowait with bdev_nowait nvme: remove nvme_ctrl_init_connect_q nvme-loop: use the tagset alloc/free helpers nvme-loop: store the generic nvme_ctrl in set->driver_data nvme-loop: initialize sqsize later nvme-fc: use the tagset alloc/free helpers nvme-fc: store the generic nvme_ctrl in set->driver_data nvme-fc: keep ctrl->sqsize in sync with opts->queue_size nvme-rdma: use the tagset alloc/free helpers nvme-rdma: store the generic nvme_ctrl in set->driver_data nvme-tcp: use the tagset alloc/free helpers nvme-tcp: store the generic nvme_ctrl in set->driver_data ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-27nvme-pci: report the actual number of tagset mapsKeith Busch
We've been reporting 2 maps regardless of whether the module parameter asked for anything beyond the default queues. A consequence of this means that blk-mq will reinitialize the all the hardware contexts and io schedulers on every controller reset when the mapping is exactly the same as before. This unnecessary overhead is adding several milliseconds on a reset for environments that don't need it. Report the actual number of mappings in use. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-27nvme-pci: set min_align_mask before calculating max_hw_sectorsRishabh Bhatnagar
If swiotlb is force enabled dma_max_mapping_size ends up calling swiotlb_max_mapping_size which takes into account the min align mask for the device. Set the min align mask for nvme driver before calling dma_max_mapping_size while calculating max hw sectors. Signed-off-by: Rishabh Bhatnagar <risbhat@amazon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-27nvme-pci: disable Write Zeroes on Phison E3C/E4CTina Hsu
E3C/E4C SSDs do support the Write Zeroes command in theory, but have very bad performance when using it. As the firmware has been frozen for these products we can not expect firmware improvements for it, so disable Write Zeroes. Signed-off-by: Tina Hsu <tina_hsu@phison.corp-partner.google.com> [hch: update the commit message] Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-19nvme-pci: move iod dma_len fill gapsKeith Busch
The 32-bit field, dma_len, packs better in the iod struct above the dma_addr_t on 64-bit systems. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-19nvme-pci: iod npages fits in s8Keith Busch
The largest allowed transfer is 4MB, which can use at most 1025 PRPs. Each PRP is 8 bytes, so the maximum number of 4k nvme pages needed for the iod_list is 3, which fits in an 's8' type. While modifying this field, change the name to "nr_allocations" to better represent that this is referring to the number of units allocated from a dma_pool. Also introduce a BUILD_BUG_ON to ensure we never accidently increase the largest transfer limit beyond 127 chained prp lists. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-19nvme-pci: iod's 'aborted' is a boolKeith Busch
It's only true or false, so make this a bool to reflect that and save some space in nvme_iod. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-19nvme-pci: remove nvme_queue from nvme_iodKeith Busch
We can get the nvme_queue from the req just as easily, so remove the duplicate path to the same structure to save some space. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-02Merge tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Christoph: - error handling fix for the new auth code (Hannes Reinecke) - fix unhandled tcp states in nvmet_tcp_state_change (Maurizio Lombardi) - add NVME_QUIRK_BOGUS_NID for Lexar NM610 (Shyamin Ayesh) - Add documentation for the ublk driver merged in this merge window (Ming) * tag 'block-6.0-2022-09-02' of git://git.kernel.dk/linux-block: Documentation: document ublk nvmet-tcp: fix unhandled tcp states in nvmet_tcp_state_change() nvmet-auth: add missing goto in nvmet_setup_auth() nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610
2022-08-31nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610Shyamin Ayesh
Lexar NM610 reports bogus eui64 values that appear to be the same across all drives. Quirk them out so they are not marked as "non globally unique" duplicates. Signed-off-by: Shyamin Ayesh <me@shyamin.com> [patch formatting] Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-08-22block: Change the return type of blk_mq_map_queues() into voidBart Van Assche
Since blk_mq_map_queues() and the .map_queues() callbacks always return 0, change their return type into void. Most callers ignore the returned value anyway. Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Wang <jasowang@redhat.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.garry@huawei.com> Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20220815170043.19489-3-bvanassche@acm.org [axboe: fold in fix from Bart] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-13Merge tag 'block-6.0-2022-08-12' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request - print nvme connect Linux error codes properly (Amit Engel) - fix the fc_appid_store return value (Christoph Hellwig) - fix a typo in an error message (Christophe JAILLET) - add another non-unique identifier quirk (Dennis P. Kliem) - check if the queue is allocated before stopping it in nvme-tcp (Maurizio Lombardi) - restart admin queue if the caller needs to restart queue in nvme-fc (Ming Lei) - use kmemdup instead of kmalloc + memcpy in nvme-auth (Zhang Xiaoxu) - __alloc_disk_node() error handling fix (Rafael) * tag 'block-6.0-2022-08-12' of git://git.kernel.dk/linux-block: block: Do not call blk_put_queue() if gendisk allocation fails nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S70 nvme-tcp: check if the queue is allocated before stopping it nvme-fabrics: Fix a typo in an error message nvme-fabrics: parse nvme connect Linux error codes nvmet-auth: use kmemdup instead of kmalloc + memcpy nvme-fc: fix the fc_appid_store return value nvme-fc: restart admin queue if the caller needs to restart queue
2022-08-11nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S70Dennis P. Kliem
ADATA XPG GAMMIX S70 reports bogus eui64 values that appear to be the same across all drives. Quirk them out so they are not marked as "non globally unique" duplicates. Signed-off-by: Dennis P. Kliem <dpkliem@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-08-06Merge tag 'dma-mapping-5.20-2022-08-06' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - convert arm32 to the common dma-direct code (Arnd Bergmann, Robin Murphy, Christoph Hellwig) - restructure the PCIe peer to peer mapping support (Logan Gunthorpe) - allow the IOMMU code to communicate an optional DMA mapping length and use that in scsi and libata (John Garry) - split the global swiotlb lock (Tianyu Lan) - various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang, Lukas Bulwahn, Robin Murphy) * tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping: (45 commits) swiotlb: fix passing local variable to debugfs_create_ulong() dma-mapping: reformat comment to suppress htmldoc warning PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg() RDMA/rw: drop pci_p2pdma_[un]map_sg() RDMA/core: introduce ib_dma_pci_p2p_dma_supported() nvme-pci: convert to using dma_map_sgtable() nvme-pci: check DMA ops when indicating support for PCI P2PDMA iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg iommu: Explicitly skip bus address marked segments in __iommu_map_sg() dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support dma-direct: support PCI P2PDMA pages in dma-direct map_sg dma-mapping: allow EREMOTEIO return code for P2PDMA transfers PCI/P2PDMA: Introduce helpers for dma_map_sg implementations PCI/P2PDMA: Attempt to set map_type if it has not been set lib/scatterlist: add flag for indicating P2PDMA segments in an SGL swiotlb: clean up some coding style and minor issues dma-mapping: update comment after dmabounce removal scsi: sd: Add a comment about limiting max_sectors to shost optimal limit ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit ...
2022-08-02nvme-pci: split nvme_dev_addChristoph Hellwig
Split nvme_dev_add into a helper to actually allocate the tag set, and one that just update the number of queues. Add a local variable for the tag_set to clean up the code a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>