summaryrefslogtreecommitdiff
path: root/drivers/nvme/host/tcp.c
AgeCommit message (Collapse)Author
2019-11-04nvme: introduce nvme_is_aen_req functionIsrael Rukshin
This function improves code readability and reduces code duplication. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-25nvme-tcp: fix wrong stop condition in io_workWunderlich, Mark
Allow the do/while statement to continue if current time is not after the proposed time 'deadline'. Intent is to allow loop to proceed for a specific time period. Currently the loop, as coded, will exit after first pass. Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-09-12nvme: tcp: remove redundant assignment to variable retColin Ian King
The variable ret is being initialized with a value that is never read and is being re-assigned immediately afterwards. The assignment is redundant and hence can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-09-12nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failedSagi Grimberg
This is a more appropriate error status for a transport error detected by us (the host). Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme: make fabrics command run on a separate request queueSagi Grimberg
We have a fundamental issue that fabric commands use the admin_q. The reason is, that admin-connect, register reads and writes and admin commands cannot be guaranteed ordering while we are running controller resets. For example, when we reset a controller we perform: 1. disable the controller 2. teardown the admin queue 3. re-establish the admin queue 4. enable the controller In order to perform (3), we need to unquiesce the admin queue, however we may have some admin commands that are already pending on the quiesced admin_q and will immediate execute when we unquiesce it before we execute (4). The host must not send admin commands to the controller before enabling the controller. To fix this, we have the fabric commands (admin connect and property get/set, but not I/O queue connect) use a separate fabrics_q and make sure to quiesce the admin_q before we disable the controller, and unquiesce it only after we enable the controller. This fixes the error prints from nvmet in a controller reset storm test: kernel: nvmet: got cmd 6 while CC.EN == 0 on qid = 0 Which indicate that the host is sending an admin command when the controller is not enabled. Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme-tcp: Add TOS for tcp transportIsrael Rukshin
TOS provide clients the ability to segregate traffic flows for different type of data. One of the TOS usage is bandwidth management which allows setting bandwidth limits for QoS classes, e.g. 80% bandwidth to controllers at QoS class A and 20% to controllers at QoS class B. usage examples: nvme connect --tos=0 --transport=tcp --traddr=10.0.1.1 --nqn=test-nvme Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme-tcp: Use struct nvme_ctrl directlyIsrael Rukshin
This patch doesn't change any functionality. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme-tcp: support simple pollingSagi Grimberg
Simple polling support via socket busy_poll interface. Although we do not shutdown interrupts but simply hammer the socket poll, we can sometimes find completions faster than the normal interrupt driven RX path. We add per queue nr_cqe counter that resets every time RX path is invoked such that .poll callback can return it to stay consistent with the semantics. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme: don't pass cap to nvme_disable_ctrlSagi Grimberg
All seem to call it with ctrl->cap so no need to pass it at all. Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme: move sqsize setting to the coreSagi Grimberg
nvme_enable_ctrl reads the cap register right after, so no need to do that locally in the transport driver. Have sqsize setting in nvme_init_identify. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme-tcp: Use protocol specific operations while reading socketPotnuri Bharat Teja
Using socket specific read_sock() calls instead of directly calling tcp_read_sock() helps lld module registered handlers if any, to be called from nvme-tcp host. This patch therefore replaces the tcp_read_sock() with socket specific prot_ops. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-29nvme-tcp: cleanup nvme_tcp_recv_pduSagi Grimberg
Can return directly in the switch statement Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-08-04nvme: wait until all completed request's complete fn is calledMing Lei
When aborting in-flight request for recovering controller, we have to make sure that queue's complete function is called on completed request before moving on. Otherwise, for example, the warning of WARN_ON_ONCE(qp->mrs_used > 0) in ib_destroy_qp_user() may be triggered on nvme-rdma. Fix this issue by using blk_mq_tagset_wait_completed_request. Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-09nvme-tcp: don't use sendpage for SLAB pagesMikhail Skorzhinskii
According to commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects") and previous discussion, tcp_sendpage should not be used for pages that is managed by SLAB, as SLAB is not taking page reference counters into consideration. Signed-off-by: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-30nvme-tcp: fix queue mapping when queue count is limitedSagi Grimberg
When the controller supports less queues than requested, we should make sure that queue mapping does the right thing and not assume that all queues are available. This fixes a crash when the controller supports less queues than requested. The rules are: 1. if no write queues are requested, we assign the available queues to the default queue map. The default and read queue maps share the existing queues. 2. if write queues are requested: - first make sure that read queue map gets the requested nr_io_queues count - then grant the default queue map the minimum between the requested nr_write_queues and the remaining queues. If there are no available queues to dedicate to the default queue map, fallback to (1) and share all the queues in the existing queue map. Also, provide a log indication on how we constructed the different queue maps. Reported-by: Harris, James R <james.r.harris@intel.com> Tested-by: Jim Harris <james.r.harris@intel.com> Cc: <stable@vger.kernel.org> # v5.0+ Suggested-by: Roy Shterman <roys@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-05-01nvme-tcp: fix possible null deref on a timed out io queue connectSagi Grimberg
If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-04-25nvme-tcp: rename function to have nvme_tcp prefixSagi Grimberg
usually nvme_ prefix is for core functions. While we're cleaning up, remove redundant empty lines Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-04-25nvme-tcp: fix a NULL deref when an admin connect times outSagi Grimberg
If we timeout the admin startup sequence we might not yet have an I/O tagset allocated which causes the teardown sequence to crash. Make nvme_tcp_teardown_io_queues safe by not iterating inflight tags if the tagset wasn't allocated. Fixes: 39d57757467b ("nvme-tcp: fix timeout handler") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-03-28nvme-tcp: fix an endianess miss-annotationChristoph Hellwig
nvme_tcp_end_request just takes the status value and the converts it to little endian as well as shifting for the phase bit. Fixes: 43ce38a6d823 ("nvme-tcp: support C2HData with SUCCESS flag") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2019-03-13nvme-tcp: support C2HData with SUCCESS flagSagi Grimberg
A C2HData PDU with the SUCCESS flag set indicates that the I/O was completed by the controller successfully and means that a subsequent completion response capsule PDU will be ommitted. If we see this flag, fisrt we check that LAST_PDU flag is set as well, and then we complete the request when the data transfer (and data digest verification if its on) is done. While we're at it, reuse a bit of code with nvme_fail_request. Reported-by: Steve Blightman <steve.blightman@oracle.com> Suggested-by: Oliver Smith-Denny <osmithde@cisco.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Oliver Smith-Denny <osmithde@cisco.com> Tested-by: Oliver Smith-Denny <osmithde@cisco.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-04nvme: remove the .stop_ctrl calloutSagi Grimberg
It is used now just to flush error recovery and reconnect work items in the RDMA and TCP transports, which can simply be moved to the corresponding teardown routines. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-23nvme-tcp: fix timeout handlerSagi Grimberg
Currently, we have several problems with the timeout handler: 1. If we timeout on the controller establishment flow, we will hang because we don't execute the error recovery (and we shouldn't because the create_ctrl flow needs to fail and cleanup on its own) 2. We might also hang if we get a disconnet on a queue while the controller is already deleting. This racy flow can cause the controller disable/shutdown admin command to hang. We cannot complete a timed out request from the timeout handler without mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work). So we serialize it in the timeout handler and teardown io and admin queues to guarantee that no one races with us from completing the request. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-09nvme-tcp: don't ask if controller is fabricsSagi Grimberg
For sure we are a fabric driver. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-09nvme-tcp: remove dead codeSagi Grimberg
We should never touch the opal device from the transport driver. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-fabrics: allow nvmf_connect_io_queue to pollSagi Grimberg
Preparation for polling support for fabrics. Polling support means that our completion queues are not generating any interrupts which means we need to poll for the nvmf io queue connect as well. Reviewed by Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"Colin Ian King
There is a spelling mistake in a dev_info message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-tcp: fix endianess annotationsChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-13nvme-tcp: support separate queue maps for read and writeSagi Grimberg
Allow NVMF_OPT_NR_WRITE_QUEUES to describe additional write queues. In addition, implement .map_queues that will apply 2 queue maps for read and write queue sets. Note that with the separate queue map, HCTX_TYPE_READ will always use nr_io_queues and HCTX_TYPE_DEFAULT will use nr_write_queues. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvme-tcp: add NVMe over TCP host driverSagi Grimberg
This patch implements the NVMe over TCP host driver. It can be used to connect to remote NVMe over Fabrics subsystems over good old TCP/IP. The driver implements the TP 8000 of how nvme over fabrics capsules and data are encapsulated in nvme-tcp pdus and exchaged on top of a TCP byte stream. nvme-tcp header and data digest are supported as well. To connect to all NVMe over Fabrics controllers reachable on a given taget port over TCP use the following command: nvme connect-all -t tcp -a $IPADDR This requires the latest version of nvme-cli with TCP support. Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com> Signed-off-by: Roy Shterman <roys@lightbitslabs.com> Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com> Signed-off-by: Christoph Hellwig <hch@lst.de>