summaryrefslogtreecommitdiff
path: root/drivers/block/null_blk.c
AgeCommit message (Collapse)Author
2017-09-07Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer updates from Jens Axboe: "This is the first pull request for 4.14, containing most of the code changes. It's a quiet series this round, which I think we needed after the churn of the last few series. This contains: - Fix for a registration race in loop, from Anton Volkov. - Overflow complaint fix from Arnd for DAC960. - Series of drbd changes from the usual suspects. - Conversion of the stec/skd driver to blk-mq. From Bart. - A few BFQ improvements/fixes from Paolo. - CFQ improvement from Ritesh, allowing idling for group idle. - A few fixes found by Dan's smatch, courtesy of Dan. - A warning fixup for a race between changing the IO scheduler and device remova. From David Jeffery. - A few nbd fixes from Josef. - Support for cgroup info in blktrace, from Shaohua. - Also from Shaohua, new features in the null_blk driver to allow it to actually hold data, among other things. - Various corner cases and error handling fixes from Weiping Zhang. - Improvements to the IO stats tracking for blk-mq from me. Can drastically improve performance for fast devices and/or big machines. - Series from Christoph removing bi_bdev as being needed for IO submission, in preparation for nvme multipathing code. - Series from Bart, including various cleanups and fixes for switch fall through case complaints" * 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits) kernfs: checking for IS_ERR() instead of NULL drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set drbd: Fix allyesconfig build, fix recent commit drbd: switch from kmalloc() to kmalloc_array() drbd: abort drbd_start_resync if there is no connection drbd: move global variables to drbd namespace and make some static drbd: rename "usermode_helper" to "drbd_usermode_helper" drbd: fix race between handshake and admin disconnect/down drbd: fix potential deadlock when trying to detach during handshake drbd: A single dot should be put into a sequence. drbd: fix rmmod cleanup, remove _all_ debugfs entries drbd: Use setup_timer() instead of init_timer() to simplify the code. drbd: fix potential get_ldev/put_ldev refcount imbalance during attach drbd: new disk-option disable-write-same drbd: Fix resource role for newly created resources in events2 drbd: mark symbols static where possible drbd: Send P_NEG_ACK upon write error in protocol != C drbd: add explicit plugging when submitting batches drbd: change list_for_each_safe to while(list_first_entry_or_null) drbd: introduce drbd_recv_header_maybe_unplug ...
2017-08-29smp: Avoid using two cache lines for struct call_single_dataYing Huang
struct call_single_data is used in IPIs to transfer information between CPUs. Its size is bigger than sizeof(unsigned long) and less than cache line size. Currently it is not allocated with any explicit alignment requirements. This makes it possible for allocated call_single_data to cross two cache lines, which results in double the number of the cache lines that need to be transferred among CPUs. This can be fixed by requiring call_single_data to be aligned with the size of call_single_data. Currently the size of call_single_data is the power of 2. If we add new fields to call_single_data, we may need to add padding to make sure the size of new definition is the power of 2 as well. Fortunately, this is enforced by GCC, which will report bad sizes. To set alignment requirements of call_single_data to the size of call_single_data, a struct definition and a typedef is used. To test the effect of the patch, I used the vm-scalability multiple thread swap test case (swap-w-seq-mt). The test will create multiple threads and each thread will eat memory until all RAM and part of swap is used, so that huge number of IPIs are triggered when unmapping memory. In the test, the throughput of memory writing improves ~5% compared with misaligned call_single_data, because of faster IPIs. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Huang, Ying <ying.huang@intel.com> [ Add call_single_data_t and align with size of call_single_data. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-28null_blk: use available 'dev' in nullb_device_power_store()Jens Axboe
We already have this pointer, no need to use to_nullb_device() again. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28block/nullb: delete unnecessary memory freeShaohua Li
Commit 2984c86(nullb: factor disk parameters) has a typo. The nullb_device allocation/free is done outside of null_add_dev. The commit accidentally frees the nullb_device in error code path. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25block/nullb: fix NULL dereferenceShaohua Li
Dan reported this: The patch 2984c8684f96: "nullb: factor disk parameters" from Aug 14, 2017, leads to the following Smatch complaint: drivers/block/null_blk.c:1759 null_init_tag_set() error: we previously assumed 'nullb' could be null (see line 1750) 1755 set->cmd_size = sizeof(struct nullb_cmd); 1756 set->flags = BLK_MQ_F_SHOULD_MERGE; 1757 set->driver_data = NULL; 1758 1759 if (nullb->dev->blocking) ^^^^^^^^^^^^^^^^^^^^ And an unchecked dereference. nullb could be NULL here. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-25null_blk: update email adressJens Axboe
Update to a working one, the fusionio address hasn't been valid in 4 years. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: badbblocks supportShaohua Li
Sometime disk could have tracks broken and data there is inaccessable, but data in other parts can be accessed in normal way. MD RAID supports such disks. But we don't have a good way to test it, because we can't control which part of a physical disk is bad. For a virtual disk, this can be easily controlled. This patch adds a new 'badblock' attribute. Configure it in this way: echo "+1-100" > xxx/badblock, this will make sector [1-100] as bad blocks. echo "-20-30" > xxx/badblock, this will make sector [20-30] good If badblocks are accessed, the nullb disk will return IO error. Other parts of the disk can accessed in normal way. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: emulate cacheShaohua Li
Software must flush disk cache to guarantee data safety. To check if software correctly does disk cache flush, we must know the behavior of disk. But physical disk behavior is uncontrollable. Even software doesn't do the flush, the disk probably does the flush. This patch tries to emulate a cache in the test disk. All write will go to a cache first, when the cache is full, we then flush some data to disk storage. A flush request will flush all data of the cache to disk storage. A FUA write will write to memory store directly and revalidate data in cache. If there is a power failure (by writing to power attribute, 'echo 0 > disk_name/power'), we discard all data in the cache, but preserve the data in disk storage. Later we can power on the disk again as usual (write 1 to 'power' attribute), then we can check data integrity and very if software does everything correctly. A new attribute 'cache_size' (in MB) is added to configure cache size. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh <kkc6196@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: bandwidth controlShaohua Li
In test, we usually expect controllable disk speed. For example, in a raid array, we'd like some disks are fast and some are slow. MD RAID actually has a feature for this. To test the feature, we'd like to make the disk run in specific speed. block throttling probably can be used for this purpose, but it requires cgroup setup. Here we just implement a simple throttling mechanism in the driver. There is slight fluctuation in the mechanism, but it's good enough for test. To configure the bandwidth cap, user sets the 'mbps' attribute. mbps is MB/s. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh <kkc6196@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: support discardShaohua Li
discard makes sense for memory backed disk. And also it's useful to test if upper layer supports dicard correctly. User configures 'discard' attribute to enable/disable dicard support. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh <kkc6196@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: support memory backed storeShaohua Li
This adds memory backed store in nullb. User configure 'memory_backed' attribute for this. By default, nullb disk doesn't use memory backed store. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh <kkc6196@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: use ida to manage indexShaohua Li
We now dynamically create disks. Managing the disk index with ida to avoid bump up the index too much. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: add interface to power on diskShaohua Li
The device created in nullb configfs interface isn't power on by default. After user configures the device, user can do 'echo 1 > xxx/nullb/device_name/power' to power on the device, which will create a disk. the xxx/nullb/device_name/index is the disk index, so if the index is 2, the new created disk should be named as /dev/nullb2. Note, the 'index' is only valid after disk is power on. 'echo 0 > xxx/nullb/device_name/power' will remove the disk. Note, this doesn't remove the device. To remove the device, user should do 'rmdir xxx/nullb/device_name'. Removing the device will remove the disk too. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: add configfs interfaceShaohua Li
Add configfs interface for nullb. configfs interface is more flexible and easy to configure in a per-disk basis. Configuration is something like this: mount -t configfs none /mnt Checking which features the driver supports: cat /mnt/nullb/features The 'features' attribute is for future extension. We probably will add new features into the driver, userspace can check this attribute to find the supported features. Create/remove a device: mkdir/rmdir /mnt/nullb/a Then configure the device by setting attributes under /mnt/nullb/a, most of nullb supported module parameters are converted to attributes: size; /* device size in MB */ completion_nsec; /* time in ns to complete a request */ submit_queues; /* number of submission queues */ home_node; /* home node for the device */ queue_mode; /* block interface */ blocksize; /* block size */ irqmode; /* IRQ completion handler */ hw_queue_depth; /* queue depth */ use_lightnvm; /* register as a LightNVM device */ blocking; /* blocking blk-mq device */ use_per_node_hctx; /* use per-node allocation for hardware context */ Note, creating a device doesn't create a disk immediately. Creating a disk is done in two phases: create a device and then power on the device. Next patch will introduce device power on. Based on original patch from Kyungchan Koh Signed-off-by: Kyungchan Koh <kkc6196@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23nullb: factor disk parametersShaohua Li
When we switch to configfs interface, each disk could have different configuration. To prepare for the change, we move most disk setting to a separate data structure. The existing module parameter interface is kept. The 'nr_devices' and 'shared_tags' don't make sense for per-disk setting, so they are remained as global settings. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-07null_blk: make sure submit_queues > 0weiping zhang
set submit_queues to 1 by default, and make sure it's value > 0. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-07null_blk: simplify logic for use_per_node_hctxweiping zhang
make sure submit_queues equal nr_online_nodes. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-06null_blk: fix error flow for shared tags during module_initMax Gurtovoy
In case we use shared tags feature, blk_mq_alloc_tag_set might fail during module initialization. In that case, fail the load with a suitable error code. Also move the tagset initialization process after defining the amount of submission queues. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20null_blk: add support for shared tagsJens Axboe
Some storage drivers need to share tag sets between devices. It's useful to be able to model that with null_blk, to find hangs or performance issues. Add a 'shared_tags' bool module parameter that. If that is set to true and nr_devices is bigger than 1, all devices allocated will share the same tag set. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-09blk-mq: switch ->queue_rq return value to blk_status_tChristoph Hellwig
Use the same values for use for request completion errors as the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09block: introduce new block status code typeChristoph Hellwig
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20blk-mq: remove the error argument to blk_mq_complete_requestChristoph Hellwig
Now that all drivers that call blk_mq_complete_requests have a ->complete callback we can remove the direct call to blk_mq_end_request, as well as the error argument to blk_mq_complete_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20null_blk: don't pass always-0 req->errors to blk_mq_complete_requestChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-19null_blk: Use blk_init_request_from_bio() instead of open-coding itBart Van Assche
This patch changes the behavior of the null_blk driver for the LightNVM mode as follows: * REQ_FAILFAST_MASK is set for read-ahead requests. * If no I/O priority has been set in the bio, the I/O priority is copied from the I/O context. * The rq_disk member is initialized if bio->bi_bdev != NULL. * req->errors is initialized to zero. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matias Bjørling <m@bjorling.me> Cc: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-31blk-mq: constify struct blk_mq_opsEric Biggers
Constify all instances of blk_mq_ops, as they are never modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-30null_blk: add blocking modeJens Axboe
This adds a new module parameter to null_blk, blocking. If set, null_blk will set the BLK_MQ_F_BLOCKING flag, indicating that it sometimes/always needs to block in its ->queue_rq() function. The intent is to help find regressions in blocking drivers, since not many of them exist. If null_blk is loaded with submit_queues > 1 and blocking=1, this shows the regression recently fixed by bf4907c05e61. Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-17Merge branch 'for-4.11/next' into for-4.11/linus-mergeJens Axboe
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig
Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: use end_io callback instead of instanceMatias Bjørling
When the lightnvm core had the "gennvm" layer between the device and the target, there was a need for the core to be able to figure out which target it should send an end_io callback to. Leading to a "double" end_io, first for the media manager instance, and then for the target instance. Now that core and gennvm is merged, there is no longer a need for this, and a single end_io callback will do. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: reduce number of nvm_id groups to oneMatias Bjørling
The number of configuration groups has been limited to one in current code, even if there is support for up to four. With the introduction of the open-channel SSD 1.3 specification, only a single group is exposed onwards. Reflect this in the nvm_id structure. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-25ktime: Cleanup ktime_set() usageThomas Gleixner
ktime_set(S,N) was required for the timespec storage type and is still useful for situations where a Seconds and Nanoseconds part of a time value needs to be converted. For anything where the Seconds argument is 0, this is pointless and can be replaced with a simple assignment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
2016-11-16null_blk: add usage hints for NVMYasuaki Ishimatsu
If CONFIG_NVM is disabled, loading null_block module with use_lightnvm=1 fails. But there are no messages and documents related to the failure. Add the appropriate error message. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Massaged the text a bit. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-09Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull blk-mq irq/cpu mapping updates from Jens Axboe: "This is the block-irq topic branch for 4.9-rc. It's mostly from Christoph, and it allows drivers to specify their own mappings, and more importantly, to share the blk-mq mappings with the IRQ affinity mappings. It's a good step towards making this work better out of the box" * 'for-4.9/block-irq' of git://git.kernel.dk/linux-block: blk_mq: linux/blk-mq.h does not include all the headers it depends on blk-mq: kill unused blk_mq_create_mq_map() blk-mq: get rid of the cpumask in struct blk_mq_tags nvme: remove the post_scan callout nvme: switch to use pci_alloc_irq_vectors blk-mq: provide a default queue mapping for PCI device blk-mq: allow the driver to pass in a queue mapping blk-mq: remove ->map_queue blk-mq: only allocate a single mq_map per tag_set blk-mq: don't redistribute hardware queues on a CPU hotplug event
2016-09-21lightnvm: control life of nvm_dev in driverMatias Bjørling
LightNVM compatible device drivers does not have a method to expose LightNVM specific sysfs entries. To enable LightNVM sysfs entries to be exposed, lightnvm device drivers require a struct device to attach it to. To allow both the actual device driver and lightnvm sysfs entries to coexist, the device driver tracks the lifetime of the nvm_dev structure. This patch refactors NVMe and null_blk to handle the lifetime of struct nvm_dev, which eliminates the need for struct gendisk when a lightnvm compatible device is provided. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-21null_blk: refactor to support non-gendisk devicesMatias Bjørling
With LightNVM enabled devices, the gendisk structure is not exposed to the user. This hides the device driver specific sysfs entries, and prevents binding of LightNVM geometry information to the device. Refactor the device registration process, so that gendisk and non-gendisk devices are easily managed. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15blk-mq: remove ->map_queueChristoph Hellwig
All drivers use the default, so provide an inline version of it. If we ever need other queue mapping we can add an optional method back, although supporting will also require major changes to the queue setup code. This provides better code generation, and better debugability as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20block: get rid of bio_rw and READAChristoph Hellwig
These two are confusing leftover of the old world order, combining values of the REQ_OP_ and REQ_ namespaces. For callers that don't special case we mostly just replace bi_rw with bio_data_dir or op_is_write, except for the few cases where a switch over the REQ_OP_ values makes more sense. Any check for READA is replaced with an explicit check for REQ_RAHEAD. Also remove the READA alias for REQ_RAHEAD. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18null_blk: add lightnvm null_blk device to the nullb_listWenwei Tao
After register null_blk devices into lightnvm, we forget to add these devices to the the nullb_list, makes them invisible to the null_blk driver. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Fixes: a514379b0c77 ("null_blk: oops when initializing without lightnvm") Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-11null_blk: oops when initializing without lightnvmMatias Bjørling
If the LightNVM subsystem is not compiled into the kernel, and the null_blk device driver requests lightnvm to be initialized. The call to nvm_register fails and the null_add_dev function cleans up the initialization. However, at this point the null block device has already been added to the nullb_list and thus a second cleanup will occur when the function has returned, that leads to a double call to blk_cleanup_queue. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04lightnvm: allow to force mm initializationMatias Bjørling
System block allows the device to initialize with its configured media manager. The system blocks is written to disk, and read again when media manager is determined. For this to work, the backend must store the data. Device drivers, such as null_blk, does not have any backend storage. This patch allows the media manager to be initialized without a storage backend. It also fix incorrect configuration of capabilities in null_blk, as it does not support get/set bad block interface. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-21Merge branch 'for-4.5/lightnvm' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull lightnvm fixes and updates from Jens Axboe: "This should have been part of the drivers branch, but it arrived a bit late and wasn't based on the official core block driver branch. So they got a small scolding, but got a pass since it's still new. Hence it's in a separate branch. This is mostly pure fixes, contained to lightnvm/, and minor feature additions" * 'for-4.5/lightnvm' of git://git.kernel.dk/linux-block: (26 commits) lightnvm: ensure that nvm_dev_ops can be used without CONFIG_NVM lightnvm: introduce factory reset lightnvm: use system block for mm initialization lightnvm: introduce ioctl to initialize device lightnvm: core on-disk initialization lightnvm: introduce mlc lower page table mappings lightnvm: add mccap support lightnvm: manage open and closed blocks separately lightnvm: fix missing grown bad block type lightnvm: reference rrpc lun in rrpc block lightnvm: introduce nvm_submit_ppa lightnvm: move rq->error to nvm_rq->error lightnvm: support multiple ppas in nvm_erase_ppa lightnvm: move the pages per block check out of the loop lightnvm: sectors first in ppa list lightnvm: fix locking and mempool in rrpc_lun_gc lightnvm: put block back to gc list on its reclaim fail lightnvm: check bi_error in gc lightnvm: return the get_bb_tbl return value lightnvm: refactor end_io functions for sync ...
2016-01-21Merge branch 'for-4.5/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: "This is the block driver pull request for 4.5, with the exception of NVMe, which is in a separate branch and will be posted after this one. This pull request contains: - A set of bcache stability fixes, which have been acked by Kent. These have been used and tested for more than a year by the community, so it's about time that they got in. - A set of drbd updates from the drbd team (Andreas, Lars, Philipp) and Markus Elfring, Oleg Drokin. - A set of fixes for xen blkback/front from the usual suspects, (Bob, Konrad) as well as community based fixes from Kiri, Julien, and Peng. - A 2038 time fix for sx8 from Shraddha, with a fix from me. - A small mtip32xx cleanup from Zhu Yanjun. - A null_blk division fix from Arnd" * 'for-4.5/drivers' of git://git.kernel.dk/linux-block: (71 commits) null_blk: use sector_div instead of do_div mtip32xx: restrict variables visible in current code module xen/blkfront: Fix crash if backend doesn't follow the right states. xen/blkback: Fix two memory leaks. xen/blkback: make st_ statistics per ring xen/blkfront: Handle non-indirect grant with 64KB pages xen-blkfront: Introduce blkif_ring_get_request xen-blkback: clear PF_NOFREEZE for xen_blkif_schedule() xen/blkback: Free resources if connect_ring failed. xen/blocks: Return -EXX instead of -1 xen/blkback: make pool of persistent grants and free pages per-queue xen/blkback: get the number of hardware queues/rings from blkfront xen/blkback: pseudo support for multi hardware queues/rings xen/blkback: separate ring information out of struct xen_blkif xen/blkfront: correct setting for xen_blkif_max_ring_order xen/blkfront: make persistent grants pool per-queue xen/blkfront: Remove duplicate setting of ->xbdev. xen/blkfront: Cleanup of comments, fix unaligned variables, and syntax errors. xen/blkfront: negotiate number of queues/rings to be used with backend xen/blkfront: split per device io_lock ...
2016-01-19Merge branch 'for-4.5/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "We don't have a lot of core changes this time around, it's mostly in drivers, which will come in a subsequent pull. The cores changes include: - blk-mq - Prep patch from Christoph, changing blk_mq_alloc_request() to take flags instead of just using gfp_t for sleep/nosleep. - Doc patch from me, clarifying the difference between legacy and blk-mq for timer usage. - Fixes from Raghavendra for memory-less numa nodes, and a reuse of CPU masks. - Cleanup from Geliang Tang, using offset_in_page() instead of open coding it. - From Ilya, rename request_queue slab to it reflects what it holds, and a fix for proper use of bdgrab/put. - A real fix for the split across stripe boundaries from Keith. We yanked a broken version of this from 4.4-rc final, this one works. - From Mike Krinkin, emit a trace message when we split. - From Wei Tang, two small cleanups, not explicitly clearing memory that is already cleared" * 'for-4.5/core' of git://git.kernel.dk/linux-block: block: use bd{grab,put}() instead of open-coding block: split bios to max possible length block: add call to split trace point blk-mq: Avoid memoryless numa node encoded in hctx numa_node blk-mq: Reuse hardware context cpumask for tags blk-mq: add a flags parameter to blk_mq_alloc_request Revert "blk-flush: Queue through IO scheduler when flush not required" block: clarify blk_add_timer() use case for blk-mq bio: use offset_in_page macro block: do not initialise statics to 0 or NULL block: do not initialise globals to 0 or NULL block: rename request_queue slab cache
2016-01-13null_blk: use sector_div instead of do_divArnd Bergmann
Dividing a sector_t number should be done using sector_div rather than do_div to optimize the 32-bit sector_t case, and with the latest do_div optimizations, we now get a compile-time warning for this: arch/arm/include/asm/div64.h:32:95: note: expected 'uint64_t * {aka long long unsigned int *}' but argument is of type 'sector_t * {aka long unsigned int *}' drivers/block/null_blk.c:521:81: warning: comparison of distinct pointer types lacks a cast This changes the newly added code to use sector_div. It is a simplified version of the original patch, as Linus Torvalds pointed out that we should not be using an expensive division function in the first place. This version was suggested by Matias Bjorling. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Matias Bjorling <m@bjorling.me> Fixes: b2b7e00148a2 ("null_blk: register as a LightNVM device") Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12lightnvm: refactor end_io functions for syncMatias Bjørling
To implement sync I/O support within the LightNVM core, the end_io functions are refactored to take an end_io function pointer instead of testing for initialized media manager, followed by calling its end_io function. Sync I/O can then be implemented using a callback that signal I/O completion. This is similar to the logic found in blk_to_execute_io(). By implementing it this way, the underlying device I/Os submission logic is abstracted away from core, targets, and media managers. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-28null_blk: use async queue restart helperJens Axboe
If null_blk is run in NULL_IRQ_TIMER mode and with queue_mode NULL_Q_RQ, we need to restart the queue from the hrtimer interrupt. We can't directly invoke the request_fn from that context, so punt the queue run to async kblockd context. Tested-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-22null_blk: fix use-after-free errorMike Krinkin
blk_end_request_all may free request, so we need to save request_queue pointer before blk_end_request_all call. The problem was introduced in commit cf8ecc5a8455266f8d51 ("null_blk: guarantee device restart in all irq modes") and causes general protection fault with slab poisoning enabled. Fixes: cf8ecc5a8455266f8d51 ("null_blk: guarantee device restart in all irq modes") Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-08null_blk: Fix error path in module initializationMinfei Huang
Module couldn't release resource properly during the initialization. To fix this issue, we will clean up the proper resource before returning. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: replace req queue with nvmdev for lldMatias Bjørling
In the case where a request queue is passed to the low lever lightnvm device drive integration, the device driver might pass its admin commands through another queue. Instead pass nvm_dev, and let the low level drive the appropriate queue. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-01blk-mq: add a flags parameter to blk_mq_alloc_requestChristoph Hellwig
We already have the reserved flag, and a nowait flag awkwardly encoded as a gfp_t. Add a real flags argument to make the scheme more extensible and allow for a nicer calling convention. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>