summaryrefslogtreecommitdiff
path: root/drivers/md
AgeCommit message (Collapse)Author
2020-07-16raid5: remove the meaningless check in raid5_make_requestGuoqing Jiang
We can't guarntee the batched stripe to be set with STRIPE_HANDLE since there are lots of functions could set the flag, such as sync_request, ops_complete_* and end_{read,write}_request etc. Also clear_batch_ready called in handle_stripe ensures the batched list can't continue to be handled by handle_stripe. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-16raid5: put the comment of clear_batch_ready to the right placeGuoqing Jiang
To make people understand the function well, let's put the comment to the right place. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-16raid5: call clear_batch_ready before set STRIPE_ACTIVEGuoqing Jiang
We tried to only put the head sh of batch list to handle_list, then the handle_stripe doesn't handle other members in the batch list. However, we still got the calltrace in break_stripe_batch_list. [593764.644269] stripe state: 2003 kernel: [593764.644299] ------------[ cut here ]------------ kernel: [593764.644308] WARNING: CPU: 12 PID: 856 at drivers/md/raid5.c:4625 break_stripe_batch_list+0x203/0x240 [raid456] [...] kernel: [593764.644363] Call Trace: kernel: [593764.644370] handle_stripe+0x907/0x20c0 [raid456] kernel: [593764.644376] ? __wake_up_common_lock+0x89/0xc0 kernel: [593764.644379] handle_active_stripes.isra.57+0x35f/0x570 [raid456] kernel: [593764.644382] ? raid5_wakeup_stripe_thread+0x96/0x1f0 [raid456] kernel: [593764.644385] raid5d+0x480/0x6a0 [raid456] kernel: [593764.644390] ? md_thread+0x11f/0x160 kernel: [593764.644392] md_thread+0x11f/0x160 kernel: [593764.644394] ? wait_woken+0x80/0x80 kernel: [593764.644396] kthread+0xfc/0x130 kernel: [593764.644398] ? find_pers+0x70/0x70 kernel: [593764.644399] ? kthread_create_on_node+0x70/0x70 kernel: [593764.644401] ret_from_fork+0x1f/0x30 As we can see, the stripe was set with STRIPE_ACTIVE and STRIPE_HANDLE, and only handle_stripe could set those flags then return. And since the stipe was already in the batch list, we need to return earlier before set the two flags. And after dig a little about git history especially commit 3664847d95e6 ("md/raid5: fix a race condition in stripe batch"), it seems the batched stipe still could be handled by handle_stipe, then handle_stipe needs to return earlier if clear_batch_ready to return true. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-16md: rewrite md_setup_drive to avoid ioctlsChristoph Hellwig
md_setup_drive knows it works with md devices, so it is rather pointless to open a file descriptor and issue ioctls. Just call directly into the relevant low-level md routines after getting a handle to the device using blkdev_get_by_dev instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: NeilBrown <neilb@suse.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16md: simplify md_setup_driveChristoph Hellwig
Move the loop over the possible arrays into the caller to remove a level of indentation for the whole function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: NeilBrown <neilb@suse.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16md: remove the kernel version of md_u.hChristoph Hellwig
mdp_major can just move to drivers/md/md.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16md: remove the autoscan partition re-readChristoph Hellwig
devfs is long gone, and autoscan works just fine without this these days. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: NeilBrown <neilb@suse.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16md: replace the RAID_AUTORUN ioctl with a direct function callChristoph Hellwig
Instead of using a spcial RAID_AUTORUN ioctl that only exists for non-modular builds and is only called from the early init code, just call the actual function directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: NeilBrown <neilb@suse.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16md: move the early init autodetect code to drivers/md/Christoph Hellwig
Just like the NFS and CIFS root code this better lives with the driver it is tightly integrated with. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-15md: raid10: Fix compilation warningDamien Le Moal
Remove the if statement around the call to sysfs_link_rdev() in raid10_start_reshape() to avoid the compilation warning: warning: suggest braces around empty body in an ‘if’ statement when compiling with W=1. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-15md: raid5: Fix compilation warningDamien Le Moal
Remove the if statement around the calls to sysfs_link_rdev() to avoid the compilation warning "suggest braces around empty body in an ‘if’ statement" when compiling with W=1. Also fix function description comments to avoid kdoc format warnings. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-15md: raid5-cache: Remove set but unused variableDamien Le Moal
Remove the variable offset in r5c_tree_index() to avoid a "set but not used" compilation warning when compiling with W=1. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-15md: Fix compilation warningDamien Le Moal
Remove the if statement around the calls to sysfs_link_rdev() to avoid the compilation warnings: warning: suggest braces around empty body in an ‘if’ statement when compiling with W=1. For the call to sysfs_create_link() generating the same warning, use the err variable to store the function result, avoiding triggering another warning as the function is declared as 'warn_unused_result'. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-16libnvdimm/nvdimm/flush: Allow architecture to override the flush barrierAneesh Kumar K.V
Architectures like ppc64 provide persistent memory specific barriers that will ensure that all stores for which the modifications are written to persistent storage by preceding dcbfps and dcbstps instructions have updated persistent storage before any data access or data transfer caused by subsequent instructions is initiated. This is in addition to the ordering done by wmb() Update nvdimm core such that architecture can use barriers other than wmb to ensure all previous writes are architecturally visible for the platform buffer flush. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200701072235.223558-5-aneesh.kumar@linux.ibm.com
2020-07-14md-cluster: fix wild pointer of unlock_all_bitmaps()Zhao Heming
reproduction steps: ``` node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb node2 # mdadm -A /dev/md0 /dev/sda /dev/sdb node1 # mdadm -G /dev/md0 -b none mdadm: failed to remove clustered bitmap. node1 # mdadm -S --scan ^C <==== mdadm hung & kernel crash ``` kernel stack: ``` [ 335.230657] general protection fault: 0000 [#1] SMP NOPTI [...] [ 335.230848] Call Trace: [ 335.230873] ? unlock_all_bitmaps+0x5/0x70 [md_cluster] [ 335.230886] unlock_all_bitmaps+0x3d/0x70 [md_cluster] [ 335.230899] leave+0x10f/0x190 [md_cluster] [ 335.230932] ? md_super_wait+0x93/0xa0 [md_mod] [ 335.230947] ? leave+0x5/0x190 [md_cluster] [ 335.230973] md_cluster_stop+0x1a/0x30 [md_mod] [ 335.230999] md_bitmap_free+0x142/0x150 [md_mod] [ 335.231013] ? _cond_resched+0x15/0x40 [ 335.231025] ? mutex_lock+0xe/0x30 [ 335.231056] __md_stop+0x1c/0xa0 [md_mod] [ 335.231083] do_md_stop+0x160/0x580 [md_mod] [ 335.231119] ? 0xffffffffc05fb078 [ 335.231148] md_ioctl+0xa04/0x1930 [md_mod] [ 335.231165] ? filename_lookup+0xf2/0x190 [ 335.231179] blkdev_ioctl+0x93c/0xa10 [ 335.231205] ? _cond_resched+0x15/0x40 [ 335.231214] ? __check_object_size+0xd4/0x1a0 [ 335.231224] block_ioctl+0x39/0x40 [ 335.231243] do_vfs_ioctl+0xa0/0x680 [ 335.231253] ksys_ioctl+0x70/0x80 [ 335.231261] __x64_sys_ioctl+0x16/0x20 [ 335.231271] do_syscall_64+0x65/0x1f0 [ 335.231278] entry_SYSCALL_64_after_hwframe+0x44/0xa9 ``` Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-14md/raid5-cache: clear MD_SB_CHANGE_PENDING before flushing stripesSong Liu
In recovery, if we process too much data, raid5-cache may set MD_SB_CHANGE_PENDING, which causes spinning in handle_stripe(). Fix this issue by clearing the bit before flushing data only stripes. This issue was initially discussed in [1]. [1] https://www.spinics.net/lists/raid/msg64409.html Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-14md: fix deadlock causing by sysfs_notifyJunxiao Bi
The following deadlock was captured. The first process is holding 'kernfs_mutex' and hung by io. The io was staging in 'r1conf.pending_bio_list' of raid1 device, this pending bio list would be flushed by second process 'md127_raid1', but it was hung by 'kernfs_mutex'. Using sysfs_notify_dirent_safe() to replace sysfs_notify() can fix it. There were other sysfs_notify() invoked from io path, removed all of them. PID: 40430 TASK: ffff8ee9c8c65c40 CPU: 29 COMMAND: "probe_file" #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06 #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6 #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs] #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs] #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs] #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs] #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs] #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs] #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7 #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93 #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968 #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2 #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445 #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a #16 [ffffb87c4df37958] new_slab at ffffffff9a251430 #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85 #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89 #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10 #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854 #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377 #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118 #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83 #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619 #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566 #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787 #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949 #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad RIP: 00007f617a5f2905 RSP: 00007f607334f838 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007f6064044b20 RCX: 00007f617a5f2905 RDX: 00007f6064044b20 RSI: 00007f6064044b20 RDI: 00007f6064005890 RBP: 00007f6064044aa0 R8: 0000000000000030 R9: 000000000000011c R10: 0000000000000013 R11: 0000000000000246 R12: 00007f606417e6d0 R13: 00007f6064044aa0 R14: 00007f6064044b10 R15: 00000000ffffffff ORIG_RAX: 0000000000000006 CS: 0033 SS: 002b PID: 927 TASK: ffff8f15ac5dbd80 CPU: 42 COMMAND: "md127_raid1" #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06 #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013 #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83 #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696 #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5 #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1] #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348 #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005 #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344 Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-13md: improve io stats accountingArtur Paszkiewicz
Use generic io accounting functions to manage io stats. There was an attempt to do this earlier in commit 18c0b223cf99 ("md: use generic io stats accounting functions to simplify io stat accounting"), but it did not include a call to generic_end_io_acct() and caused issues with tracking in-flight IOs, so it was later removed in commit 74672d069b29 ("md: fix md io stats accounting broken"). This patch attempts to fix this by using both disk_start_io_acct() and disk_end_io_acct(). To make it possible, a struct md_io is allocated for every new md bio, which includes the io start_time. A new mempool is introduced for this purpose. We override bio->bi_end_io with our own callback and call disk_start_io_acct() before passing the bio to md_handle_request(). When it completes, we call disk_end_io_acct() and the original bi_end_io callback. This adds correct statistics about in-flight IOs and IO processing time, interpreted e.g. in iostat as await, svctm, aqu-sz and %util. It also fixes a situation where too many IOs where reported if a bio was re-submitted to the mddev, because io accounting is now performed only on newly arriving bios. Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-13md: raid0/linear: fix dereference before null check on pointer mddevColin Ian King
Pointer mddev is being dereferenced with a test_bit call before mddev is being null checked, this may cause a null pointer dereference. Fix this by moving the null pointer checks to sanity check mddev before it is dereferenced. Addresses-Coverity: ("Dereference before null check") Fixes: 62f7b1989c02 ("md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-13dm verity: add "panic_on_corruption" error handling modeJeongHyeon Lee
Samsung smart phones may need the ability to panic on corruption. Not all devices provide the bootloader support needed to use the existing "restart_on_corruption" mode. Additional details for why Samsung needs this new mode can be found here: https://www.redhat.com/archives/dm-devel/2020-June/msg00235.html Signed-off-by: jhs2.lee <jhs2.lee@samsung.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: use double checked locking in fast pathMike Snitzer
Fast-path code biased toward lazy acknowledgement of bit being set (primarily only for initialization). Multipath code is very retry oriented so even if state is missed it'll recover. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: rename current_pgpath to pgpath in multipath_prepare_ioctlMike Snitzer
Makes consistent with __map_bio() and multipath_clone_and_map(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: rework __map_bio()Mike Snitzer
so that it follows same pattern as request-based multipath_clone_and_map() Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: factor out multipath_queue_bioMike Snitzer
Enables further cleanup. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: push locking down to must_push_back_rq()Mike Snitzer
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: take m->lock spinlock when testing QUEUE_IF_NO_PATHMike Snitzer
Fix multipath_end_io, multipath_end_io_bio and multipath_busy to take m->lock while testing if MPATHF_QUEUE_IF_NO_PATH bit is set. These are all slow-path cases when no paths are available so extra locking isn't a performance hit. Correctness matters most. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-13dm mpath: changes from initial m->flags locking auditMike Snitzer
Fix locking in slow-paths where m->lock should be taken. Signed-off-by: Mike Snitzer <snitzer@rredhat.com>
2020-07-08writeback: remove bdi->congested_fnChristoph Hellwig
Except for pktdvd, the only places setting congested bits are file systems that allocate their own backing_dev_info structures. And pktdvd is a deprecated driver that isn't useful in stack setup either. So remove the dead congested_fn stacking infrastructure. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Acked-by: David Sterba <dsterba@suse.com> [axboe: fixup unused variables in bcache/request.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08writeback: remove struct bdi_writeback_congestedChristoph Hellwig
We never set any congested bits in the group writeback instances of it. And for the simpler bdi-wide case a simple scalar field is all that that is needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08md: switch to ->check_events for media change notificationsChristoph Hellwig
md is the last driver using the legacy media_changed method. Switch it over to (not so) new ->clear_events approach, which also removes the need for the ->revalidate_disk method. Signed-off-by: Christoph Hellwig <hch@lst.de> [axboe: remove unused 'bdops' variable in disk_clear_events()] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08dm: use noio when sending kobject eventMikulas Patocka
kobject_uevent may allocate memory and it may be called while there are dm devices suspended. The allocation may recurse into a suspended device, causing a deadlock. We must set the noio flag when sending a uevent. The observed deadlock was reported here: https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html Reported-by: Khazhismel Kumykov <khazhy@google.com> Reported-by: Tahsin Erdogan <tahsin@google.com> Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm zoned: Fix zone reclaim triggerDamien Le Moal
Only triggering reclaim based on the percentage of unmapped cache zones can fail to detect cases where reclaim is needed, e.g. if the target has only 2 or 3 cache zones and only one unmapped cache zone, the percentage of free cache zones is higher than DMZ_RECLAIM_LOW_UNMAP_ZONES (30%) and reclaim does not trigger. This problem, combined with the fact that dmz_schedule_reclaim() is called from dmz_handle_bio() without the map lock held, leads to a race between zone allocation and dmz_should_reclaim() result. Depending on the workload applied, this race can lead to the write path waiting forever for a free zone without reclaim being triggered. Fix this by moving dmz_schedule_reclaim() inside dmz_alloc_zone() under the map lock. This results in checking the need for zone reclaim whenever a new data or buffer zone needs to be allocated. Also fix dmz_reclaim_percentage() to always return 0 if the number of unmapped cache (or random) zones is less than or equal to 1. Suggested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm zoned: fix unused but set variable warningsWei Yongjun
Fix unused but set variable warnings: drivers/md/dm-zoned-reclaim.c:504:42: warning: variable nr_rnd set but not used [-Wunused-but-set-variable] 504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0; | ^~~~~~ drivers/md/dm-zoned-reclaim.c:504:24: warning: variable nr_unmap_rnd set but not used [-Wunused-but-set-variable] 504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0; | ^~~~~~~~~~~~ Fixes: f97809aec589 ("dm zoned: per-device reclaim") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm writecache: reject asynchronous pmem devicesMichal Suchanek
DM writecache does not handle asynchronous pmem. Reject it when supplied as cache. Link: https://lore.kernel.org/linux-nvdimm/87lfk5hahc.fsf@linux.ibm.com/ Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm: use bio_uninit instead of bio_disassociate_blkgChristoph Hellwig
bio_uninit is the proper API to clean up a BIO that has been allocated on stack or inside a structure that doesn't come from the BIO allocator. Switch dm to use that instead of bio_disassociate_blkg, which really is an implementation detail. Note that the bio_uninit calls are also moved to the two callers of __send_empty_flush, so that they better pair with the bio_init calls used to initialize them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08Merge tag 'v5.8-rc4' into for-5.9/driversJens Axboe
Merge in 5.8-rc4 for-5.9/block to setup for-5.9/drivers, to provide a clean base and making the life for the NVMe changes easier. Signed-off-by: Jens Axboe <axboe@kernel.dk> * tag 'v5.8-rc4': (732 commits) Linux 5.8-rc4 x86/ldt: use "pr_info_once()" instead of open-coding it badly MIPS: Do not use smp_processor_id() in preemptible code MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen .gitignore: Do not track `defconfig` from `make savedefconfig` io_uring: fix regression with always ignoring signals in io_cqring_wait() x86/ldt: Disable 16-bit segments on Xen PV x86/entry/32: Fix #MC and #DB wiring on x86_32 x86/entry/xen: Route #DB correctly on Xen PV x86/entry, selftests: Further improve user entry sanity checks x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER i2c: mlxcpld: check correct size of maximum RECV_LEN packet i2c: add Kconfig help text for slave mode i2c: slave-eeprom: update documentation i2c: eg20t: Load module automatically if ID matches i2c: designware: platdrv: Set class based on DMI i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665 mm/page_alloc: fix documentation error vmalloc: fix the owner argument for the new __vmalloc_node_range callers mm/cma.c: use exact_nid true to fix possible per-numa cma leak ...
2020-07-07dm: do not use waitqueue for request-based DMMing Lei
Given request-based DM now uses blk-mq's blk_mq_queue_inflight() to determine if outstanding IO has completed (and DM has no control over the blk-mq state machine used to track outstanding IO) it is unsafe to wakeup waiter (dm_wait_for_completion) before blk-mq has cleared a request's state bits (e.g. MQ_RQ_IN_FLIGHT or MQ_RQ_COMPLETE). As such dm_wait_for_completion() could be left to wait indefinitely if no other requests complete. Fix this by eliminating request-based DM's use of waitqueue to wait for blk-mq requests to complete in dm_wait_for_completion. Signed-off-by: Ming Lei <ming.lei@redhat.com> Depends-on: 3c94d83cb3526 ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-05Replace HTTP links with HTTPS ones: LVMAlexander A. Klimov
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Link: https://lore.kernel.org/r/20200627103138.71885-1-grandmaster@al2klimov.de Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-07-01dm: remove unused variableJens Axboe
Since merging the commit identified in Fixes below, we trigger this compile time warning: drivers/md/dm.c: In function ‘__map_bio’: drivers/md/dm.c:1296:24: warning: unused variable ‘md’ [-Wunused-variable] 1296 | struct mapped_device *md = io->md; | ^~ Remove the 'md' variable. Fixes: 5a6c35f9af41 ("block: remove direct_make_request") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: remove the bd_queue field from struct block_deviceChristoph Hellwig
Just use bd_disk->queue instead. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: remove direct_make_requestChristoph Hellwig
Now that submit_bio_noacct has a decent blk-mq fast path there is no more need for this bypass. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: rename generic_make_request to submit_bio_noacctChristoph Hellwig
generic_make_request has always been very confusingly misnamed, so rename it to submit_bio_noacct to make it clear that it is submit_bio minus accounting and a few checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: move ->make_request_fn to struct block_device_operationsChristoph Hellwig
The make_request_fn is a little weird in that it sits directly in struct request_queue instead of an operation vector. Replace it with a block_device_operations method called submit_bio (which describes much better what it does). Also remove the request_queue argument to it, as the queue can be derived pretty trivially from the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: remove the request_queue argument from blk_queue_splitChristoph Hellwig
The queue can be trivially derived from the bio, so pass one less argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01dm: stop using ->queuedataChristoph Hellwig
Instead of setting up the queuedata as well just use one private data field. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01bcache: stop setting ->queuedataChristoph Hellwig
Nothing in bcache actually uses the ->queuedata field. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-29dm: use bio_uninit instead of bio_disassociate_blkgChristoph Hellwig
bio_uninit is the proper API to clean up a BIO that has been allocated on stack or inside a structure that doesn't come from the BIO allocator. Switch dm to use that instead of bio_disassociate_blkg, which really is an implementation detail. Note that the bio_uninit calls are also moved to the two callers of __send_empty_flush, so that they better pair with the bio_init calls used to initialize them. Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-27Merge tag 'for-5.8/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Quite a few DM zoned target fixes and a Zone append fix in DM core. Considering the amount of dm-zoned changes that went in during the 5.8 merge window these fixes are not that surprising. - A few DM writecache target fixes. - A fix to Documentation index to include DM ebs target docs. - Small cleanup to use struct_size() in DM core's retrieve_deps(). * tag 'for-5.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm writecache: add cond_resched to loop in persistent_memory_claim() dm zoned: Fix reclaim zone selection dm zoned: Fix random zone reclaim selection dm: update original bio sector on Zone Append dm zoned: Fix metadata zone size check docs: device-mapper: add dm-ebs.rst to an index file dm ioctl: use struct_size() helper in retrieve_deps() dm writecache: skip writecache_wait when using pmem mode dm writecache: correct uncommitted_block when discarding uncommitted entry dm zoned: assign max_io_len correctly dm zoned: fix uninitialized pointer dereference
2020-06-24blk-mq: move failure injection out of blk_mq_complete_requestChristoph Hellwig
Move the call to blk_should_fake_timeout out of blk_mq_complete_request and into the drivers, skipping call sites that are obvious error handlers, and remove the now superflous blk_mq_force_complete_rq helper. This ensures we don't keep injecting errors into completions that just terminate the Linux request after the hardware has been reset or the command has been aborted. Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-19dm writecache: add cond_resched to loop in persistent_memory_claim()Mikulas Patocka
Add cond_resched() to a loop that fills in the mapper memory area because the loop can be executed many times. Fixes: 48debafe4f2fe ("dm: add writecache target") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>