summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-01block: move sync_blockdev from __blkdev_put to blkdev_putChristoph Hellwig
Do the early unlocked syncing even earlier to move more code out of the recursive path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210525061301.2242282-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: split __blkdev_getChristoph Hellwig
Split __blkdev_get into one helper for the whole device, and one for opening partitions. This removes the (bounded) recursion when opening a partition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210525061301.2242282-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: unexport blk_alloc_queueChristoph Hellwig
blk_alloc_queue is just an internal helper now, unexport it and remove it from the public header. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01null_blk: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the null_blk driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Note that the blk-mq mode is left with its own allocations scheme, to be handled later. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-26-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01xpram: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the xpram driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01dcssblk: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the dcssblk driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01ps3vram: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the ps3vram driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-23-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01n64cart: convert to blk_alloc_diskChristoph Hellwig
Convert the n64cart driver to use the blk_alloc_disk helper to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-22-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01simdisk: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the simdisk driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-21-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01nfblock: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the nfblock driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-20-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01nvme-multipath: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the nvme-multipath driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-19-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01nvdimm-pmem: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the nvdimm-pmem driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01nvdimm-btt: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the nvdimm-btt driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01nvdimm-blk: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the nvdimm-blk driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01md: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the md driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01dm: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the dm driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01bcache: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the bcache driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Coly Li <colyli@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01lightnvm: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the lightnvm driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01zram: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the zram driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01rsxx: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the rsxx driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01pktcdvd: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the pktcdvd driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01drbd: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the drbd driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01brd: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the brd driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. This also allows to remove the request_queue pointer in struct request_queue, and to simplify the initialization as blk_cleanup_disk can be called on any disk returned from blk_alloc_disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: add blk_alloc_disk and blk_cleanup_disk APIsChristoph Hellwig
Add two new APIs to allocate and free a gendisk including the request_queue for use with BIO based drivers. This is to avoid boilerplate code in drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: add a flag to make put_disk on partially initalized disks saferChristoph Hellwig
Add a flag to indicate that __device_add_disk did grab a queue reference so that disk_release only drops it if we actually had it. This sort out one of the major pitfals with partially initialized gendisk that a lot of drivers did get wrong or still do. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: automatically enable GENHD_FL_EXT_DEVTChristoph Hellwig
Automatically set the GENHD_FL_EXT_DEVT flag for all disks allocated without an explicit number of minors. This is what all new block drivers should do, so make sure it is the default without boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: move the DISK_MAX_PARTS sanity check into __device_add_diskChristoph Hellwig
Keep this together with the first place that actually looks at ->minors and prepare for not passing a minors argument to alloc_disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-01block: refactor device number setup in __device_add_diskChristoph Hellwig
Untangle the mess around blk_alloc_devt by moving the check for the used allocation scheme into the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blk-mq: Use request queue-wide tags for tagset-wide sbitmapJohn Garry
The tags used for an IO scheduler are currently per hctx. As such, when q->nr_hw_queues grows, so does the request queue total IO scheduler tag depth. This may cause problems for SCSI MQ HBAs whose total driver depth is fixed. Ming and Yanhui report higher CPU usage and lower throughput in scenarios where the fixed total driver tag depth is appreciably lower than the total scheduler tag depth: https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@huawei.com/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b In that scenario, since the scheduler tag is got first, much contention is introduced since a driver tag may not be available after we have got the sched tag. Improve this scenario by introducing request queue-wide tags for when a tagset-wide sbitmap is used. The static sched requests are still allocated per hctx, as requests are initialised per hctx, as in blk_mq_init_request(..., hctx_idx, ...) -> set->ops->init_request(.., hctx_idx, ...). For simplicity of resizing the request queue sbitmap when updating the request queue depth, just init at the max possible size, so we don't need to deal with the possibly with swapping out a new sbitmap for old if we need to grow. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1620907258-30910-3-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blk-mq: Some tag allocation code refactoringJohn Garry
The tag allocation code to alloc the sbitmap pairs is common for regular bitmaps tags and shared sbitmap, so refactor into a common function. Also remove superfluous "flags" argument from blk_mq_init_shared_sbitmap(). Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1620907258-30910-2-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blk-mq: clearing flush request reference in tags->rqs[]Ming Lei
Before we free request queue, clearing flush request reference in tags->rqs[], so that potential UAF can be avoided. Based on one patch written by David Jeffery. Tested-by: John Garry <john.garry@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210511152236.763464-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blk-mq: clear stale request in tags->rq[] before freeing one request poolMing Lei
refcount_inc_not_zero() in bt_tags_iter() still may read one freed request. Fix the issue by the following approach: 1) hold a per-tags spinlock when reading ->rqs[tag] and calling refcount_inc_not_zero in bt_tags_iter() 2) clearing stale request referred via ->rqs[tag] before freeing request pool, the per-tags spinlock is held for clearing stale ->rq[tag] So after we cleared stale requests, bt_tags_iter() won't observe freed request any more, also the clearing will wait for pending request reference. The idea of clearing ->rqs[] is borrowed from John Garry's previous patch and one recent David's patch. Tested-by: John Garry <john.garry@huawei.com> Reviewed-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210511152236.763464-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iterMing Lei
Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter(), and this way will prevent the request from being re-used when ->fn is running. The approach is same as what we do during handling timeout. Fix request use-after-free(UAF) related with completion race or queue releasing: - If one rq is referred before rq->q is frozen, then queue won't be frozen before the request is released during iteration. - If one rq is referred after rq->q is frozen, refcount_inc_not_zero() will return false, and we won't iterate over this request. However, still one request UAF not covered: refcount_inc_not_zero() may read one freed request, and it will be handled in next patch. Tested-by: John Garry <john.garry@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210511152236.763464-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block: avoid double io accounting for flush requestMing Lei
For flush request, rq->end_io() may be called two times, one is from timeout handling(blk_mq_check_expired()), another is from normal completion(__blk_mq_end_request()). Move blk_account_io_flush() after flush_rq->ref drops to zero, so io accounting can be done just once for flush request. Fixes: b68663186577 ("block: add iostat counters for flush requests") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: John Garry <john.garry@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210511152236.763464-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block: remove unneeded parenthesis from blk-sysfsMax Gurtovoy
Align to common code conventions. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210511155319.1885277-1-mgurtovoy@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24blkcg: drop CLONE_IO check in blkcg_can_attach()Tejun Heo
blkcg has always rejected to attach if any of the member tasks has shared io_context. The rationale was that io_contexts can be shared across different cgroups making it impossible to define what the appropriate control behavior should be. However, this check causes more problems than it solves: * The check prevents controller enable and migrations but not CLONE_IO itself, which can lead to surprises as the outcome changes depending on the order of operations. * Sharing within a cgroup is fine but the check can't distinguish that. This leads to unnecessary conflicts with the recent CLONE_IO usage in io_uring. io_context sharing doesn't make any difference for rq_qos based controllers and the way it's used is safe as long as tasks aren't migrated dynamically which is the vast majority of use cases. While we can try to make the check more precise to avoid false positives, the added complexity doesn't seem worthwhile. Let's just drop blkcg_can_attach(). Signed-off-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/YJrTvHbrRDbJjw+S@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24aoe: remove unnecessary mutex_init()Yang Yingliang
The mutex ktio_spawn_lock is initialized statically. It is unnecessary to initialize by mutex_init(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20210511113440.3772053-1-yangyingliang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block_dump: remove comments in docszhangyi (F)
Now block_dump feature is gone, remove all comments in docs. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210313030146.2882027-4-yi.zhang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block_dump: remove block_dump featurezhangyi (F)
We have already delete block_dump feature in mark_inode_dirty() because it can be replaced by tracepoints, now we also remove the part in submit_bio() for the same reason. The part of block dump feature in submit_bio() dump the write process, write region and sectors on the target disk into kernel message. it can be replaced by block_bio_queue tracepoint in submit_bio_checks(), so we do not need block_dump anymore, remove the whole block_dump feature. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210313030146.2882027-3-yi.zhang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-24block_dump: remove block_dump feature in mark_inode_dirty()zhangyi (F)
block_dump is an old debugging interface, one of it's functions is used to print the information about who write which file on disk. If we enable block_dump through /proc/sys/vm/block_dump and turn on debug log level, we can gather information about write process name, target file name and disk from kernel message. This feature is realized in block_dump___mark_inode_dirty(), it print above information into kernel message directly when marking inode dirty, so it is noisy and can easily trigger log storm. At the same time, get the dentry refcount is also not safe, we found it will lead to deadlock on ext4 file system with data=journal mode. After tracepoints has been introduced into the kernel, we got a tracepoint in __mark_inode_dirty(), which is a better replacement of block_dump___mark_inode_dirty(). The only downside is that it only trace the inode number and not a file name, but it probably doesn't matter because the original printed file name in block_dump is not accurate in some cases, and we can still find it through the inode number and device id. So this patch delete the dirting inode part of block_dump feature. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210313030146.2882027-2-yi.zhang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-23Linux 5.13-rc3Linus Torvalds
2021-05-23Merge tag 'perf-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Two perf fixes: - Do not check the LBR_TOS MSR when setting up unrelated LBR MSRs as this can cause malfunction when TOS is not supported - Allocate the LBR XSAVE buffers along with the DS buffers upfront because allocating them when adding an event can deadlock" * tag 'perf-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/lbr: Remove cpuc->lbr_xsave allocation from atomic context perf/x86: Avoid touching LBR_TOS MSR for Arch LBR
2021-05-23Merge tag 'locking-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two locking fixes: - Invoke the lockdep tracepoints in the correct place so the ordering is correct again - Don't leave the mutex WAITER bit stale when the last waiter is dropping out early due to a signal as that forces all subsequent lock operations needlessly into the slowpath until it's cleaned up again" * tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal locking/lockdep: Correct calling tracepoints
2021-05-23Merge tag 'irq-urgent-2021-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A few fixes for irqchip drivers: - Allocate interrupt descriptors correctly on Mainstone PXA when SPARSE_IRQ is enabled; otherwise the interrupt association fails - Make the APPLE AIC chip driver depend on APPLE - Remove redundant error output on devm_ioremap_resource() failure" * tag 'irq-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Remove redundant error printing irqchip/apple-aic: APPLE_AIC should depend on ARCH_APPLE ARM: PXA: Fix cplds irqdesc allocation when using legacy mode
2021-05-23Merge tag 'x86_urgent_for_v5.13_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix how SEV handles MMIO accesses by forwarding potential page faults instead of killing the machine and by using the accessors with the exact functionality needed when accessing memory. - Fix a confusion with Clang LTO compiler switches passed to the it - Handle the case gracefully when VMGEXIT has been executed in userspace * tag 'x86_urgent_for_v5.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev-es: Use __put_user()/__get_user() for data accesses x86/sev-es: Forward page-faults which happen during emulation x86/sev-es: Don't return NULL from sev_es_get_ghcb() x86/build: Fix location of '-plugin-opt=' flags x86/sev-es: Invalidate the GHCB after completing VMGEXIT x86/sev-es: Move sev_es_put_ghcb() in prep for follow on patch
2021-05-23Merge tag 'powerpc-5.13-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix breakage of strace (and other ptracers etc.) when using the new scv ABI (Power9 or later with glibc >= 2.33). - Fix early_ioremap() on 64-bit, which broke booting on some machines. Thanks to Dmitry V. Levin, Nicholas Piggin, Alexey Kardashevskiy, and Christophe Leroy. * tag 'powerpc-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls powerpc: Fix early setup to make early_ioremap() work
2021-05-22Merge tag 'kbuild-fixes-v5.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix short log indentation for tools builds - Fix dummy-tools to adjust to the latest stackprotector check * tag 'kbuild-fixes-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: dummy-tools: adjust to stricter stackprotector check scripts/jobserver-exec: Fix a typo ("envirnoment") tools build: Fix quiet cmd indentation
2021-05-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "10 patches. Subsystems affected by this patch series: mm (pagealloc, gup, kasan, and userfaultfd), ipc, selftests, watchdog, bitmap, procfs, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: userfaultfd: hugetlbfs: fix new flag usage in error path lib: kunit: suppress a compilation warning of frame size proc: remove Alexey from MAINTAINERS linux/bits.h: fix compilation error with GENMASK watchdog: reliable handling of timestamps kasan: slab: always reset the tag in get_freepointer_safe() tools/testing/selftests/exec: fix link error ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry Revert "mm/gup: check page posion status for coredump." mm/shuffle: fix section mismatch warning
2021-05-22userfaultfd: hugetlbfs: fix new flag usage in error pathMike Kravetz
In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags") the use of PagePrivate to indicate a reservation count should be restored at free time was changed to the hugetlb specific flag HPageRestoreReserve. Changes to a userfaultfd error path as well as a VM_BUG_ON() in remove_inode_hugepages() were overlooked. Users could see incorrect hugetlb reserve counts if they experience an error with a UFFDIO_COPY operation. Specifically, this would be the result of an unlikely copy_huge_page_from_user error. There is not an increased chance of hitting the VM_BUG_ON. Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mina Almasry <almasry.mina@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Mina Almasry <almasrymina@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-22lib: kunit: suppress a compilation warning of frame sizeZhen Lei
lib/bitfield_kunit.c: In function `test_bitfields_constants': lib/bitfield_kunit.c:93:1: warning: the frame size of 7456 bytes is larger than 2048 bytes [-Wframe-larger-than=] } ^ As the description of BITFIELD_KUNIT in lib/Kconfig.debug, it "Only useful for kernel devs running the KUnit test harness, and not intended for inclusion into a production build". Therefore, it is not worth modifying variable 'test_bitfields_constants' to clear this warning. Just suppress it. Link: https://lkml.kernel.org/r/20210518094533.7652-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Vitor Massaru Iha <vitor@massaru.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>