summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-09-07mmc: core: Improve documentation of MMC_CAP_HW_RESETWolfram Sang
MMC_CAP_HW_RESET means that the controller is capable of resetting the eMMC device via RST_n signal, not a reset of the controller. Two drivers got this wrong, so let's make it more clear. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20200821063533.3771-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-07mmc: sdio: Parse CISTPL_VERS_1 major and minor revision numbersPali Rohár
They should indicate compliance of standard. Signed-off-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20200727133837.19086-3-pali@kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-06Merge tag 'x86-urgent-2020-09-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - more generic entry code ABI fallout - debug register handling bugfixes - fix vmalloc mappings on 32-bit kernels - kprobes instrumentation output fix on 32-bit kernels - fix over-eager WARN_ON_ONCE() on !SMAP hardware - NUMA debugging fix - fix Clang related crash on !RETPOLINE kernels * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Unbreak 32bit fast syscall x86/debug: Allow a single level of #DB recursion x86/entry: Fix AC assertion tracing/kprobes, x86/ptrace: Fix regs argument order for i386 x86, fakenuma: Fix invalid starting node ID x86/mm/32: Bring back vmalloc faulting on x86_32 x86/cmdline: Disable jump tables for cmdline.c
2020-09-06Merge tag 'for-linus-5.9-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A small series for fixing a problem with Xen PVH guests when running as backends (e.g. as dom0). Mapping other guests' memory is now working via ZONE_DEVICE, thus not requiring to abuse the memory hotplug functionality for that purpose" * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: add helpers to allocate unpopulated memory memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC xen/balloon: add header guard
2020-09-05Merge tags 'auxdisplay-for-linus-v5.9-rc4', ↵Linus Torvalds
'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux Pull misc fixes from Miguel Ojeda: "A trivial patch for auxdisplay: - Replace HTTP links with HTTPS ones (Alexander A. Klimov) The usual clang-format trivial update: - Update with the latest for_each macro list (Miguel Ojeda) And Luc requested me to pick a sparse fix on my queue, so here it goes along with other two trivial Compiler Attributes ones (also from Luc). - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van Oostenryck) - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van Oostenryck) - Compiler Attributes: remove comment about sparse not supporting __has_attribute (Luc Van Oostenryck)" * tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: auxdisplay: Replace HTTP links with HTTPS ones * tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: clang-format: Update with the latest for_each macro list * tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: sparse: use static inline for __chk_{user,io}_ptr() Compiler Attributes: fix comment concerning GCC 4.6 Compiler Attributes: remove comment about sparse not supporting __has_attribute
2020-09-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "19 patches. Subsystems affected by this patch series: MAINTAINERS, ipc, fork, checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration, hugetlb)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/linux/log2.h: add missing () around n in roundup_pow_of_two() mm/khugepaged.c: fix khugepaged's request size in collapse_file mm/hugetlb: fix a race between hugetlb sysctl handlers mm/hugetlb: try preferred node first when alloc gigantic page from cma mm/migrate: preserve soft dirty in remove_migration_pte() mm/migrate: remove unnecessary is_zone_device_page() check mm/rmap: fixup copying of soft dirty and uffd ptes mm/migrate: fixup setting UFFD_WP flag mm: madvise: fix vma user-after-free checkpatch: fix the usage of capture group ( ... ) fork: adjust sysctl_max_threads definition to match prototype ipc: adjust proc_ipc_sem_dointvec definition to match prototype mm: track page table modifications in __apply_to_page_range() MAINTAINERS: IA64: mark Status as Odd Fixes only MAINTAINERS: add LLVM maintainers MAINTAINERS: update Cavium/Marvell entries mm: slub: fix conversion of freelist_corrupted() mm: memcg: fix memcg reclaim soft lockup memcg: fix use-after-free in uncharge_batch
2020-09-05include/linux/log2.h: add missing () around n in roundup_pow_of_two()Jason Gunthorpe
Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c170945 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-04Merge tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull libata fixes from Jens Axboe: - improve Sandisks ATA_HORKAGE on NCQ (Tejun) - link printk cleanup (Xu) * tag 'libata-5.9-2020-09-04' of git://git.kernel.dk/linux-block: libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks ata: ahci: use ata_link_info() instead of ata_link_printk()
2020-09-04Merge tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A bit larger than usual this week, mostly due to the NVMe fixes arriving late for -rc3 and hence didn't make last weeks pull request. - NVMe: - instance leak and io boundary fixes from Keith - fc locking fix from Christophe - various tcp/rdma reset during traffic fixes from Sagi - pci use-after-free fix from Tong - tcp target null deref fix from Ziye - Locking fix for partition removal (Christoph) - Ensure bdi->io_pages is always set (me) - Fixup for hd struct reference (Ming) - Fix for zero length bvecs (Ming) - Two small blk-iocost fixes (Tejun)" * tag 'block-5.9-2020-09-04' of git://git.kernel.dk/linux-block: block: allow for_each_bvec to support zero len bvec blk-stat: make q->stats->lock irqsafe blk-iocost: ioc_pd_free() shouldn't assume irq disabled block: fix locking in bdev_del_partition block: release disk reference in hd_struct_free_work block: ensure bdi->io_pages is always initialized nvme-pci: cancel nvme device request before disabling nvme: only use power of two io boundaries nvme: fix controller instance leak nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()' nvme: Fix NULL dereference for pci nvme controllers nvme-rdma: fix reset hang if controller died in the middle of a reset nvme-rdma: fix timeout handler nvme-rdma: serialize controller teardown sequences nvme-tcp: fix reset hang if controller died in the middle of a reset nvme-tcp: fix timeout handler nvme-tcp: serialize controller teardown sequences nvme: have nvme_wait_freeze_timeout return if it timed out nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
2020-09-04Merge branch 'simplify-do_wp_page'Linus Torvalds
Merge emailed patches from Peter Xu: "This is a small series that I picked up from Linus's suggestion to simplify cow handling (and also make it more strict) by checking against page refcounts rather than mapcounts. This makes uffd-wp work again (verified by running upmapsort)" Note: this is horrendously bad timing, and making this kind of fundamental vm change after -rc3 is not at all how things should work. The saving grace is that it really is a a nice simplification: 8 files changed, 29 insertions(+), 120 deletions(-) The reason for the bad timing is that it turns out that commit 17839856fd58 ("gup: document and work around 'COW can break either way' issue" broke not just UFFD functionality (as Peter noticed), but Mikulas Patocka also reports that it caused issues for strace when running in a DAX environment with ext4 on a persistent memory setup. And we can't just revert that commit without re-introducing the original issue that is a potential security hole, so making COW stricter (and in the process much simpler) is a step to then undoing the forced COW that broke other uses. Link: https://lore.kernel.org/lkml/alpine.LRH.2.02.2009031328040.6929@file01.intranet.prod.int.rdu2.redhat.com/ * emailed patches from Peter Xu <peterx@redhat.com>: mm: Add PGREUSE counter mm/gup: Remove enfornced COW mechanism mm/ksm: Remove reuse_ksm_page() mm: do_wp_page() simplification
2020-09-04mm: Add PGREUSE counterPeter Xu
This accounts for wp_page_reuse() case, where we reused a page for COW. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-04mm/ksm: Remove reuse_ksm_page()Peter Xu
Remove the function as the last reference has gone away with the do_wp_page() changes. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-04dyndbg: refine export, rename to dynamic_debug_exec_queries()Jim Cromie
commit 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") had a few problems: - broken non DYNAMIC_DEBUG_CORE configs, sparse warning - the exported function modifies query string, breaks on RO strings. - func name follows internal convention, shouldn't be exposed as is. 1st is fixed in header with ifdefd function prototype or stub defn. Also remove an obsolete HAVE-symbol ifdef-comment, and add others. Fix others by wrapping existing internal function with a new one, named in accordance with module-prefix naming convention, before export hits v5.9.0. In new function, copy query string to a local buffer, so users can pass hard-coded/RO queries, and internal function can be used unchanged. Fixes: 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20200831182210.850852-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-04x86/entry: Unbreak 32bit fast syscallThomas Gleixner
Andy reported that the syscall treacing for 32bit fast syscall fails: # ./tools/testing/selftests/x86/ptrace_syscall_32 ... [RUN] SYSEMU [FAIL] Initial args are wrong (nr=224, args=10 11 12 13 14 4289172732) ... [RUN] SYSCALL [FAIL] Initial args are wrong (nr=29, args=0 0 0 0 0 4289172732) The eason is that the conversion to generic entry code moved the retrieval of the sixth argument (EBP) after the point where the syscall entry work runs, i.e. ptrace, seccomp, audit... Unbreak it by providing a split up version of syscall_enter_from_user_mode(). - syscall_enter_from_user_mode_prepare() establishes state and enables interrupts - syscall_enter_from_user_mode_work() runs the entry work Replace the call to syscall_enter_from_user_mode() in the 32bit fast syscall C-entry with the split functions and stick the EBP retrieval between them. Fixes: 27d6b4d14f5c ("x86/entry: Use generic syscall entry function") Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87k0xdjbtt.fsf@nanos.tec.linutronix.de
2020-09-04mm: Add arch hooks for saving/restoring tagsSteven Price
Arm's Memory Tagging Extension (MTE) adds some metadata (tags) to every physical page, when swapping pages out to disk it is necessary to save these tags, and later restore them when reading the pages back. Add some hooks along with dummy implementations to enable the arch code to handle this. Three new hooks are added to the swap code: * arch_prepare_to_swap() and * arch_swap_invalidate_page() / arch_swap_invalidate_area(). One new hook is added to shmem: * arch_swap_restore() Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: add unlock_page() on the error path] [catalin.marinas@arm.com: dropped the _tags suffix] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-09-04mm: Introduce arch_validate_flags()Catalin Marinas
Similarly to arch_validate_prot() called from do_mprotect_pkey(), an architecture may need to sanity-check the new vm_flags. Define a dummy function always returning true. In addition to do_mprotect_pkey(), also invoke it from mmap_region() prior to updating vma->vm_page_prot to allow the architecture code to veto potentially inconsistent vm_flags. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-09-04arm64: mte: Add PROT_MTE support to mmap() and mprotect()Catalin Marinas
To enable tagging on a memory range, the user must explicitly opt in via a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new memory type in the AttrIndx field of a pte, simplify the or'ing of these bits over the protection_map[] attributes by making MT_NORMAL index 0. There are two conditions for arch_vm_get_page_prot() to return the MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE, registered as VM_MTE in the vm_flags, and (2) the vma supports MTE, decided during the mmap() call (only) and registered as VM_MTE_ALLOWED. arch_calc_vm_prot_bits() is responsible for registering the user request as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its mmap() file ops call. In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on stack or brk area. The Linux mmap() syscall currently ignores unknown PROT_* flags. In the presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE will not report an error and the memory will not be mapped as Normal Tagged. For consistency, mprotect(PROT_MTE) will not report an error either if the memory range does not support MTE. Two subsequent patches in the series will propose tightening of this behaviour. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04mm: Introduce arch_calc_vm_flag_bits()Kevin Brodsky
Similarly to arch_calc_vm_prot_bits(), introduce a dummy arch_calc_vm_flag_bits() invoked from calc_vm_flag_bits(). This macro can be overridden by architectures to insert specific VM_* flags derived from the mmap() MAP_* flags. Signed-off-by: Kevin Brodsky <Kevin.Brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04mm: Add PG_arch_2 page flagSteven Price
For arm64 MTE support it is necessary to be able to mark pages that contain user space visible tags that will need to be saved/restored e.g. when swapped out. To support this add a new arch specific flag (PG_arch_2). This flag is only available on 64-bit architectures due to the limited number of spare page flags on the 32-bit ones. Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERICRoger Pau Monne
This is in preparation for the logic behind MEMORY_DEVICE_DEVDAX also being used by non DAX devices. No functional change intended. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: https://lore.kernel.org/r/20200901083326.21264-3-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi Kivilinna. 2) Fix loss of RTT samples in rxrpc, from David Howells. 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu. 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka. 5) We disable BH for too lokng in sctp_get_port_local(), add a cond_resched() here as well, from Xin Long. 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu. 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from Yonghong Song. 8) Missing of_node_put() in mt7530 DSA driver, from Sumera Priyadarsini. 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan. 10) Fix geneve tunnel checksumming bug in hns3, from Yi Li. 11) Memory leak in rxkad_verify_response, from Dinghao Liu. 12) In tipc, don't use smp_processor_id() in preemptible context. From Tuong Lien. 13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu. 14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter. 15) Fix ABI mismatch between driver and firmware in nfp, from Louis Peens. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits) net/smc: fix sock refcounting in case of termination net/smc: reset sndbuf_desc if freed net/smc: set rx_off for SMCR explicitly net/smc: fix toleration of fake add_link messages tg3: Fix soft lockup when tg3_reset_task() fails. doc: net: dsa: Fix typo in config code sample net: dp83867: Fix WoL SecureOn password nfp: flower: fix ABI mismatch between driver and firmware tipc: fix shutdown() of connectionless socket ipv6: Fix sysctl max for fib_multipath_hash_policy drivers/net/wan/hdlc: Change the default of hard_header_len to 0 net: gemini: Fix another missing clk_disable_unprepare() in probe net: bcmgenet: fix mask check in bcmgenet_validate_flow() amd-xgbe: Add support for new port mode net: usb: dm9601: Add USB ID of Keenetic Plus DSL vhost: fix typo in error message net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() pktgen: fix error message with wrong function name net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode cxgb4: fix thermal zone device registration ...
2020-09-03blk-mq, elevator: Count requests per hctx to improve performanceKashyap Desai
High CPU utilization on "native_queued_spin_lock_slowpath" due to lock contention is possible for mq-deadline and bfq IO schedulers when nr_hw_queues is more than one. It is because kblockd work queue can submit IO from all online CPUs (through blk_mq_run_hw_queues()) even though only one hctx has pending commands. The elevator callback .has_work for mq-deadline and bfq scheduler considers pending work if there are any IOs on request queue but it does not account hctx context. Add a per-hctx 'elevator_queued' count to the hctx to avoid triggering the elevator even though there are no requests queued. [jpg: Relocated atomic_dec() in dd_dispatch_request(), update commit message per Kashyap] Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Record active_queues_shared_sbitmap per tag_set for when using ↵John Garry
shared sbitmap For when using a shared sbitmap, no longer should the number of active request queues per hctx be relied on for when judging how to share the tag bitmap. Instead maintain the number of active request queues per tag_set, and make the judgement based on that. Originally-from: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Record nr_active_requests per queue for when using shared sbitmapJohn Garry
The per-hctx nr_active value can no longer be used to fairly assign a share of tag depth per request queue for when using a shared sbitmap, as it does not consider that the tags are shared tags over all hctx's. For this case, record the nr_active_requests per request_queue, and make the judgement based on that value. Co-developed-with: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Facilitate a shared sbitmap per tagsetJohn Garry
Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support multiple reply queues with single hostwide tags. In addition, these drivers want to use interrupt assignment in pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0], CPU hotplug may cause in-flight IO completion to not be serviced when an interrupt is shutdown. That problem is solved in commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline"). However, to take advantage of that blk-mq feature, the HBA HW queuess are required to be mapped to that of the blk-mq hctx's; to do that, the HBA HW queues need to be exposed to the upper layer. In making that transition, the per-SCSI command request tags are no longer unique per Scsi host - they are just unique per hctx. As such, the HBA LLDD would have to generate this tag internally, which has a certain performance overhead. However another problem is that blk-mq assumes the host may accept (Scsi_host.can_queue * #hw queue) commands. In commit 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq"), the Scsi host busy counter was removed, which would stop the LLDD being sent more than .can_queue commands; however, it should still be ensured that the block layer does not issue more than .can_queue commands to the Scsi host. To solve this problem, introduce a shared sbitmap per blk_mq_tag_set, which may be requested at init time. New flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the tagset to indicate whether the shared sbitmap should be used. Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and requests are still allocated per hctx; the reason for this is that if tags and requests were only allocated for a single hctx - like hctx0 - it may break block drivers which expect a request be associated with a specific hctx, i.e. not always hctx0. This will introduce extra memory usage. This change is based on work originally from Ming Lei in [1] and from Bart's suggestion in [2]. [0] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/ [1] https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@redhat.com/ [2] https://lore.kernel.org/linux-block/ff77beff-5fd9-9f05-12b6-826922bace1f@huawei.com/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHAREDMing Lei
BLK_MQ_F_TAG_SHARED actually means that tags is shared among request queues, all of which should belong to LUNs attached to same HBA. So rename it to make the point explicitly. [jpg: rebase a few times, add rnbd-clt.c change] Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: allow for_each_bvec to support zero len bvecMing Lei
Block layer usually doesn't support or allow zero-length bvec. Since commit 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()"), iterate_bvec() switches to bvec iterator. However, Al mentioned that 'Zero-length segments are not disallowed' in iov_iter. Fixes for_each_bvec() so that it can move on after seeing one zero length bvec. Fixes: 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()") Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.html Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - data sanitization and validtion fixes for report descriptor parser from Marc Zyngier - memory leak fix for hid-elan driver from Dinghao Liu - two device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: core: Sanitize event code and type when mapping input HID: core: Correctly handle ReportSize being zero HID: elan: Fix memleak in elan_input_configured HID: microsoft: Add rumble support for the 8bitdo SN30 Pro+ controller HID: quirks: Set INCREMENT_USAGE_ON_DUPLICATE for all Saitek X52 devices
2020-09-02regmap: Add can_sleep configuration optionDmitry Osipenko
Regmap can't sleep if spinlock is used for the locking protection. This patch fixes regression caused by a previous commit that switched regmap to use fsleep() and this broke Amlogic S922X platform. This patch adds new configuration option for regmap users, allowing to specify whether regmap operations can sleep and assuming that sleep is allowed if mutex is used for the regmap locking protection. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 2b32d2f7ce0a ("regmap: Use flexible sleep") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20200902141843.6591-1-digetx@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-02libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to SandisksTejun Heo
All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: remove revalidate_disk()Christoph Hellwig
Remove the now unused helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: use revalidate_disk_size in set_capacity_revalidate_and_notifyChristoph Hellwig
Only virtio_blk and xen-blkfront set the revalidate argument to true, and both do not implement the ->revalidate_disk method. So switch to the helper that just updates the size instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: add a new revalidate_disk_size helperChristoph Hellwig
revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02block: rename bd_invalidatedChristoph Hellwig
Replace bd_invalidate with a new BDEV_NEED_PART_SCAN flag in a bd_flags variable to better describe the condition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove an outdated comment on the bd_dev fieldChristoph Hellwig
kdev_t is long gone, so we don't need to comment a field isn't one.. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the discard_alignment field from struct hd_structChristoph Hellwig
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the alignment_offset field from struct hd_structChristoph Hellwig
The alignment offset is only used in slow path callers, so just calculate it on the fly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: Move blk_mq_bio_list_merge() into blk-merge.cBaolin Wang
Move the blk_mq_bio_list_merge() into blk-merge.c and rename it as a generic name. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the BIO_USER_MAPPED flagChristoph Hellwig
Just check if there is private data, in which case the bio must have originated from bio_copy_user_iov. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: remove the BIO_NULL_MAPPED flagChristoph Hellwig
We can simply use a boolean flag in the bio_map_data data structure instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: fix locking for struct block_device size updatesChristoph Hellwig
Two different callers use two different mutexes for updating the block device size, which obviously doesn't help to actually protect against concurrent updates from the different callers. In addition one of the locks, bd_mutex is rather prone to deadlocks with other parts of the block stack that use it for high level synchronization. Switch to using a new spinlock protecting just the size updates, as that is all we need, and make sure everyone does the update through the proper helper. This fixes a bug reported with the nvme revalidating disks during a hot removal operation, which can currently deadlock on bd_mutex. Reported-by: Xianting Tian <xianting_tian@126.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: replace bd_set_size with bd_set_nr_sectorsChristoph Hellwig
Replace bd_set_size with a version that takes the number of sectors instead, as that fits most of the current and future callers much better. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01block: Make request_queue.rpm_status an enumGeert Uytterhoeven
request_queue.rpm_status is assigned values of the rpm_status enum only, so reflect that in its type. Note that including <linux/pm.h> is (currently) a no-op, as it is already included through <linux/genhd.h> and <linux/device.h>, but it is better to play it safe. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01HID: core: Sanitize event code and type when mapping inputMarc Zyngier
When calling into hid_map_usage(), the passed event code is blindly stored as is, even if it doesn't fit in the associated bitmap. This event code can come from a variety of sources, including devices masquerading as input devices, only a bit more "programmable". Instead of taking the event code at face value, check that it actually fits the corresponding bitmap, and if it doesn't: - spit out a warning so that we know which device is acting up - NULLify the bitmap pointer so that we catch unexpected uses Code paths that can make use of untrusted inputs can now check that the mapping was indeed correct and bail out if not. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
2020-09-01tracepoint: Optimize using static_call()Steven Rostedt (VMware)
Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
2020-09-01static_call: Allow early initPeter Zijlstra
In order to use static_call() to wire up x86_pmu, we need to initialize earlier, specifically before memory allocation works; copy some of the tricks from jump_label to enable this. Primarily we overload key->next to store a sites pointer when there are no modules, this avoids having to use kmalloc() to initialize the sites and allows us to run much earlier. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org
2020-09-01static_call: Handle tail-callsPeter Zijlstra
GCC can turn our static_call(name)(args...) into a tail call, in which case we get a JMP.d32 into the trampoline (which then does a further tail-call). Teach objtool to recognise and mark these in .static_call_sites and adjust the code patching to deal with this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.101186767@infradead.org
2020-09-01static_call: Add static_call_cond()Peter Zijlstra
Extend the static_call infrastructure to optimize the following common pattern: if (func_ptr) func_ptr(args...) For the trampoline (which is in effect a tail-call), we patch the JMP.d32 into a RET, which then directly consumes the trampoline call. For the in-line sites we replace the CALL with a NOP5. NOTE: this is 'obviously' limited to functions with a 'void' return type. NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed to a full function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org
2020-09-01static_call: Avoid kprobes on inline static_call()sPeter Zijlstra
Similar to how we disallow kprobes on any other dynamic text (ftrace/jump_label) also disallow kprobes on inline static_call()s. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200818135804.744920586@infradead.org
2020-09-01static_call: Add inline static call infrastructureJosh Poimboeuf
Add infrastructure for an arch-specific CONFIG_HAVE_STATIC_CALL_INLINE option, which is a faster version of CONFIG_HAVE_STATIC_CALL. At runtime, the static call sites are patched directly, rather than using the out-of-line trampolines. Compared to out-of-line static calls, the performance benefits are more modest, but still measurable. Steven Rostedt did some tracepoint measurements: https://lkml.kernel.org/r/20181126155405.72b4f718@gandalf.local.home This code is heavily inspired by the jump label code (aka "static jumps"), as some of the concepts are very similar. For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface; merged trampolines] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.684334440@infradead.org