summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2019-03-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: - a few misc things - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits) tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include proc: more robust bulk read test proc: test /proc/*/maps, smaps, smaps_rollup, statm proc: use seq_puts() everywhere proc: read kernel cpu stat pointer once proc: remove unused argument in proc_pid_lookup() fs/proc/thread_self.c: code cleanup for proc_setup_thread_self() fs/proc/self.c: code cleanup for proc_setup_self() proc: return exit code 4 for skipped tests mm,mremap: bail out earlier in mremap_to under map pressure mm/sparse: fix a bad comparison mm/memory.c: do_fault: avoid usage of stale vm_area_struct writeback: fix inode cgroup switching comment mm/huge_memory.c: fix "orig_pud" set but not used mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC mm/memcontrol.c: fix bad line in comment mm/cma.c: cma_declare_contiguous: correct err handling mm/page_ext.c: fix an imbalance with kmemleak mm/compaction: pass pgdat to too_many_isolated() instead of zone mm: remove zone_lru_lock() function, access ->lru_lock directly ...
2019-03-06Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull ARM SoC late updates from Arnd Bergmann: "Here are two branches that came relatively late during the linux-5.0 development cycle and have dependencies on the other branches: - On the TI OMAP platform, the CPSW Ethernet PHY mode selection driver is being replaced, this puts the final pieces in place - On the DaVinci platform, the interrupt handling code in arch/arm gets moved into a regular device driver in drivers/irqchip. Since they both had some time in linux-next after the 5.0-rc8 release, I'm sending them along with the other updates" * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (38 commits) net: ethernet: ti: cpsw: deprecate cpsw-phy-sel driver ARM: davinci: remove intc related fields from davinci_soc_info irqchip: davinci-cp-intc: move the driver to drivers/irqchip ARM: davinci: cp-intc: remove redundant comments ARM: davinci: cp-intc: drop GPL license boilerplate ARM: davinci: cp-intc: use readl/writel_relaxed() ARM: davinci: cp-intc: unify error handling ARM: davinci: cp-intc: improve coding style ARM: davinci: cp-intc: request the memory region before remapping it ARM: davinci: cp-intc: use the new-style config structure ARM: davinci: cp-intc: convert all hex numbers to lowercase ARM: davinci: cp-intc: use a common prefix for all symbols ARM: davinci: cp-intc: add the new config structures for da8xx SoCs irqchip: davinci-cp-intc: add a new config structure ARM: davinci: cp-intc: add a wrapper around cp_intc_init() ARM: davinci: cp-intc: remove cp_intc.h irqchip: davinci-aintc: move the driver to drivers/irqchip ARM: davinci: aintc: remove unnecessary includes ARM: davinci: aintc: remove the timer-specific irq_set_handler() ARM: davinci: aintc: request memory region before remapping it ...
2019-03-06net: hns3: Fix a logical vs bitwise typoDan Carpenter
There were a couple logical ORs accidentally mixed in with the bitwise ORs. Fixes: e8149933b1fa ("net: hns3: remove hnae3_get_bit in data path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06Merge tag 'armsoc-newsoc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM new SoC family support from Arnd Bergmann: "Two new SoC families are added this time. Sugaya Taichi submitted support for the Milbeaut SoC family from Socionext and explains: "SC2000 is a SoC of the Milbeaut series. equipped with a DSP optimized for computer vision. It also features advanced functionalities such as 360-degree, real-time spherical stitching with multi cameras, image stabilization for without mechanical gimbals, and rolling shutter correction. More detail is below: https://www.socionext.com/en/products/assp/milbeaut/SC2000.html" Interestingly, this one has a history dating back to older chips made by Socionext and previously Matsushita/Panasonic based on their own mn10300 CPU architecture that was removed from the kernel last year. Manivannan Sadhasivam adds support for another SoC family, this is the Bitmain BM1880 chip used in the Sophon Edge TPU developer board. The chip is intended for Deep Learning applications, and comes with dual-core Arm Cortex-A53 to run Linux as well as a RISC-V microcontroller core to control the tensor unit. For the moment, the TPU is not accessible in mainline Linux, so we treat it as a generic Arm SoC. More information is available at https://www.sophon.ai/" * tag 'armsoc-newsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: multi_v7_defconfig: add ARCH_MILBEAUT and ARCH_MILBEAUT_M10V ARM: configs: Add Milbeaut M10V defconfig ARM: dts: milbeaut: Add device tree set for the Milbeaut M10V board clocksource/drivers/timer-milbeaut: Introduce timer for Milbeaut SoCs dt-bindings: timer: Add Milbeaut M10V timer description ARM: milbeaut: Add basic support for Milbeaut m10v SoC dt-bindings: Add documentation for Milbeaut SoCs dt-bindings: arm: Add SMP enable-method for Milbeaut dt-bindings: sram: milbeaut: Add binding for Milbeaut smp-sram MAINTAINERS: Add entry for Bitmain SoC platform arm64: dts: bitmain: Add Sophon Egde board support arm64: dts: bitmain: Add BM1880 SoC support arm64: Add ARCH_BITMAIN platform dt-bindings: arm: Document Bitmain BM1880 SoC
2019-03-06Merge tag 'armsoc-drivers' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, the drivers/tee and drivers/reset subsystems get merged here, with the expected set of smaller updates and some new hardware support. The tee subsystem now supports device drivers to be attached to a tee, the first example here is a random number driver with its implementation in the secure world. Three new power domain drivers get added for specific chip families: - Broadcom BCM283x chips (used in Raspberry Pi) - Qualcomm Snapdragon phone chips - Xilinx ZynqMP FPGA SoCs One new driver is added to talk to the BPMP firmware on NVIDIA Tegra210 Existing drivers are extended for new SoC variants from NXP, NVIDIA, Amlogic and Qualcomm" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (113 commits) tee: optee: update optee_msg.h and optee_smc.h to dual license tee: add cancellation support to client interface dpaa2-eth: configure the cache stashing amount on a queue soc: fsl: dpio: configure cache stashing destination soc: fsl: dpio: enable frame data cache stashing per software portal soc: fsl: guts: make fsl_guts_get_svr() static hwrng: make symbol 'optee_rng_id_table' static tee: optee: Fix unsigned comparison with less than zero hwrng: Fix unsigned comparison with less than zero tee: fix possible error pointer ctx dereferencing hwrng: optee: Initialize some structs using memset instead of braces tee: optee: Initialize some structs using memset instead of braces soc: fsl: dpio: fix memory leak of a struct qbman on error exit path clk: tegra: dfll: Make symbol 'tegra210_cpu_cvb_tables' static soc: qcom: llcc-slice: Fix typos qcom: soc: llcc-slice: Consolidate some code qcom: soc: llcc-slice: Clear the global drv_data pointer on error drivers: soc: xilinx: Add ZynqMP power domain driver firmware: xilinx: Add APIs to control node status/power dt-bindings: power: Add ZynqMP power domain bindings ...
2019-03-06scsi: virtio_scsi: don't send sc payload with tmfsFelipe Franciosi
The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of device-readable records and a single device-writable response entry: struct virtio_scsi_ctrl_tmf { // Device-readable part le32 type; le32 subtype; u8 lun[8]; le64 id; // Device-writable part u8 response; } The above should be organised as two descriptor entries (or potentially more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64 id" or after "u8 response". The Linux driver doesn't respect that, with virtscsi_abort() and virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf(). It results in the original scsi command payload (or writable buffers) added to the tmf. This fixes the problem by leaving cmd->sc zeroed out, which makes virtscsi_kick_cmd() add the tmf to the control vq without any payload. Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull ARM SoC platform updates from Arnd Bergmann: "The APM X-Gene platform is now maintained by folks from Ampere computing that took over the product line a while ago, this gets reflected in the MAINTAINERS file. Cleanups continue on the older mach-davinci and mach-pxa platform, to get them to be more like the modern ones. For pxa, we now remove the Raumfeld platform code as it now works with device tree based booting. i.MX adds a couple new features for the i.MX7ULP SoC Mediatek gains support for a new SoC: MT7629 is a new wireless router platform, following MT7623. Aside from those, there are the usual minor cleanups and bugfixes across several platforms" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (49 commits) MAINTAINERS: Update Ampere email address usb: ohci-da8xx: remove unused callbacks from platform data ARM: davinci: da830-evm: remove legacy usb helpers ARM: davinci: omapl138-hawk: remove legacy usb helpers usb: ohci-da8xx: add vbus and overcurrent gpios ARM: davinci: da830-evm: use gpio lookup entries for usb gpios ARM: davinci: omapl138-hawk: use gpio lookup entries for usb gpios usb: ohci-da8xx: add a helper pointer to &pdev->dev usb: ohci-da8xx: add a new line after local variables arm64: meson: enable g12a clock controller MAINTAINERS: Add entry for uDPU board ARM: davinci: da850-evm: use GPIO hogs instead of the legacy API arm: mediatek: add MT7629 smp bring up code Revert "ARM: mediatek: add MT7623a smp bringup code" dt-bindings: soc: fix typo of MT8173 power dt-bindings ARM: meson: remove COMMON_CLK_AMLOGIC selection arm64: meson: remove COMMON_CLK_AMLOGIC selection ARM: lpc32xx: remove platform data of ARM PL111 LCD controller ARM: lpc32xx: remove platform data of ARM PL180 SD/MMC controller ARM: lpc32xx: Use kmemdup to replace duplicating its implementation ...
2019-03-06scsi: smartpqi: Reporting 'logical unit failure'Erwan Velu
When the HARDWARE_ERROR/0x3e/0x1 case is triggered, the logical volume is offlined. When reading the kernel log, the reason why the device got offlined isn't reported to the user. This situation makes it difficult for admins to root cause. Log a message when this condition occurs. [mkp: tweaked commit message] Signed-off-by: Erwan Velu <e.velu@criteo.com> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06Merge branch 'stable/for-jens-5.1' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-5.1/block-post Pull two xen blkback fixes from Konrad. * 'stable/for-jens-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront xen/blkback: add stack variable 'blkif' in connect_ring()
2019-03-06vhost: silence an unused-variable warningArnd Bergmann
On some architectures, the MMU can be disabled, leading to access_ok() becoming an empty macro that does not evaluate its size argument, which in turn produces an unused-variable warning: drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable] size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; Mark the variable as __maybe_unused to shut up that warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06virtio: hint if callbacks surprisingly might sleepCornelia Huck
A virtio transport is free to implement some of the callbacks in virtio_config_ops in a matter that they cannot be called from atomic context (e.g. virtio-ccw, which maps a lot of the callbacks to channel I/O, which is an inherently asynchronous mechanism). This can be very surprising for developers using the much more common virtio-pci transport, just to find out that things break when used on s390. The documentation for virtio_config_ops now contains a comment explaining this, but it makes sense to add a might_sleep() annotation to various wrapper functions in the virtio core to avoid surprises later. Note that annotations are NOT added to two classes of calls: - direct calls from device drivers (all current callers should be fine, however) - calls which clearly won't be made from atomic context (such as those ultimately coming in via the driver core) Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06virtio-ccw: wire up ->bus_name callbackCornelia Huck
Return the bus id of the ccw proxy device. This makes 'ethtool -i' show a more useful value than 'virtio' in the bus-info field. Acked-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06s390/virtio: handle find on invalid queue gracefullyHalil Pasic
A queue with a capacity of zero is clearly not a valid virtio queue. Some emulators report zero queue size if queried with an invalid queue index. Instead of crashing in this case let us just return -ENOENT. To make that work properly, let us fix the notifier cleanup logic as well. Cc: stable@vger.kernel.org Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06virtio_balloon: remove the unnecessary 0-initializationWei Wang
We've changed to kzalloc the vb struct, so no need to 0-initialize this field one more time. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-03-06virtio-balloon: improve update_balloon_size_funcWei Wang
There is no need to update the balloon actual register when there is no ballooning request. This patch avoids update_balloon_size when diff is 0. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06virtio-blk: Consider virtio_max_dma_size() for maximum segment sizeJoerg Roedel
Segments can't be larger than the maximum DMA mapping size supported on the platform. Take that into account when setting the maximum segment size for a block device. Cc: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06virtio: Introduce virtio_max_dma_size()Joerg Roedel
This function returns the maximum segment size for a single dma transaction of a virtio device. The possible limit comes from the SWIOTLB implementation in the Linux kernel, that has an upper limit of (currently) 256kb of contiguous memory it can map. Other DMA-API implementations might also have limits. Use the new dma_max_mapping_size() function to determine the maximum mapping size when DMA-API is in use for virtio. Cc: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Lots of tooling updates - too many to list, here's a few highlights: - Various subcommand updates to 'perf trace', 'perf report', 'perf record', 'perf annotate', 'perf script', 'perf test', etc. - CPU and NUMA topology and affinity handling improvements, - HW tracing and HW support updates: - Intel PT updates - ARM CoreSight updates - vendor HW event updates - BPF updates - Tons of infrastructure updates, both on the build system and the library support side - Documentation updates. - ... and lots of other changes, see the changelog for details. Kernel side updates: - Tighten up kprobes blacklist handling, reduce the number of places where developers can install a kprobe and hang/crash the system. - Fix/enhance vma address filter handling. - Various PMU driver updates, small fixes and additions. - refcount_t conversions - BPF updates - error code propagation enhancements - misc other changes" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits) perf script python: Add Python3 support to syscall-counts-by-pid.py perf script python: Add Python3 support to syscall-counts.py perf script python: Add Python3 support to stat-cpi.py perf script python: Add Python3 support to stackcollapse.py perf script python: Add Python3 support to sctop.py perf script python: Add Python3 support to powerpc-hcalls.py perf script python: Add Python3 support to net_dropmonitor.py perf script python: Add Python3 support to mem-phys-addr.py perf script python: Add Python3 support to failed-syscalls-by-pid.py perf script python: Add Python3 support to netdev-times.py perf tools: Add perf_exe() helper to find perf binary perf script: Handle missing fields with -F +.. perf data: Add perf_data__open_dir_data function perf data: Add perf_data__(create_dir|close_dir) functions perf data: Fail check_backup in case of error perf data: Make check_backup work over directories perf tools: Add rm_rf_perf_data function perf tools: Add pattern name checking to rm_rf perf tools: Add depth checking to rm_rf perf data: Add global path holder ...
2019-03-06Merge branch 'efi-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main EFI changes in this cycle were: - Use 32-bit alignment for efi_guid_t - Allow the SetVirtualAddressMap() call to be omitted - Implement earlycon=efifb based on existing earlyprintk code - Various minor fixes and code cleanups from Sai, Ard and me" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Fix build error due to enum collision between efi.h and ima.h efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation x86: Make ARCH_USE_MEMREMAP_PROT a generic Kconfig symbol efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted efi: Replace GPL license boilerplate with SPDX headers efi/fdt: Apply more cleanups efi: Use 32-bit alignment for efi_guid_t efi/memattr: Don't bail on zero VA if it equals the region's PA x86/efi: Mark can_free_region() as an __init function
2019-03-06dm integrity: limit the rate of error messagesMikulas Patocka
When using dm-integrity underneath md-raid, some tests with raid auto-correction trigger large amounts of integrity failures - and all these failures print an error message. These messages can bring the system to a halt if the system is using serial console. Fix this by limiting the rate of error messages - it improves the speed of raid recovery and avoids the hang. Fixes: 7eada909bfd7a ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-06s390/zcrypt: revisit ap device remove procedureHarald Freudenberger
Working with the vfio-ap driver let to some revisit of the way how an ap (queue) device is removed from the driver. With the current implementation all the cleanup was done before the driver even got notified about the removal. Now the ap queue removal is done in 3 steps: 1) A preparation step, all ap messages within the queue are flushed and so the driver does 'receive' them. Also a new state AP_STATE_REMOVE assigned to the queue makes sure there are no new messages queued in. 2) Now the driver's remove function is invoked and the driver should do the job of cleaning up it's internal administration lists or whatever. After 2) is done it is guaranteed, that the driver is not invoked any more. On the other hand the driver has to make sure that the APQN is not accessed any more after step 2 is complete. 3) Now the ap bus code does the job of total cleanup of the APQN. A reset with zero is triggered and the state of the queue goes to AP_STATE_UNBOUND. After step 3) is complete, the ap queue has no pending messages and the APQN is cleared and so there are no requests and replies lingering around in the firmware queue for this APQN. Also the interrupts are disabled. After these remove steps the ap queue device may be assigned to another driver. Stress testing this remove/probe procedure showed a problem with the correct module reference counting. The actual receive of an reply in the driver is done asynchronous with completions. So with a driver change on an ap queue the message flush triggers completions but the threads waiting for the completions may run at a time where the queue already has the new driver assigned. So the module_put() at receive time needs to be done on the driver module which queued the ap message. This change is also part of this patch. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-05agp: efficeon: no need to set PG_reserved on GATT tablesDavid Hildenbrand
Patch series "mm: PG_reserved cleanups and documentation", v2. I was recently going over all users of PG_reserved. Short story: it is difficult and sometimes not really clear if setting/checking for PG_reserved is only a relict from the past. Easy to break things. I guess I now have a pretty good idea wh things are like that nowadays and how they evolved. I had way more cleanups in this series inititally, but some architectures take PG_reserved as a way to apply a different caching strategy (for MMIO pages). So I decided to only include the most obvious changes (that are less likely to break something). So the big chunk of manual SetPageReserved users are MMIO/DMA related things on device buffers. Most notably, for device memory we will hopefully soon stop setting PG_reserved. Then the documentation has to be updated. This patch (of 9): The l1 GATT page table is kept in a special on-chip page with 64 entries. We allocate the l2 page table pages via get_zeroed_page() and enter them into the table. These l2 pages are modified accordingly when inserting/removing memory via efficeon_insert_memory and efficeon_remove_memory. Apart from that, these pages are not exposed or ioremap'ed. We can stop setting them reserved (propably copied from generic code). Link: http://lkml.kernel.org/r/20190114125903.24845-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05mm, compaction: use free lists to quickly locate a migration sourceMel Gorman
The migration scanner is a linear scan of a zone with a potentiall large search space. Furthermore, many pageblocks are unusable such as those filled with reserved pages or partially filled with pages that cannot migrate. These still get scanned in the common case of allocating a THP and the cost accumulates. The patch uses a partial search of the free lists to locate a migration source candidate that is marked as MOVABLE when allocating a THP. It prefers picking a block with a larger number of free pages already on the basis that there are fewer pages to migrate to free the entire block. The lowest PFN found during searches is tracked as the basis of the start for the linear search after the first search of the free list fails. After the search, the free list is shuffled so that the next search will not encounter the same page. If the search fails then the subsequent searches will be shorter and the linear scanner is used. If this search fails, or if the request is for a small or unmovable/reclaimable allocation then the linear scanner is still used. It is somewhat pointless to use the list search in those cases. Small free pages must be used for the search and there is no guarantee that movable pages are located within that block that are contiguous. 5.0.0-rc1 5.0.0-rc1 noboost-v3r10 findmig-v3r15 Amean fault-both-3 3771.41 ( 0.00%) 3390.40 ( 10.10%) Amean fault-both-5 5409.05 ( 0.00%) 5082.28 ( 6.04%) Amean fault-both-7 7040.74 ( 0.00%) 7012.51 ( 0.40%) Amean fault-both-12 11887.35 ( 0.00%) 11346.63 ( 4.55%) Amean fault-both-18 16718.19 ( 0.00%) 15324.19 ( 8.34%) Amean fault-both-24 21157.19 ( 0.00%) 16088.50 * 23.96%* Amean fault-both-30 21175.92 ( 0.00%) 18723.42 * 11.58%* Amean fault-both-32 21339.03 ( 0.00%) 18612.01 * 12.78%* 5.0.0-rc1 5.0.0-rc1 noboost-v3r10 findmig-v3r15 Percentage huge-3 86.50 ( 0.00%) 89.83 ( 3.85%) Percentage huge-5 92.52 ( 0.00%) 91.96 ( -0.61%) Percentage huge-7 92.44 ( 0.00%) 92.85 ( 0.44%) Percentage huge-12 92.98 ( 0.00%) 92.74 ( -0.25%) Percentage huge-18 91.70 ( 0.00%) 91.71 ( 0.02%) Percentage huge-24 91.59 ( 0.00%) 92.13 ( 0.60%) Percentage huge-30 90.14 ( 0.00%) 93.79 ( 4.04%) Percentage huge-32 90.03 ( 0.00%) 91.27 ( 1.37%) This shows an improvement in allocation latencies with similar allocation success rates. While not presented, there was a 31% reduction in migration scanning and a 8% reduction on system CPU usage. A 2-socket machine showed similar benefits. [mgorman@techsingularity.net: several fixes] Link: http://lkml.kernel.org/r/20190204120111.GL9565@techsingularity.net [vbabka@suse.cz: migrate block that was found-fast, some optimisations] Link: http://lkml.kernel.org/r/20190118175136.31341-10-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <Vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05mm: replace all open encodings for NUMA_NO_NODEAnshuman Khandual
Patch series "Replace all open encodings for NUMA_NO_NODE", v3. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. git grep "node = -1" This patch (of 2): At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe] Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx] Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband] Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05vmw_balloon: mark inflated pages PG_offlineDavid Hildenbrand
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. [david@redhat.com: use vmballoon_page_in_frames more widely] Link: http://lkml.kernel.org/r/20181122100627.5189-7-david@redhat.com Link: http://lkml.kernel.org/r/20181119101616.8901-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Nadav Amit <namit@vmware.com> Cc: Xavier Deguillard <xdeguillard@vmware.com> Cc: Nadav Amit <namit@vmware.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julien Freche <jfreche@vmware.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Hansen <chansen3@cisco.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kairui Song <kasong@redhat.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <len.brown@intel.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Pankaj gupta <pagupta@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05hv_balloon: mark inflated pages PG_offlineDavid Hildenbrand
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. Link: http://lkml.kernel.org/r/20181119101616.8901-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj gupta <pagupta@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Kairui Song <kasong@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Hansen <chansen3@cisco.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Freche <jfreche@vmware.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Nadav Amit <namit@vmware.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05xen/balloon: mark inflated pages PG_offlineDavid Hildenbrand
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. Link: http://lkml.kernel.org/r/20181119101616.8901-5-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Hansen <chansen3@cisco.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Julien Freche <jfreche@vmware.com> Cc: Kairui Song <kasong@redhat.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <len.brown@intel.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Nadav Amit <namit@vmware.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Pankaj gupta <pagupta@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05mm/page_alloc.c: memory hotplug: free pages as higher orderArun KS
When freeing pages are done with higher order, time spent on coalescing pages by buddy allocator can be reduced. With section size of 256MB, hot add latency of a single section shows improvement from 50-60 ms to less than 1 ms, hence improving the hot add latency by 60 times. Modify external providers of online callback to align with the change. [arunks@codeaurora.org: v11] Link: http://lkml.kernel.org/r/1547792588-18032-1-git-send-email-arunks@codeaurora.org [akpm@linux-foundation.org: remove unused local, per Arun] [akpm@linux-foundation.org: avoid return of void-returning __free_pages_core(), per Oscar] [akpm@linux-foundation.org: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch] [arunks@codeaurora.org: v8] Link: http://lkml.kernel.org/r/1547032395-24582-1-git-send-email-arunks@codeaurora.org [arunks@codeaurora.org: v9] Link: http://lkml.kernel.org/r/1547098543-26452-1-git-send-email-arunks@codeaurora.org Link: http://lkml.kernel.org/r/1538727006-5727-1-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mathieu Malaterre <malat@debian.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Srivatsa Vaddagiri <vatsa@codeaurora.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05Merge branch 'timers-2038-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull year 2038 updates from Thomas Gleixner: "Another round of changes to make the kernel ready for 2038. After lots of preparatory work this is the first set of syscalls which are 2038 safe: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 The syscall numbers are identical all over the architectures" * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) riscv: Use latest system call ABI checksyscalls: fix up mq_timedreceive and stat exceptions unicore32: Fix __ARCH_WANT_STAT64 definition asm-generic: Make time32 syscall numbers optional asm-generic: Drop getrlimit and setrlimit syscalls from default list 32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option compat ABI: use non-compat openat and open_by_handle_at variants y2038: add 64-bit time_t syscalls to all 32-bit architectures y2038: rename old time and utime syscalls y2038: remove struct definition redirects y2038: use time32 syscall names on 32-bit syscalls: remove obsolete __IGNORE_ macros y2038: syscalls: rename y2038 compat syscalls x86/x32: use time64 versions of sigtimedwait and recvmmsg timex: change syscalls to use struct __kernel_timex timex: use __kernel_timex internally sparc64: add custom adjtimex/clock_adjtime functions time: fix sys_timer_settime prototype time: Add struct __kernel_timex time: make adjtime compat handling available for 32 bit ...
2019-03-05PCI: Fix "try" semantics of bus and slot resetAlex Williamson
The commit referenced below introduced device locking around save and restore of state for each device during a PCI bus "try" reset, making it decidely non-"try" and prone to deadlock in the event that a device is already locked. Restore __pci_reset_bus() and __pci_reset_slot() to their advertised locking semantics by pushing the save and restore functions into the branch where the entire tree is already locked. Extend the helper function names with "_locked" and update the comment to reflect this calling requirement. Fixes: b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>
2019-03-05PCI/LINK: Report degraded links via link bandwidth notificationAlexandru Gagniuc
A warning is generated when a PCIe device is probed with a degraded link, but there was no similar mechanism to warn when the link becomes degraded after probing. The Link Bandwidth Notification provides this mechanism. Use the Link Bandwidth Management Interrupt to detect bandwidth changes, and rescan the bandwidth, looking for the weakest point. This is the same logic used in probe(). Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>
2019-03-05Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The interrupt departement delivers this time: - New infrastructure to manage NMIs on platforms which have a sane NMI delivery, i.e. identifiable NMI vectors instead of a single lump. - Simplification of the interrupt affinity management so drivers don't have to implement ugly loops around the PCI/MSI enablement. - Speedup for interrupt statistics in /proc/stat - Provide a function to retrieve the default irq domain - A new interrupt controller for the Loongson LS1X platform - Affinity support for the SiFive PLIC - Better support for the iMX irqsteer driver - NUMA aware memory allocations for GICv3 - The usual small fixes, improvements and cleanups all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) irqchip/imx-irqsteer: Add multi output interrupts support irqchip/imx-irqsteer: Change to use reg_num instead of irq_group dt-bindings: irq: imx-irqsteer: Add multi output interrupts support dt-binding: irq: imx-irqsteer: Use irq number instead of group number irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables irqdomain: Allow the default irq domain to be retrieved irqchip/sifive-plic: Implement irq_set_affinity() for SMP host irqchip/sifive-plic: Differentiate between PLIC handler and context irqchip/sifive-plic: Add warning in plic_init() if handler already present irqchip/sifive-plic: Pre-compute context hart base and enable base PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets genirq/affinity: Remove the leftovers of the original set support nvme-pci: Simplify interrupt allocation genirq/affinity: Add new callback for (re)calculating interrupt sets genirq/affinity: Store interrupt sets size in struct irq_affinity genirq/affinity: Code consolidation irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid. irqchip/i8259: Fix shutdown order by moving syscore_ops registration dt-bindings: interrupt-controller: loongson ls1x intc ...
2019-03-05ubi: wl: Silence uninitialized variable warningDan Carpenter
This condition needs to be fipped around because "err" is uninitialized when "force" is set. The Smatch static analysis tool complains and UBsan will also complain at runtime. Fixes: 663586c0a892 ("ubi: Expose the bitrot interface") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2019-03-05Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and clockevent updates from Thomas Gleixner: "The time(r) core and clockevent updates are mostly boring this time: - A new driver for the Tegra210 timer - Small fixes and improvements alll over the place - Documentation updates and cleanups" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) soc/tegra: default select TEGRA_TIMER for Tegra210 clocksource/drivers/tegra: Add Tegra210 timer support dt-bindings: timer: add Tegra210 timer clocksource/drivers/timer-cs5535: Rename the file for consistency clocksource/drivers/timer-pxa: Rename the file for consistency clocksource/drivers/tango-xtal: Rename the file for consistency dt-bindings: timer: gpt: update binding doc clocksource/drivers/exynos_mct: Remove unused header includes dt-bindings: timer: mediatek: update bindings for MT7629 SoC clocksource/drivers/exynos_mct: Fix error path in timer resources initialization clocksource/drivers/exynos_mct: Remove dead code clocksource/drivers/riscv: Add required checks during clock source init dt-bindings: timer: renesas: tmu: Document r8a774c0 bindings dt-bindings: timer: renesas, cmt: Document r8a774c0 CMT support clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable timers: Mark expected switch fall-throughs timekeeping/debug: No need to check return value of debugfs_create functions ...
2019-03-05dm snapshot: don't define direct_access if we don't support itMikulas Patocka
Don't define a direct_access function that fails, dm_dax_direct_access already fails with -EIO if the pointer is zero; Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm cache: add support for discard passdown to the origin deviceMike Snitzer
DM cache now defaults to passing discards down to the origin device. User may disable this using the "no_discard_passdown" feature when creating the cache device. If the cache's underlying origin device doesn't support discards then passdown is disabled (with warning). Similarly, if the underlying origin device's max_discard_sectors is less than a cache block discard passdown will be disabled (this is required because sizing of the cache internal discard bitset depends on it). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm writecache: fix typo in name for writeback_wqHuaisheng Ye
The workqueue's name should be "writecache-writeback" instead of "writecache-writeabck". Signed-off-by: Huaisheng Ye <yehs1@lenovo.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: add support to directly boot to a mapped deviceHelen Koike
Add a "create" module parameter, which allows device-mapper targets to be configured at boot time. This enables early use of DM targets in the boot process (as the root device or otherwise) without the need of an initramfs. The syntax used in the boot param is based on the concise format from the dmsetup tool to follow the rule of least surprise: dmsetup table --concise /dev/mapper/lroot Which is: dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+] Where, <name> ::= The device name. <uuid> ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | "" <minor> ::= The device minor number | "" <flags> ::= "ro" | "rw" <table> ::= <start_sector> <num_sectors> <target_type> <target_args> <target_type> ::= "verity" | "linear" | ... For example, the following could be added in the boot parameters: dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0 Only the targets that were tested are allowed and the ones that don't change any block device when the device is create as read-only. For example, mirror and cache targets are not allowed. The rationale behind this is that if the user makes a mistake, choosing the wrong device to be the mirror or the cache can corrupt data. The only targets initially allowed are: * crypt * delay * linear * snapshot-origin * striped * verity Co-developed-by: Will Drewry <wad@chromium.org> Co-developed-by: Kees Cook <keescook@chromium.org> Co-developed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm thin: add sanity checks to thin-pool and external snapshot creationJason Cai (Xiang Feng)
Invoking dm_get_device() twice on the same device path with different modes is dangerous. Because in that case, upgrade_mode() will alloc a new 'dm_dev' and free the old one, which may be referenced by a previous caller. Dereferencing the dangling pointer will trigger kernel NULL pointer dereference. The following two cases can reproduce this issue. Actually, they are invalid setups that must be disallowed, e.g.: 1. Creating a thin-pool with read_only mode, and the same device as both metadata and data. dmsetup create thinp --table \ "0 41943040 thin-pool /dev/vdb /dev/vdb 128 0 1 read_only" BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 ... Call Trace: new_read+0xfb/0x110 [dm_bufio] dm_bm_read_lock+0x43/0x190 [dm_persistent_data] ? kmem_cache_alloc_trace+0x15c/0x1e0 __create_persistent_data_objects+0x65/0x3e0 [dm_thin_pool] dm_pool_metadata_open+0x8c/0xf0 [dm_thin_pool] pool_ctr.cold.79+0x213/0x913 [dm_thin_pool] ? realloc_argv+0x50/0x70 [dm_mod] dm_table_add_target+0x14e/0x330 [dm_mod] table_load+0x122/0x2e0 [dm_mod] ? dev_status+0x40/0x40 [dm_mod] ctl_ioctl+0x1aa/0x3e0 [dm_mod] dm_ctl_ioctl+0xa/0x10 [dm_mod] do_vfs_ioctl+0xa2/0x600 ? handle_mm_fault+0xda/0x200 ? __do_page_fault+0x26c/0x4f0 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 2. Creating a external snapshot using the same thin-pool device. dmsetup create thinp --table \ "0 41943040 thin-pool /dev/vdc /dev/vdb 128 0 2 ignore_discard" dmsetup message /dev/mapper/thinp 0 "create_thin 0" dmsetup create snap --table \ "0 204800 thin /dev/mapper/thinp 0 /dev/mapper/thinp" BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... Call Trace: ? __alloc_pages_nodemask+0x13c/0x2e0 retrieve_status+0xa5/0x1f0 [dm_mod] ? dm_get_live_or_inactive_table.isra.7+0x20/0x20 [dm_mod] table_status+0x61/0xa0 [dm_mod] ctl_ioctl+0x1aa/0x3e0 [dm_mod] dm_ctl_ioctl+0xa/0x10 [dm_mod] do_vfs_ioctl+0xa2/0x600 ksys_ioctl+0x60/0x90 ? ksys_write+0x4f/0xb0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm block manager: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm verity fec: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm integrity: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: always call blk_queue_split() in dm_process_bio()Mike Snitzer
Do not just call blk_queue_split() if the bio is_abnormal_io(). Fixes: 568c73a355e ("dm: update dm_process_bio() to split bio if in ->make_request_fn()") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm switch: use struct_size() in kzalloc()Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void *entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: remove unused _rq_tio_cache and _rq_cacheMike Snitzer
Also move dm_rq_target_io structure definition from dm-rq.h to dm-rq.c Fixes: 6a23e05c2fe3c6 ("dm: remove legacy request-based IO path") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds
Pull MIPS updates from Paul Burton: - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB (GINVT) instructions, allowing for more efficient TLB maintenance when running on a CPU such as the I6500 that supports these. - Enable huge page support for MIPS64r6. - Optimize post-DMA cache sync by removing that code entirely for kernel configurations in which we know it won't be needed. - The number of pages allocated for interrupt stacks is now calculated correctly, where before we would wastefully allocate too much memory in some configurations. - The ath79 platform migrates to devicetree. - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board. - The ingenic/jz4740 platform gains support for appended devicetrees. - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see cleanups as do various pieces of core architecture code. * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits) MIPS: lantiq: Remove separate GPHY Firmware loader MIPS: ingenic: Add support for appended devicetree MIPS: SGI-IP27: rework HUB interrupts MIPS: SGI-IP27: do boot CPU init later MIPS: SGI-IP27: do xtalk scanning later MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output MIPS: SGI-IP27: clean up bridge access and header files MIPS: SGI-IP27: get rid of volatile and hubreg_t MIPS: irq: Allocate accurate order pages for irq stack MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys() MIPS: eBPF: Remove REG_32BIT_ZERO_EX MIPS: eBPF: Always return sign extended 32b values MIPS: CM: Fix indentation MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support MIPS: OCTEON: program rx/tx-delay always from DT MIPS: OCTEON: delete board-specific link status MIPS: OCTEON: don't lie about interface type of CN3005 board MIPS: OCTEON: warn if deprecated link status is being used MIPS: OCTEON: add fixed-link nodes to in-kernel device tree MIPS: Delete unused flush_cache_sigtramp() ...
2019-03-05Merge branch 'parisc-5.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "The most important changes in this patch set are: - DMA-related cleanups for parisc with the aim to move anything not required by drivers out of <asm/dma-mapping.h>, by Christoph Hellwig - Switch to memblock_alloc(), by Mike Rapoport - Makefile cleanups by Masahiro Yamada - Switch to bust_spinlocks(), by Sergey Senozhatsky - Improved initial SMP affinity selection for IRQs - Added IPI- and rescheduling interrupts in /proc/interrupts output" * 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits) parisc: use memblock_alloc() instead of custom get_memblock() parisc: Add constants for various PDC firmware calls parisc: Add constant for PDC_PAT_COMPLEX firmware call parisc: Show machine product number during boot parisc: Add constants for PDC_RELOCATE PDC call parisc: Add PDC_CRASH_PREP PDC function number parisc: Use F_EXTEND() macro in iosapic code parisc: remove the HBA_DATA macro parisc/lba_pci: use container_of in LBA_DEV parisc/dino: use container_of in DINO_DEV parisc: properly type the return value of parisc_walk_tree parisc: properly type the iommu field in struct pci_hba_data parisc: turn GET_IOC into an inline function parisc: move internal implementation details out of <asm/dma-mapping.h> parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h> parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile parisc: replace oops_in_progress manipulation with bust_spinlocks() parisc: Improve initial IRQ to CPU assignment parisc: Count IPI function call interrupts parisc: Show rescheduling interrupts on SMP machines only ...
2019-03-05Merge tag 's390-5.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - A copy of Arnds compat wrapper generation series - Pass information about the KVM guest to the host in form the control program code and the control program version code - Map IOV resources to support PCI physical functions on s390 - Add vector load and store alignment hints to improve performance - Use the "jdd" constraint with gcc 9 to make jump labels working again - Remove amode workaround for old z/VM releases from the DCSS code - Add support for in-kernel performance measurements using the CPU measurement counter facility - Introduce a new PMU device cpum_cf_diag to capture counters and store thenn as event raw data. - Bug fixes and cleanups * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits) Revert "s390/cpum_cf: Add kernel message exaplanations" s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y s390/suspend: fix prefix register reset in swsusp_arch_resume s390: warn about clearing als implied facilities s390: allow overriding facilities via command line s390: clean up redundant facilities list setup s390/als: remove duplicated in-place implementation of stfle s390/cio: Use cpa range elsewhere within vfio-ccw s390/cio: Fix vfio-ccw handling of recursive TICs s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation s390/cpum_cf: Add kernel message exaplanations s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace s390/cpum_cf: add ctr_stcctm() function s390/cpum_cf: move common functions into a separate file s390/cpum_cf: introduce kernel_cpumcf_avail() function s390/cpu_mf: replace stcctm5() with the stcctm() function s390/cpu_mf: add store cpu counter multiple instruction support s390/cpum_cf: Add minimal in-kernel interface for counter measurements s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts ...
2019-03-05Merge tag 'm68k-for-v5.1-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - VLA removal - gcc-8.x build fixes - small improvements and cleanups - defconfig updates * tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Add -ffreestanding to CFLAGS m68k/apollo: Fix comment in Makefile dio: Fix buffer overflow in case of unknown board m68k/defconfig: Update defconfigs for v5.0-rc1 m68k/atari: Avoid VLA use in atari_switches_setup() m68k: Avoid VLA use in mangle_kernel_stack() m68k/mac: Use '030 reset method on SE/30 m68k/mac: Remove obsolete comment m68k/mac: Skip VIA port setup unless RTC is connected m68k/mac: Clean up unused timer definitions m68k/defconfig: Drop NET_VENDOR_<FOO>=n
2019-03-05rbd: advertise support for RBD_FEATURE_DEEP_FLATTENIlya Dryomov
All copyups perform deep-copyup regardless of whether deep-flatten feature is enabled. The feature bit is used to ensure that image is written to only by new-enough clients that always perform deep-copyup. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>