summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-04-20Merge tag 'i3c/fixes-for-5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux Pill i3c fixes from Boris Brezillon: - fix the random PID check - fix the disable controller logic in the designware driver - fix I3C entry in MAINTAINERS * tag 'i3c/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: MAINTAINERS: Fix the I3C entry i3c: dw: Fix dw_i3c_master_disable controller by using correct mask i3c: Fix the verification of random PID
2019-04-20Merge tag 'sound-5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Two core fixes for long-standing bugs for the races at concurrent device creation and deletion that were (unsurprisingly) spotted by syzkaller with usb-fuzzer. The rest are usual small HD-audio fixes" * tag 'sound-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - add two more pin configuration sets to quirk table ALSA: core: Fix card races between register and disconnect ALSA: info: Fix racy addition/deletion of nodes ALSA: hda: Initialize power_state field properly
2019-04-20Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Misc clocksource driver fixes, and a sched-clock wrapping fix" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze() clocksource/drivers/timer-ti-dm: Remove omap_dm_timer_set_load_start clocksource/drivers/oxnas: Fix OX820 compatible clocksource/drivers/arm_arch_timer: Remove unneeded pr_fmt macro clocksource/drivers/npcm: select TIMER_OF
2019-04-20Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - various tooling fixes - kretprobe fixes - kprobes annotation fixes - kprobes error checking fix - fix the default events for AMD Family 17h CPUs - PEBS fix - AUX record fix - address filtering fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Avoid kretprobe recursion bug kprobes: Mark ftrace mcount handler functions nokprobe x86/kprobes: Verify stack frame on kretprobe perf/x86/amd: Add event map for AMD Family 17h perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf() perf tools: Fix map reference counting perf evlist: Fix side band thread draining perf tools: Check maps for bpf programs perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info() tools include uapi: Sync sound/asound.h copy perf top: Always sample time to satisfy needs of use of ordered queuing perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user) tools lib traceevent: Fix missing equality check for strcmp perf stat: Disable DIR_FORMAT feature for 'perf stat record' perf scripts python: export-to-sqlite.py: Fix use of parent_id in calls_view perf header: Fix lock/unlock imbalances when processing BPF/BTF info perf/x86: Fix incorrect PEBS_REGS perf/ring_buffer: Fix AUX record suppression perf/core: Fix the address filtering fix kprobes: Fix error check when reusing optimized probes
2019-04-20Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes all over the place: a console spam fix, section attributes fixes, a KASLR fix, a TLB stack-variable alignment fix, a reboot quirk, boot options related warnings fix, an LTO fix, a deadlock fix and an RDT fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority x86/cpu/bugs: Use __initconst for 'const' init data x86/mm/KASLR: Fix the size of the direct mapping section x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness" x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info" x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T x86/mm: Prevent bogus warnings with "noexec=off" x86/build/lto: Fix truncated .bss with -fdata-sections x86/speculation: Prevent deadlock on ssb_state::lock x86/resctrl: Do not repeat rdtgroup mode initialization
2019-04-20Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A deadline scheduler warning/race fix, and a cfs_period_us quota calculation workaround where the real fix looks too involved to merge immediately" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Correctly handle active 0-lag timers sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
2019-04-20Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "A lockdep warning fix and a script execution fix when atomics are generated" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomics: Don't assume that scripts are executable locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again
2019-04-19Merge branch 'for-5.1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "A patch to fix a RCU imbalance error in the devices cgroup configuration error path" * 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: device_cgroup: fix RCU imbalance in error case
2019-04-19Merge branch 'for-5.1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu Pull percpu fixlet from Dennis Zhou: "This stops printing the base address of percpu memory on initialization" * 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: percpu: stop printing kernel addresses
2019-04-19Merge tag 'tty-5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are five small fixes for some tty/serial/vt issues that have been reported. The vt one has been around for a while, it is good to finally get that resolved. The others fix a build warning that showed up in 5.1-rc1, and resolve a problem in the sh-sci driver. Note, the second patch for build warning fix for the sc16is7xx driver was just applied to the tree, as it resolves a problem with the previous patch to try to solve the issue. It has not shown up in linux-next yet, unlike all of the other patches, but it has passed 0-day testing and everyone seems to agree that it is correct" * tag 'tty-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: sc16is7xx: put err_spi and err_i2c into correct #ifdef vt: fix cursor when clearing the screen sc16is7xx: move label 'err_spi' to correct section serial: sh-sci: Fix HSCIF RX sampling point adjustment serial: sh-sci: Fix HSCIF RX sampling point calculation
2019-04-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt
2019-04-19Merge tag 'staging-5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO fixes from Greg KH: "Here is a bunch of IIO driver fixes, and some smaller staging driver fixes, for 5.1-rc6. The IIO fixes were delayed due to my vacation, but all resolve a number of reported issues and have been in linux-next for a few weeks with no reported issues. The other staging driver fixes are all tiny, resolving some reported issues in the comedi and most drivers, as well as some erofs fixes. All of these patches have been in linux-next with no reported issues" * tag 'staging-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (24 commits) staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf staging: comedi: ni_usb6501: Fix use of uninitialized mutex staging: erofs: fix unexpected out-of-bound data access staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf staging: comedi: vmk80xx: Fix use of uninitialized semaphore staging: most: core: use device description as name iio: core: fix a possible circular locking dependency iio: ad_sigma_delta: select channel when reading register iio: pms7003: select IIO_TRIGGERED_BUFFER iio: cros_ec: Fix the maths for gyro scale calculation iio: adc: xilinx: prevent touching unclocked h/w on remove iio: adc: xilinx: fix potential use-after-free on probe iio: adc: xilinx: fix potential use-after-free on remove iio: dac: mcp4725: add missing powerdown bits in store eeprom io: accel: kxcjk1013: restore the range after resume. iio:chemical:bme680: Fix SPI read interface iio:chemical:bme680: Fix, report temperature in millidegrees iio: chemical: fix missing Kconfig block for sgp30 iio: adc: at91: disable adc channel interrupt in timeout case iio: gyro: mpu3050: fix chip ID reading ...
2019-04-19Merge tag 'char-misc-5.1-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are four small misc driver fixes for 5.1-rc6. Nothing major at all, they fix up a Kconfig issues, a SPDX invalid license tag, and two tiny bugfixes. All have been in linux-next for a while with no reported issues" * tag 'char-misc-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: drivers: power: supply: goldfish_battery: Fix bogus SPDX identifier extcon: ptn5150: fix COMPILE_TEST dependencies misc: fastrpc: add checked value for dma_set_mask habanalabs: remove low credit limit of DMA #0
2019-04-19block: make sure that bvec length can't be overflowMing Lei
bvec->bv_offset may be bigger than PAGE_SIZE sometimes, such as, when one bio is splitted in the middle of one bvec via bio_split(), and bi_iter.bi_bvec_done is used to build offset of the 1st bvec of remained bio. And the remained bio's bvec may be re-submitted to fs layer via ITER_IBVEC, such as loop and nvme-loop. So we have to make sure that every bvec's offset is less than PAGE_SIZE from bio_for_each_segment_all() because some drivers(loop, nvme-loop) passes the splitted bvec to fs layer via ITER_BVEC. This patch fixes this issue reported by Zhang Yi When running nvme/011. Cc: Christoph Hellwig <hch@lst.de> Cc: Yi Zhang <yi.zhang@redhat.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19block: kill all_q_node in request_queueHou Tao
all_q_node has not been used since commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"), so remove it. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - several new key mappings for HID - a host of new ACPI IDs used to identify Elan touchpads in Lenovo laptops * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ HID: input: add mapping for "Toggle Display" key HID: input: add mapping for "Full Screen" key HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys HID: input: add mapping for Expose/Overview key HID: input: fix mapping of aspect ratio key [media] doc-rst: switch to new names for Full Screen/Aspect keys Input: document meanings of KEY_SCREEN and KEY_ZOOM Input: elan_i2c - add hardware ID for multiple Lenovo laptops
2019-04-19x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log ↵Hans de Goede
priority The "ENERGY_PERF_BIAS: Set to 'normal', was 'performance'" message triggers on pretty much every Intel machine. The purpose of log messages with a warning level is to notify the user of something which potentially is a problem, or at least somewhat unexpected. This message clearly does not match those criteria, so lower its log priority from warning to info. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20181230172715.17469-1-hdegoede@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19Merge tag 'perf-urgent-for-mingo-5.1-20190419' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: perf top: Jiri Olsa: - Fix 'perf top --pid', it needs PERF_SAMPLE_TIME since we switched to using a different thread to sort the events and then even for just a single thread we now need timestamps. BPF: Jiri Olsa: - Fix bpf_prog and btf lookup functions failure path to to properly return NULL. - Fix side band thread draining, used to process PERF_RECORD_BPF_EVENT metadata records. core: Jiri Olsa: - Fix map lookup by name to get a refcount when the name is already in the tree. Found Song Liu: - Fix __map__is_kmodule() by taking into account recently added BPF maps. UAPI: Arnaldo Carvalho de Melo: - Sync sound/asound.h copy Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19coredump: fix race condition between mmget_not_zero()/get_task_mm() and core ↵Andrea Arcangeli
dumping The core dumping code has always run without holding the mmap_sem for writing, despite that is the only way to ensure that the entire vma layout will not change from under it. Only using some signal serialization on the processes belonging to the mm is not nearly enough. This was pointed out earlier. For example in Hugh's post from Jul 2017: https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils "Not strictly relevant here, but a related note: I was very surprised to discover, only quite recently, how handle_mm_fault() may be called without down_read(mmap_sem) - when core dumping. That seems a misguided optimization to me, which would also be nice to correct" In particular because the growsdown and growsup can move the vm_start/vm_end the various loops the core dump does around the vma will not be consistent if page faults can happen concurrently. Pretty much all users calling mmget_not_zero()/get_task_mm() and then taking the mmap_sem had the potential to introduce unexpected side effects in the core dumping code. Adding mmap_sem for writing around the ->core_dump invocation is a viable long term fix, but it requires removing all copy user and page faults and to replace them with get_dump_page() for all binary formats which is not suitable as a short term fix. For the time being this solution manually covers the places that can confuse the core dump either by altering the vma layout or the vma flags while it runs. Once ->core_dump runs under mmap_sem for writing the function mmget_still_valid() can be dropped. Allowing mmap_sem protected sections to run in parallel with the coredump provides some minor parallelism advantage to the swapoff code (which seems to be safe enough by never mangling any vma field and can keep doing swapins in parallel to the core dumping) and to some other corner case. In order to facilitate the backporting I added "Fixes: 86039bd3b4e6" however the side effect of this same race condition in /proc/pid/mem should be reproducible since before 2.6.12-rc2 so I couldn't add any other "Fixes:" because there's no hash beyond the git genesis commit. Because find_extend_vma() is the only location outside of the process context that could modify the "mm" structures under mmap_sem for reading, by adding the mmget_still_valid() check to it, all other cases that take the mmap_sem for reading don't need the new check after mmget_not_zero()/get_task_mm(). The expand_stack() in page fault context also doesn't need the new check, because all tasks under core dumping are frozen. Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Jann Horn <jannh@google.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm/kmemleak.c: fix unused-function warningArnd Bergmann
The only references outside of the #ifdef have been removed, so now we get a warning in non-SMP configurations: mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function] Add a new #ifdef around it. Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19init: initialize jump labels before command line option parsingDan Williams
When a module option, or core kernel argument, toggles a static-key it requires jump labels to be initialized early. While x86, PowerPC, and ARM64 arrange for jump_label_init() to be called before parse_args(), ARM does not. Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303 page_alloc_shuffle+0x12c/0x1ac static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used before call to jump_label_init() Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1 Hardware name: ARM Integrator/CP (Device Tree) [<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18) [<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24) [<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108) [<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c) [<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>] (page_alloc_shuffle+0x12c/0x1ac) [<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48) [<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350) [<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488) Move the fallback call to jump_label_init() to occur before parse_args(). The redundant calls to jump_label_init() in other archs are left intact in case they have static key toggling use cases that are even earlier than option parsing. Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Guenter Roeck <groeck@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Russell King <rmk@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19kernel/watchdog_hld.c: hard lockup message should end with a newlineSergey Senozhatsky
Separate print_modules() and hard lockup error message. Before the patch: NMI watchdog: Watchdog detected hard LOCKUP on cpu 1Modules linked in: nls_cp437 Link: http://lkml.kernel.org/r/20190412062557.2700-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19kcov: improve CONFIG_ARCH_HAS_KCOV help textMark Rutland
The help text for CONFIG_ARCH_HAS_KCOV is stale, and describes the feature as being enabled only for x86_64, when it is now enabled for several architectures, including arm, arm64, powerpc, and s390. Let's remove that stale help text, and update it along the lines of hat for ARCH_HAS_FORTIFY_SOURCE, better describing when an architecture should select CONFIG_ARCH_HAS_KCOV. Link: http://lkml.kernel.org/r/20190412102733.5154-1-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: fix inactive list balancing between NUMA nodes and cgroupsJohannes Weiner
During !CONFIG_CGROUP reclaim, we expand the inactive list size if it's thrashing on the node that is about to be reclaimed. But when cgroups are enabled, we suddenly ignore the node scope and use the cgroup scope only. The result is that pressure bleeds between NUMA nodes depending on whether cgroups are merely compiled into Linux. This behavioral difference is unexpected and undesirable. When the refault adaptivity of the inactive list was first introduced, there were no statistics at the lruvec level - the intersection of node and memcg - so it was better than nothing. But now that we have that infrastructure, use lruvec_page_state() to make the list balancing decision always NUMA aware. [hannes@cmpxchg.org: fix bisection hole] Link: http://lkml.kernel.org/r/20190417155241.GB23013@cmpxchg.org Link: http://lkml.kernel.org/r/20190412144438.2645-1-hannes@cmpxchg.org Fixes: 2a2e48854d70 ("mm: vmscan: fix IO/refault regression in cache workingset transition") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm/hotplug: treat CMA pages as unmovableQian Cai
has_unmovable_pages() is used by allocating CMA and gigantic pages as well as the memory hotplug. The later doesn't know how to offline CMA pool properly now, but if an unused (free) CMA page is encountered, then has_unmovable_pages() happily considers it as a free memory and propagates this up the call chain. Memory offlining code then frees the page without a proper CMA tear down which leads to an accounting issues. Moreover if the same memory range is onlined again then the memory never gets back to the CMA pool. State after memory offline: # grep cma /proc/vmstat nr_free_cma 205824 # cat /sys/kernel/debug/cma/cma-kvm_cma/count 209920 Also, kmemleak still think those memory address are reserved below but have already been used by the buddy allocator after onlining. This patch fixes the situation by treating CMA pageblocks as unmovable except when has_unmovable_pages() is called as part of CMA allocation. Offlined Pages 4096 kmemleak: Cannot insert 0xc000201f7d040008 into the object search tree (overlaps existing) Call Trace: dump_stack+0xb0/0xf4 (unreliable) create_object+0x344/0x380 __kmalloc_node+0x3ec/0x860 kvmalloc_node+0x58/0x110 seq_read+0x41c/0x620 __vfs_read+0x3c/0x70 vfs_read+0xbc/0x1a0 ksys_read+0x7c/0x140 system_call+0x5c/0x70 kmemleak: Kernel memory leak detector disabled kmemleak: Object 0xc000201cc8000000 (size 13757317120): kmemleak: comm "swapper/0", pid 0, jiffies 4294937297 kmemleak: min_count = -1 kmemleak: count = 0 kmemleak: flags = 0x5 kmemleak: checksum = 0 kmemleak: backtrace: cma_declare_contiguous+0x2a4/0x3b0 kvm_cma_reserve+0x11c/0x134 setup_arch+0x300/0x3f8 start_kernel+0x9c/0x6e8 start_here_common+0x1c/0x4b0 kmemleak: Automatic memory scanning thread ended [cai@lca.pw: use is_migrate_cma_page() and update commit log] Link: http://lkml.kernel.org/r/20190416170510.20048-1-cai@lca.pw Link: http://lkml.kernel.org/r/20190413002623.8967-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19proc: fixup proc-pid-vm testAlexey Dobriyan
Silly sizeof(pointer) vs sizeof(uint8_t[]) bug. Link: http://lkml.kernel.org/r/20190414123009.GA12971@avx2 Fixes: e483b0208784 ("proc: test /proc/*/maps, smaps, smaps_rollup, statm") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19proc: fix map_files test on F29Alexey Dobriyan
F29 bans mapping first 64KB even for root making test fail. Iterate from address 0 until mmap() works. Gentoo (root): openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3 mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0 Gentoo (non-root): openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3 mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EPERM (Operation not permitted) mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x1000 F29 (root): openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3 mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x2000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x3000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x4000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x5000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x6000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x7000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x8000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x9000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xa000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xb000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xc000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xd000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xe000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0xf000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied) mmap(0x10000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x10000 Now all proc tests succeed on F29 if run as root, at last! Link: http://lkml.kernel.org/r/20190414123612.GB12971@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=nKonstantin Khlebnikov
Commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly") depends on skipping vmstat entries with empty name introduced in 7aaf77272358 ("mm: don't show nr_indirectly_reclaimable in /proc/vmstat") but reverted in b29940c1abd7 ("mm: rename and change semantics of nr_indirectly_reclaimable_bytes"). So skipping no longer works and /proc/vmstat has misformatted lines " 0". This patch simply shows debug counters "nr_tlb_remote_*" for UP. Link: http://lkml.kernel.org/r/155481488468.467.4295519102880913454.stgit@buzz Fixes: 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Roman Gushchin <guro@fb.com> Cc: Jann Horn <jannh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lockzhong jiang
When adding memory by probing a memory block in the sysfs interface, there is an obvious issue where we will unlock the device_hotplug_lock when we failed to takes it. That issue was introduced in 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock"). We should drop out in time when failing to take the device_hotplug_lock. Link: http://lkml.kernel.org/r/1554696437-9593-1-git-send-email-zhongjiang@huawei.com Fixes: 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock") Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reported-by: Yang yingliang <yangyingliang@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: swapoff: shmem_unuse() stop eviction without igrab()Hugh Dickins
The igrab() in shmem_unuse() looks good, but we forgot that it gives no protection against concurrent unmounting: a point made by Konstantin Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae78 ("tmpfs: fix race between umount and swapoff"). The current 5.1-rc swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day..." followed by GPF. Once again, give up on using igrab(); but don't go back to making such heavy-handed use of shmem_swaplist_mutex as last time: that would spoil the new design, and I expect could deadlock inside shmem_swapin_page(). Instead, shmem_unuse() just raise a "stop_eviction" count in the shmem- specific inode, and shmem_evict_inode() wait for that to go down to 0. Call it "stop_eviction" rather than "swapoff_busy" because it can be put to use for others later (huge tmpfs patches expect to use it). That simplifies shmem_unuse(), protecting it from both unlink and unmount; and in practice lets it locate all the swap in its first try. But do not rely on that: there's still a theoretical case, when shmem_writepage() might have been preempted after its get_swap_page(), before making the swap entry visible to swapoff. [hughd@google.com: remove incorrect list_del()] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904091133570.1898@eggly.anvils Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081259400.1523@eggly.anvils Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> Cc: Huang Ying <ying.huang@intel.com> Cc: Kelley Nielsen <kelleynnn@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rik van Riel <riel@surriel.com> Cc: Vineeth Pillai <vpillai@digitalocean.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: swapoff: take notice of completion soonerHugh Dickins
The old try_to_unuse() implementation was driven by find_next_to_unuse(), which terminated as soon as all the swap had been freed. Add inuse_pages checks now (alongside signal_pending()) to stop scanning mms and swap_map once finished. The same ought to be done in shmem_unuse() too, but never was before, and needs a different interface: so leave it as is for now. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081258200.1523@eggly.anvils Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> Cc: Huang Ying <ying.huang@intel.com> Cc: Kelley Nielsen <kelleynnn@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rik van Riel <riel@surriel.com> Cc: Vineeth Pillai <vpillai@digitalocean.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIESHugh Dickins
SWAP_UNUSE_MAX_TRIES 3 appeared to work well in earlier testing, but further testing has proved it to be a source of unnecessary swapoff EBUSY failures (which can then be followed by unmount EBUSY failures). When mmget_not_zero() or shmem's igrab() fails, there is an mm exiting or inode being evicted, freeing up swap independent of try_to_unuse(). Those typically completed much sooner than the old quadratic swapoff, but now it's more common that swapoff may need to wait for them. It's possible to move those cases from init_mm.mmlist and shmem_swaplist to separate "exiting" swaplists, and try_to_unuse() then wait for those lists to be emptied; but we've not bothered with that in the past, and don't want to risk missing some other forgotten case. So just revert to cycling around until the swap is gone, without any retries limit. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081256170.1523@eggly.anvils Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> Cc: Huang Ying <ying.huang@intel.com> Cc: Kelley Nielsen <kelleynnn@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rik van Riel <riel@surriel.com> Cc: Vineeth Pillai <vpillai@digitalocean.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: swapoff: shmem_find_swap_entries() filter out other typesHugh Dickins
Swapfile "type" was passed all the way down to shmem_unuse_inode(), but then forgotten from shmem_find_swap_entries(): with the result that removing one swapfile would try to free up all the swap from shmem - no problem when only one swapfile anyway, but counter-productive when more, causing swapoff to be unnecessarily OOM-killed when it should succeed. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081254470.1523@eggly.anvils Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> Cc: Vineeth Pillai <vpillai@digitalocean.com> Cc: Kelley Nielsen <kelleynnn@gmail.com> Cc: Rik van Riel <riel@surriel.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19slab: store tagged freelist for off-slab slabmgmtQian Cai
Commit 51dedad06b5f ("kasan, slab: make freelist stored without tags") calls kasan_reset_tag() for off-slab slab management object leading to freelist being stored non-tagged. However, cache_grow_begin() calls alloc_slabmgmt() which calls kmem_cache_alloc_node() assigns a tag for the address and stores it in the shadow address. As the result, it causes endless errors below during boot due to drain_freelist() -> slab_destroy() -> kasan_slab_free() which compares already untagged freelist against the stored tag in the shadow address. Since off-slab slab management object freelist is such a special case, just store it tagged. Non-off-slab management object freelist is still stored untagged which has not been assigned a tag and should not cause any other troubles with this inconsistency. BUG: KASAN: double-free or invalid-free in slab_destroy+0x84/0x88 Pointer tag: [ff], memory tag: [99] CPU: 0 PID: 1376 Comm: kworker/0:4 Tainted: G W 5.1.0-rc3+ #8 Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018 Workqueue: cgroup_destroy css_killed_work_fn Call trace: print_address_description+0x74/0x2a4 kasan_report_invalid_free+0x80/0xc0 __kasan_slab_free+0x204/0x208 kasan_slab_free+0xc/0x18 kmem_cache_free+0xe4/0x254 slab_destroy+0x84/0x88 drain_freelist+0xd0/0x104 __kmem_cache_shrink+0x1ac/0x224 __kmemcg_cache_deactivate+0x1c/0x28 memcg_deactivate_kmem_caches+0xa0/0xe8 memcg_offline_kmem+0x8c/0x3d4 mem_cgroup_css_offline+0x24c/0x290 css_killed_work_fn+0x154/0x618 process_one_work+0x9cc/0x183c worker_thread+0x9b0/0xe38 kthread+0x374/0x390 ret_from_fork+0x10/0x18 Allocated by task 1625: __kasan_kmalloc+0x168/0x240 kasan_slab_alloc+0x18/0x20 kmem_cache_alloc_node+0x1f8/0x3a0 cache_grow_begin+0x4fc/0xa24 cache_alloc_refill+0x2f8/0x3e8 kmem_cache_alloc+0x1bc/0x3bc sock_alloc_inode+0x58/0x334 alloc_inode+0xb8/0x164 new_inode_pseudo+0x20/0xec sock_alloc+0x74/0x284 __sock_create+0xb0/0x58c sock_create+0x98/0xb8 __sys_socket+0x60/0x138 __arm64_sys_socket+0xa4/0x110 el0_svc_handler+0x2c0/0x47c el0_svc+0x8/0xc Freed by task 1625: __kasan_slab_free+0x114/0x208 kasan_slab_free+0xc/0x18 kfree+0x1a8/0x1e0 single_release+0x7c/0x9c close_pdeo+0x13c/0x43c proc_reg_release+0xec/0x108 __fput+0x2f8/0x784 ____fput+0x1c/0x28 task_work_run+0xc0/0x1b0 do_notify_resume+0xb44/0x1278 work_pending+0x8/0x10 The buggy address belongs to the object at ffff809681b89e00 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 0 bytes inside of 128-byte region [ffff809681b89e00, ffff809681b89e80) The buggy address belongs to the page: page:ffff7fe025a06e00 count:1 mapcount:0 mapping:01ff80082000fb00 index:0xffff809681b8fe04 flags: 0x17ffffffc000200(slab) raw: 017ffffffc000200 ffff7fe025a06d08 ffff7fe022ef7b88 01ff80082000fb00 raw: ffff809681b8fe04 ffff809681b80000 00000001000000e0 0000000000000000 page dumped because: kasan: bad access detected page allocated via order 0, migratetype Unmovable, gfp_mask 0x2420c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE) prep_new_page+0x4e0/0x5e0 get_page_from_freelist+0x4ce8/0x50d4 __alloc_pages_nodemask+0x738/0x38b8 cache_grow_begin+0xd8/0xa24 ____cache_alloc_node+0x14c/0x268 __kmalloc+0x1c8/0x3fc ftrace_free_mem+0x408/0x1284 ftrace_free_init_mem+0x20/0x28 kernel_init+0x24/0x548 ret_from_fork+0x10/0x18 Memory state around the buggy address: ffff809681b89c00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe ffff809681b89d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe >ffff809681b89e00: 99 99 99 99 99 99 99 99 fe fe fe fe fe fe fe fe ^ ffff809681b89f00: 43 43 43 43 43 fe fe fe fe fe fe fe fe fe fe fe ffff809681b8a000: 6d fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe Link: http://lkml.kernel.org/r/20190403022858.97584-1-cai@lca.pw Fixes: 51dedad06b5f ("kasan, slab: make freelist stored without tags") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19x86/cpu/bugs: Use __initconst for 'const' init dataAndi Kleen
Some of the recently added const tables use __initdata which causes section attribute conflicts. Use __initconst instead. Fixes: fa1202ef2243 ("x86/speculation: Add command line control") Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190330004743.29541-9-andi@firstfloor.org
2019-04-19x86/kprobes: Avoid kretprobe recursion bugMasami Hiramatsu
Avoid kretprobe recursion loop bg by setting a dummy kprobes to current_kprobe per-CPU variable. This bug has been introduced with the asm-coded trampoline code, since previously it used another kprobe for hooking the function return placeholder (which only has a nop) and trampoline handler was called from that kprobe. This revives the old lost kprobe again. With this fix, we don't see deadlock anymore. And you can see that all inner-called kretprobe are skipped. event_1 235 0 event_2 19375 19612 The 1st column is recorded count and the 2nd is missed count. Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed) (some difference are here because the counter is racy) Reported-by: Andrea Righi <righi.andrea@gmail.com> Tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster") Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19kprobes: Mark ftrace mcount handler functions nokprobeMasami Hiramatsu
Mark ftrace mcount handler functions nokprobe since probing on these functions with kretprobe pushes return address incorrectly on kretprobe shadow stack. Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com> Tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19x86/kprobes: Verify stack frame on kretprobeMasami Hiramatsu
Verify the stack frame pointer on kretprobe trampoline handler, If the stack frame pointer does not match, it skips the wrong entry and tries to find correct one. This can happen if user puts the kretprobe on the function which can be used in the path of ftrace user-function call. Such functions should not be probed, so this adds a warning message that reports which function should be blacklisted. Tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19locking/atomics: Don't assume that scripts are executableAndrew Morton
patch(1) doesn't set the x bit on files. So if someone downloads and applies patch-4.21.xz, their kernel won't build. Fix that by executing /bin/sh. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19sc16is7xx: put err_spi and err_i2c into correct #ifdefGuoqing Jiang
err_spi is only called within SERIAL_SC16IS7XX_SPI while err_i2c is called inside SERIAL_SC16IS7XX_I2C. So we need to put err_spi and err_i2c into each #ifdef accordingly. This change fixes ("sc16is7xx: move label 'err_spi' to correct section"). Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-18scsi: aic7xxx: fix EISA supportChristoph Hellwig
Instead of relying on the now removed NULL argument to pci_alloc_consistent, switch to the generic DMA API, and store the struct device so that we can pass it. Fixes: 4167b2ad5182 ("PCI: Remove NULL device handling from PCI DMA API") Reported-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"Saurav Kashyap
This patch clears FC_RP_STARTED flag during logoff, because of this re-login(flogi) didn't happen to the switch. This reverts commit 1550ec458e0cf1a40a170ab1f4c46e3f52860f65. Fixes: 1550ec458e0c ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO") Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Hannes Reinecke <hare@#suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Avoid compiler uninitialised warning introduced by recent arm64 futex fix" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: futex: Restore oldval initialization to work around buggy compilers
2019-04-18arm64: futex: Restore oldval initialization to work around buggy compilersNathan Chancellor
Commit 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value") removed oldval's zero initialization in arch_futex_atomic_op_inuser because it is not necessary. Unfortunately, Android's arm64 GCC 4.9.4 [1] does not agree: ../kernel/futex.c: In function 'do_futex': ../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized in this function [-Wmaybe-uninitialized] return oldval == cmparg; ^ In file included from ../kernel/futex.c:73:0: ../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here int oldval, ret, tmp; ^ GCC fails to follow that when ret is non-zero, futex_atomic_op_inuser returns right away, avoiding the uninitialized use that it claims. Restoring the zero initialization works around this issue. [1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/ Cc: stable@vger.kernel.org Fixes: 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-18signal: use fdget() since we don't allow O_PATHChristian Brauner
As stated in the original commit for pidfd_send_signal() we don't allow to signal processes through O_PATH file descriptors since it is semantically equivalent to a write on the pidfd. We already correctly error out right now and return EBADF if an O_PATH fd is passed. This is because we use file->f_op to detect whether a pidfd is passed and O_PATH fds have their file->f_op set to empty_fops in do_dentry_open() and thus fail the test. Thus, there is no regression. It's just semantically correct to use fdget() and return an error right from there instead of taking a reference and returning an error later. Signed-off-by: Christian Brauner <christian@brauner.io> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jann@thejh.net> Cc: David Howells <dhowells@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-18Merge tag 's390-5.1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 bug fixes from Martin Schwidefsky: - Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED - Fix integer overflow in the dasd driver resulting in incorrect number of blocks for large devices - Fix a lockdep false positive in the 3270 driver - Fix a deadlock in the zcrypt driver - Fix incorrect debug feature entries in the pkey api - Fix inline assembly constraints fallout with CONFIG_KASAN=y * tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: correct some inline assembly constraints s390/pkey: add one more argument space for debug feature entry s390/zcrypt: fix possible deadlock situation on ap queue remove s390/3270: fix lockdep false positive on view->lock s390/dasd: Fix capacity calculation for large volumes s390/mem_detect: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD)
2019-04-18Merge tag 'afs-fixes-20190413' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: - Stop using the deprecated get_seconds(). - Don't make tracepoint strings const as the section they go in isn't read-only. - Differentiate failure due to unmarshalling from other failure cases. We shouldn't abort with RXGEN_CC/SS_UNMARSHAL if it's not due to unmarshalling. - Add a missing unlock_page(). - Fix the interaction between receiving a notification from a server that it has invalidated all outstanding callback promises and a client call that we're in the middle of making that will get a new promise. * tag 'afs-fixes-20190413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix in-progess ops to ignore server-level callback invalidation afs: Unlock pages for __pagevec_release() afs: Differentiate abort due to unmarshalling from other errors afs: Avoid section confusion in CM_NAME afs: avoid deprecated get_seconds()
2019-04-18Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a bug in the implementation of the x86 accelerated version of poly1305" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/poly1305 - fix overflow during partial reduction
2019-04-18Merge tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Since Easter is looming for me, I'm just pushing whatever is in my tree, I'll see what else turns up and maybe I'll send another pull early next week if there is anything. tegra: - stream id programming fix - avoid divide by 0 for bad hdmi audio setup code ttm: - Hugepages fix - refcount imbalance in error path fix amdgpu: - GPU VM fixes for Vega/RV - DC AUX fix for active DP-DVI dongles - DC fix for multihead regression" * tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drm: drm/tegra: hdmi: Setup audio only if configured drm/amd/display: If one stream full updates, full update all planes drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming drm/amdgpu: shadow in shadow_list without tbo.mem.start cause page fault in sriov TDR gpu: host1x: Program stream ID to bypass without SMMU drm/amd/display: extending AUX SW Timeout drm/ttm: fix dma_fence refcount imbalance on error path drm/ttm: fix incrementing the page pointer for huge pages drm/ttm: fix start page for huge page check in ttm_put_pages() drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
2019-04-18timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze()Chang-An Chen
tick_freeze() introduced by suspend-to-idle in commit 124cf9117c5f ("PM / sleep: Make it possible to quiesce timers during suspend-to-idle") uses timekeeping_suspend() instead of syscore_suspend() during suspend-to-idle. As a consequence generic sched_clock will keep going because sched_clock_suspend() and sched_clock_resume() are not invoked during suspend-to-idle which can result in a generic sched_clock wrap. On a ARM system with suspend-to-idle enabled, sched_clock is registered as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which means the real wrapping duration is 8796093022202ns. [ 134.551779] suspend-to-idle suspend (timekeeping_suspend()) [ 1204.912239] suspend-to-idle resume (timekeeping_resume()) ...... [ 1206.912239] suspend-to-idle suspend (timekeeping_suspend()) [ 5880.502807] suspend-to-idle resume (timekeeping_resume()) ...... [ 6000.403724] suspend-to-idle suspend (timekeeping_suspend()) [ 8035.753167] suspend-to-idle resume (timekeeping_resume()) ...... [ 8795.786684] (2)[321:charger_thread]...... [ 8795.788387] (2)[321:charger_thread]...... [ 0.057226] (0)[0:swapper/0]...... [ 0.061447] (2)[0:swapper/2]...... sched_clock was not stopped during suspend-to-idle, and sched_clock_poll hrtimer was not expired because timekeeping_suspend() was invoked during suspend-to-idle. It makes sched_clock wrap at kernel time 8796s. To prevent this, invoke sched_clock_suspend() and sched_clock_resume() in tick_freeze() together with timekeeping_suspend() and timekeeping_resume(). Fixes: 124cf9117c5f (PM / sleep: Make it possible to quiesce timers during suspend-to-idle) Signed-off-by: Chang-An Chen <chang-an.chen@mediatek.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Kees Cook <keescook@chromium.org> Cc: Corey Minyard <cminyard@mvista.com> Cc: <linux-mediatek@lists.infradead.org> Cc: <linux-arm-kernel@lists.infradead.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: <kuohong.wang@mediatek.com> Cc: <freddy.hsin@mediatek.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1553828349-8914-1-git-send-email-chang-an.chen@mediatek.com