linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-11-11	Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"	Arnd Bergmann
	Traditionally, we have always had warnings about uninitialized variables enabled, as this is part of -Wall, and generally a good idea [1], but it also always produced false positives, mainly because this is a variation of the halting problem and provably impossible to get right in all cases [2]. Various people have identified cases that are particularly bad for false positives, and in commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os"), I turned off the warning for any build that was done with CC_OPTIMIZE_FOR_SIZE. This drastically reduced the number of false positive warnings in the default build but unfortunately had the side effect of turning the warning off completely in 'allmodconfig' builds, which in turn led to a lot of warnings (both actual bugs, and remaining false positives) to go in unnoticed. With commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE definition") enabled the warning again for allmodconfig builds in v4.7 and in v4.8-rc1, I had finally managed to address all warnings I get in an ARM allmodconfig build and most other maybe-uninitialized warnings for ARM randconfig builds. However, commit 6e8d666e9253 ("Disable "maybe-uninitialized" warning globally") was merged at the same time and disabled it completely for all configurations, because of false-positive warnings on x86 that I had not addressed until then. This caused a lot of actual bugs to get merged into mainline, and I sent several dozen patches for these during the v4.9 development cycle. Most of these are actual bugs, some are for correct code that is safe because it is only called under external constraints that make it impossible to run into the case that gcc sees, and in a few cases gcc is just stupid and finds something that can obviously never happen. I have now done a few thousand randconfig builds on x86 and collected all patches that I needed to address every single warning I got (I can provide the combined patch for the other warnings if anyone is interested), so I hope we can get the warning back and let people catch the actual bugs earlier. This reverts the change to disable the warning completely and for now brings it back at the "make W=1" level, so we can get it merged into mainline without introducing false positives. A follow-up patch enables it on all levels unless some configuration option turns it off because of false-positives. Link: https://rusty.ozlabs.org/?p=232 [1] Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings [2] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	lib/stackdepot: export save/fetch stack for drivers	Chris Wilson
	Some drivers would like to record stacktraces in order to aide leak tracing. As stackdepot already provides a facility for only storing the unique traces, thereby reducing the memory required, export that functionality for use by drivers. The code was originally created for KASAN and moved under lib in commit cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") so that it could be shared with mm/. In turn, we want to share it now with drivers. Link: http://lkml.kernel.org/r/20161108133209.22704-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: kmemleak: scan .data.ro_after_init	Jakub Kicinski
	Limit the number of kmemleak false positives by including .data.ro_after_init in memory scanning. To achieve this we need to add symbols for start and end of the section to the linker scripts. The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark families as __ro_after_init"). Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB	Greg Thelen
	While testing OBJFREELIST_SLAB integration with pagealloc, we found a bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB & CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed for loading drivers or creating new caches will fail. The original kmem_cache is created early making OFF_SLAB not possible. When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc is enabled it will try to enable it first under certain conditions. Given kmem_cache(sys) reuses the original flag, you can have both flags at the same time resulting in allocation failures and odd behaviors. This fix discards allocator specific flags from memcg before calling create_cache. The bug exists since 4.6-rc1 and affects testing debug pagealloc configurations. Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB") Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Thomas Garnier <thgarnie@google.com> Tested-by: Thomas Garnier <thgarnie@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	coredump: fix unfreezable coredumping task	Andrey Ryabinin
	It could be not possible to freeze coredumping task when it waits for 'core_state->startup' completion, because threads are frozen in get_signal() before they got a chance to complete 'core_state->startup'. Inability to freeze a task during suspend will cause suspend to fail. Also CRIU uses cgroup freezer during dump operation. So with an unfreezable task the CRIU dump will fail because it waits for a transition from 'FREEZING' to 'FROZEN' state which will never happen. Use freezer_do_not_count() to tell freezer to ignore coredumping task while it waits for core_state->startup completion. Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/filemap: don't allow partially uptodate page for pipes	Eryu Guan
	Starting from 4.9-rc1 kernel, I started noticing some test failures of sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when testing on sub-page block size filesystems (tested both XFS and ext4), these syscalls start to return EIO in the tests. e.g. sendfile02 1 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1 sendfile02 2 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1 sendfile02 3 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1 sendfile02 4 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1 This is because that in sub-page block size cases, we don't need the whole page to be uptodate, only the part we care about is uptodate is OK (if fs has ->is_partially_uptodate defined). But page_cache_pipe_buf_confirm() doesn't have the ability to check the partially-uptodate case, it needs the whole page to be uptodate. So it returns EIO in this case. This is a regression introduced by commit 82c156f85384 ("switch generic_file_splice_read() to use of ->read_iter()"). Prior to the change, generic_file_splice_read() doesn't allow partially-uptodate page either, so it worked fine. Fix it by skipping the partially-uptodate check if we're working on a pipe in do_generic_file_read(), so we read the whole page from disk as long as the page is not uptodate. I think the other way to fix it is to add the ability to check & allow partially-uptodate page to page_cache_pipe_buf_confirm(), but that is much harder to do and seems gain little. Link: http://lkml.kernel.org/r/1477986187-12717-1-git-send-email-guaneryu@gmail.com Signed-off-by: Eryu Guan <guaneryu@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/hugetlb: fix huge page reservation leak in private mapping error paths	Mike Kravetz
	Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly allocated huge page. If a reservation was associated with the huge page, alloc_huge_page() consumed the reservation while allocating. When the newly allocated page is freed in free_huge_page(), it will increment the global reservation count. However, the reservation entry in the reserve map will remain. This is not an issue for shared mappings as the entry in the reserve map indicates a reservation exists. But, an entry in a private mapping reserve map indicates the reservation was consumed and no longer exists. This results in an inconsistency between the reserve map and the global reservation count. This 'leaks' a reserved huge page. Create a new routine restore_reserve_on_error() to restore the reserve entry in these specific error paths. This routine makes use of a new function vma_add_reservation() which will add a reserve entry for a specific address/page. In general, these error paths were rarely (if ever) taken on most architectures. However, powerpc contained arch specific code that that resulted in an extra fault and execution of these error paths on all private mappings. Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings) Link: http://lkml.kernel.org/r/1476933077-23091-2-git-send-email-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Kirill A . Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	ocfs2: fix not enough credit panic	Junxiao Bi
	The following panic was caught when run ocfs2 disconfig single test (block size 512 and cluster size 8192). ocfs2_journal_dirty() return -ENOSPC, that means credits were used up. The total credit should include 3 times of "num_dx_leaves" from ocfs2_dx_dir_rebalance(), because 2 times will be consumed in ocfs2_dx_dir_transfer_leaf() and 1 time will be consumed in ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() -> ocfs2_dx_dir_format_cluster(). But only two times is included in ocfs2_dx_dir_rebalance_credits(), fix it. This can cause read-only fs(v4.1+) or panic for mainline linux depending on mount option. ------------[ cut here ]------------ kernel BUG at fs/ocfs2/journal.c:775! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016 task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000 RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] RSP: 0018:ffff8800a7d4b6d8 EFLAGS: 00010286 RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9 RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460 RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460 R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070 FS: 00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0 Call Trace: ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2] ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2] ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2] ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2] ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2] ocfs2_mknod+0x5a2/0x1400 [ocfs2] ocfs2_create+0x73/0x180 [ocfs2] vfs_create+0xd8/0x100 lookup_open+0x185/0x1c0 do_last+0x36d/0x780 path_openat+0x92/0x470 do_filp_open+0x4a/0xa0 do_sys_open+0x11a/0x230 SyS_open+0x1e/0x20 system_call_fastpath+0x12/0x71 Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 RIP ocfs2_journal_dirty+0xa7/0xb0 [ocfs2] ---[ end trace 91ac5312a6ee1288 ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	Revert "console: don't prefer first registered if DT specifies stdout-path"	Hans de Goede
	This reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path"). The reverted commit changes existing behavior on which many ARM boards rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have both a video output and a serial console. Depending on whether the user is using the device as a more regular computer; or as a headless device we need to have the console on either one or the other. Many users rely on the kernel behavior of the console being present on both outputs, before the reverted commit the console setup with no console= kernel arguments on an ARM board which sets stdout-path in dt would look like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 tty0 -WU (E p ) 4:1 Where as after the reverted commit, it looks like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 This commit reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") restoring the original behavior. Fixes: 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: hwpoison: fix thp split handling in memory_failure()	Naoya Horiguchi
	When memory_failure() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): page:ffffd7cd819b0040 count:0 mapcount:0 mapping: (null) index:0x1 flags: 0x1fffc000400000(hwpoison) page dumped because: VM_BUG_ON_PAGE(!page_count(p)) ------------[ cut here ]------------ kernel BUG at /src/linux-dev/mm/memory-failure.c:1132! memory_failure() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Fixes: 61f5d698cc97 ("mm: re-enable THP") Link: http://lkml.kernel.org/r/1477961577-7183-1-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	swapfile: fix memory corruption via malformed swapfile	Jann Horn
	When root activates a swap partition whose header has the wrong endianness, nr_badpages elements of badpages are swabbed before nr_badpages has been checked, leading to a buffer overrun of up to 8GB. This normally is not a security issue because it can only be exploited by root (more specifically, a process with CAP_SYS_ADMIN or the ability to modify a swap file/partition), and such a process can already e.g. modify swapped-out memory of any other userspace process on the system. Link: http://lkml.kernel.org/r/1477949533-2509-1-git-send-email-jann@thejh.net Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm/cma.c: check the max limit for cma allocation	Shiraz Hashim
	CMA allocation request size is represented by size_t that gets truncated when same is passed as int to bitmap_find_next_zero_area_off. We observe that during fuzz testing when cma allocation request is too high, bitmap_find_next_zero_area_off still returns success due to the truncation. This leads to kernel crash, as subsequent code assumes that requested memory is available. Fail cma allocation in case the request breaches the corresponding cma region size. Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org Signed-off-by: Shiraz Hashim <shashim@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	scripts/bloat-o-meter: fix SIGPIPE	Alexey Dobriyan
	Fix piping output to a program which quickly exits (read: head -n1) $ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux \| head -n1 add/remove: 0/0 grow/shrink: 9/60 up/down: 124/-305 (-181) close failed in file object destructor: sys.excepthook is missing lost sys.stderr Link: http://lkml.kernel.org/r/20161028204618.GA29923@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	shmem: fix pageflags after swapping DMA32 object	Hugh Dickins
	If shmem_alloc_page() does not set PageLocked and PageSwapBacked, then shmem_replace_page() needs to do so for itself. Without this, it puts newpage on the wrong lru, re-unlocks the unlocked newpage, and system descends into "Bad page" reports and freeze; or if CONFIG_DEBUG_VM=y, it hits an earlier VM_BUG_ON_PAGE(!PageLocked), depending on config. But shmem_replace_page() is not a common path: it's only called when swapin (or swapoff) finds the page was already read into an unsuitable zone: usually all zones are suitable, but gem objects for a few drm devices (gma500, omapdrm, crestline, broadwater) require zone DMA32 if there's more than 4GB of ram. Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1611062003510.11253@eggly.anvils Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [4.8.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm, frontswap: make sure allocated frontswap map is assigned	Vlastimil Babka
	Christian Borntraeger reports: With commit 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") kmemleak complains about a memory leak in swapon unreferenced object 0x3e09ba56000 (size 32112640): comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: __vmalloc_node_range+0x194/0x2d8 vzalloc+0x58/0x68 SyS_swapon+0xd60/0x12f8 system_call+0xd6/0x270 Turns out kmemleak is right. We now allocate the frontswap map depending on the kernel config (and no longer on the enablement) swapfile.c: [...] if (IS_ENABLED(CONFIG_FRONTSWAP)) frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long)); but later on this is passed along --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map); and ignored if frontswap is disabled --> frontswap_init(p->type, frontswap_map); static inline void frontswap_init(unsigned type, unsigned long *map) { if (frontswap_enabled()) __frontswap_init(type, map); } Thing is, that frontswap map is never freed. The leakage is relatively not that bad, because swapon is an infrequent and privileged operation. However, if the first frontswap backend is registered after a swap type has been already enabled, it will WARN_ON in frontswap_register_ops() and frontswap will not be available for the swap type. Fix this by making sure the map is assigned by frontswap_init() as long as CONFIG_FRONTSWAP is enabled. Fixes: 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	mm: remove extra newline from allocation stall warning	Tetsuo Handa
	Commit 63f53dea0c98 ("mm: warn about allocations which stall for too long") by error embedded "\n" in the format string, resulting in strange output. [ 722.876655] kworker/0:1: page alloction stalls for 160001ms, order:0 [ 722.876656] , mode:0x2400000(GFP_NOIO) [ 722.876657] CPU: 0 PID: 6966 Comm: kworker/0:1 Not tainted 4.8.0+ #69 Link: http://lkml.kernel.org/r/1476026219-7974-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11	Merge tag 'kvm-arm-for-v4.9-rc4' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for v4.9-rc4 - Kick the vcpu when a pending interrupt becomes pending again - Prevent access to invalid interrupt registers - Invalid TLBs when two vcpus from the same VM share a CPU
2016-11-11	PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails	Brian Norris
	Consider two devices, A and B, where B is a child of A, and B utilizes asynchronous suspend (it does not matter whether A is sync or async). If B fails to suspend_noirq() or suspend_late(), or is interrupted by a wakeup (pm_wakeup_pending()), then it aborts and sets the async_error variable. However, device A does not (immediately) check the async_error variable; it may continue to run its own suspend_noirq()/suspend_late() callback. This is bad. We can resolve this problem by doing our error and wakeup checking (particularly, for the async_error flag) after waiting for children to suspend, instead of before. This also helps align the logic for the noirq and late suspend cases with the logic in __device_suspend(). It's easy to observe this erroneous behavior by, for example, forcing a device to sleep a bit in its suspend_noirq() (to ensure the parent is waiting for the child to complete), then return an error, and watch the parent suspend_noirq() still get called. (Or similarly, fake a wakeup event at the right (or is it wrong?) time.) Fixes: de377b397272 (PM / sleep: Asynchronous threads for suspend_late) Fixes: 28b6fd6e3779 (PM / sleep: Asynchronous threads for suspend_noirq) Reported-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-10	splice: remove detritus from generic_file_splice_read()	Al Viro
	i_size check is a leftover from the horrors that used to play with the page cache in that function. With the switch to ->read_iter(), it's neither needed nor correct - for gfs2 it ends up being buggy, since i_size is not guaranteed to be correct until later (inside ->read_iter()). Spotted-by: Abhi Das <adas@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-11-11	Merge tag 'imx-drm-fixes-2016-11-10' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-fixes imx-drm: fix possible hangup when disabling crtcs - only ever disable the display controller (DC) module after all plane IDMAC channels are stopped. This fixes a regression introduced by the atomic modeset conversion. * tag 'imx-drm-fixes-2016-11-10' of git://git.pengutronix.de/git/pza/linux: drm/imx: disable planes before DC
2016-11-11	Merge branch 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-fixes Regression fix for powerplay on some iceland boards. * 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux: drm/amd/powerplay: implement get_clock_by_type for iceland. drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2) drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk
2016-11-11	drm/udl: make control msg static const. (v2)	Dave Airlie
	Thou shall not send control msg from the stack, does that mean I can send it from the RO memory area? and it looks like the answer is no, so here's v2 which kmemdups. Reported-by: poma Tested-by: poma <poma@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-10	PCI: VMD: Update filename to reflect move	Keith Busch
	Updating MAINTAINERS to reflect the new location of the VMD driver. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-11-10	libceph: initialize last_linger_id with a large integer	Ilya Dryomov
	osdc->last_linger_id is a counter for lreq->linger_id, which is used for watch cookies. Starting with a large integer should ease the task of telling apart kernel and userspace clients. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-11-10	libceph: fix legacy layout decode with pool 0	Yan, Zheng
	If your data pool was pool 0, ceph_file_layout_from_legacy() transform that to -1 unconditionally, which broke upgrades. We only want do that for a fully zeroed ceph_file_layout, so that it still maps to a file_layout_t. If any fields are set, though, we trust the fl_pgpool to be a valid pool. Fixes: 7627151ea30bc ("libceph: define new ceph_file_layout structure") Link: http://tracker.ceph.com/issues/17825 Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-11-10	ceph: use default file splice read callback	Yan, Zheng
	Splice read/write implementation changed recently. When using generic_file_splice_read(), iov_iter with type == ITER_PIPE is passed to filesystem's read_iter callback. But ceph_sync_read() can't serve ITER_PIPE iov_iter correctly (ITER_PIPE iov_iter expects pages from page cache). Fixing ceph_sync_read() requires a big patch. So use default splice read callback for now. Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-11-10	drm/amd/powerplay: implement get_clock_by_type for iceland.	Rex Zhu
	iceland use pptable v0. bugs: https://bugzilla.kernel.org/show_bug.cgi?id=185681 https://bugs.freedesktop.org/show_bug.cgi?id=98357 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-10	Merge remote-tracking branch 'mkp-scsi/4.9/scsi-fixes' into fixes	James Bottomley

2016-11-10	arm64: dts: rockchip: add three new resets for rk3399 PCIe controller	Shawn Lin
	pm_rst, aclk_rst and pclk_rst should be controlled by driver, so we need to add these three resets for PCIe controller. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Heiko Stuebner <heiko@sntech.de>
2016-11-10	PCI: rockchip: Add three new resets as required properties	Shawn Lin
	pm_rst, aclk_rst, pclk_rst was controlled by ROM code so the software wasn't needed to control it again in theory. But it didn't work properly, so we do need to do it again and add enough delay between the assert of pm_rst and the deassert of pm_rst. The Soc intergrated with this controller, rk3399, is still under MP test internally, so the backward compatibility won't be a big deal. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Rob Herring <robh@kernel.org>
2016-11-10	drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)	Alex Deucher
	Only check if the tables exist in relevant configs. This fixes a failure on V0 tables. v2: fix version check as suggested by Rex bugs: https://bugzilla.kernel.org/show_bug.cgi?id=185681 https://bugs.freedesktop.org/show_bug.cgi?id=98357 Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-10	drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland	Alex Deucher
	Was missing the handling for iceland. bugs: https://bugzilla.kernel.org/show_bug.cgi?id=185681 https://bugs.freedesktop.org/show_bug.cgi?id=98357 Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-10	drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk	Alex Deucher
	Missing for one case. bugs: https://bugzilla.kernel.org/show_bug.cgi?id=185681 https://bugs.freedesktop.org/show_bug.cgi?id=98357 Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-11-10	xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect	Chuck Lever
	When a LOCALINV WR is flushed, the frmr is marked STALE, then frwr_op_unmap_sync DMA-unmaps the frmr's SGL. These STALE frmrs are then recovered when frwr_op_map hunts for an INVALID frmr to use. All other cases that need frmr recovery leave that SGL DMA-mapped. The FRMR recovery path unconditionally DMA-unmaps the frmr's SGL. To avoid DMA unmapping the SGL twice for flushed LOCAL_INV WRs, alter the recovery logic (rather than the hot frwr_op_unmap_sync path) to distinguish among these cases. This solution also takes care of the case where multiple LOCAL_INV WRs are issued for the same rpcrdma_req, some complete successfully, but some are flushed. Reported-by: Vasco Steinmetz <linux@kyberraum.net> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Vasco Steinmetz <linux@kyberraum.net> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-10	char/pcmcia: add scr24x_cs chip card interface driver	Lubomir Rintel
	This implements only the very basic protocol "Mode A", just to make the device functional. Patches to implement "Mode C" that uses better bulking and is interrupt-driver may follow. The device essentially speaks the same protocol as USB CCID devices do over the bulk endpoints. The driver exchanges the command submissions and responses over a plain read()/write() interface, compatible with legacy OpenCT's pcmcia_block driver. Patches for the newer CCID driver are available: https://github.com/lkundrak/CCID/tree/lr/pcmcia_block Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga-manager: Add Socfpga Arria10 support	Alan Tull
	Add low level driver to support reprogramming FPGAs for Altera SoCFPGA Arria10. Signed-off-by: Alan Tull <atull@opensource.altera.com> Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga: add altera freeze bridge support	Alan Tull
	Add a low level driver for Altera Freeze Bridges to the FPGA Bridge framework. A freeze bridge is a bridge that exists in the FPGA fabric to isolate one region of the FPGA from the busses while that one region is being reprogrammed. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Matthew Gerlach <mgerlach@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	ARM: socfpga: fpga bridge driver support	Alan Tull
	Supports Altera SOCFPGA bridges: * fpga2sdram * fpga2hps * hps2fpga * lwhps2fpga Allows enabling/disabling the bridges through the FPGA Bridge Framework API functions. The fpga2sdram driver only supports enabling and disabling of the ports that been configured early on. This is due to a hardware limitation where the read, write, and command ports on the fpga2sdram bridge can only be reconfigured while there are no transactions to the sdram, i.e. when running out of OCRAM before the kernel boots. Device tree property 'init-val' configures the driver to enable or disable the bridge during probe. If the property does not exist, the driver will leave the bridge in its current state. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Matthew Gerlach <mgerlach@altera.com> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga: fpga-region: device tree control for FPGA	Alan Tull
	FPGA Regions support programming FPGA under control of the Device Tree. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga: add fpga bridge framework	Alan Tull
	This framework adds API functions for enabling/ disabling FPGA bridges under kernel control. This allows the Linux kernel to disable FPGA bridges during FPGA reprogramming and to enable FPGA bridges when FPGA reprogramming is done. This framework is be manufacturer-agnostic, allowing it to be used in interfaces that use the FPGA Manager Framework to reprogram FPGA's. The functions are: * of_fpga_bridge_get * fpga_bridge_put Get/put an exclusive reference to a FPGA bridge. * fpga_bridge_enable * fpga_bridge_disable Enable/Disable traffic through a bridge. * fpga_bridge_register * fpga_bridge_unregister Register/unregister a device-specific low level FPGA Bridge driver. Get an exclusive reference to a bridge and add it to a list: * fpga_bridge_get_to_list To enable/disable/put a set of bridges that are on a list: * fpga_bridges_enable * fpga_bridges_disable * fpga_bridges_put Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	add sysfs document for fpga bridge class	Alan Tull
	Add documentation for new FPGA bridge class's sysfs interface. Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga-mgr: add fpga image information struct	Alan Tull
	This patch adds a minor change in the FPGA Manager API to hold information that is specific to an FPGA image file. This change is expected to bring little, if any, pain. The socfpga and zynq drivers are fixed up in this patch. An FPGA image file will have particulars that affect how the image is programmed to the FPGA. One example is that current 'flags' currently has one bit which shows whether the FPGA image was built for full reconfiguration or partial reconfiguration. Another example is timeout values for enabling or disabling the bridges in the FPGA. As the complexity of the FPGA design increases, the bridges in the FPGA may take longer times to enable or disable. This patch adds a new 'struct fpga_image_info', moves the current 'u32 flags' to it. Two other image-specific u32's are added for the bridge enable/disable timeouts. The FPGA Manager API functions are changed, replacing the 'u32 flag' parameter with a pointer to struct fpga_image_info. Subsequent patches fix the existing low level FPGA manager drivers. Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga: add bindings document for fpga region	Alan Tull
	New bindings document for FPGA Region to support programming FPGA's under Device Tree control Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Moritz Fischer <moritz.fischer@ettus.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	doc: fpga-mgr: add fpga image info to api	Alan Tull
	This patch adds a minor change in the FPGA Manager API to hold information that is specific to an FPGA image file. This change is expected to bring little, if any, pain. An FPGA image file will have particulars that affect how the image is programmed to the FPGA. One example is that current 'flags' currently has one bit which shows whether the FPGA image was built for full reconfiguration or partial reconfiguration. Another example is timeout values for enabling or disabling the bridges in the FPGA. As the complexity of the FPGA design increases, the bridges in the FPGA may take longer times to enable or disable. This patch documents the change in the FPGA Manager API functions, replacing the 'u32 flag' parameter with a pointer to struct fpga_image_info. Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	fpga: add method to get fpga manager from device	Alan Tull
	The intent is to provide a non-DT method of getting ahold of a FPGA manager to do some FPGA programming. This patch refactors of_fpga_mgr_get() to reuse most of it while adding a new method fpga_mgr_get() for getting a pointer to a fpga manager struct, given the device. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	of/overlay: add of overlay notifications	Alan Tull
	This patch add of overlay notifications. When DT overlays are being added, some drivers/subsystems need to see device tree overlays before the changes go into the live tree. This is distinct from reconfig notifiers that are post-apply or post-remove and which issue very granular notifications without providing access to the context of a whole overlay. The following 4 notificatons are issued: OF_OVERLAY_PRE_APPLY OF_OVERLAY_POST_APPLY OF_OVERLAY_PRE_REMOVE OF_OVERLAY_POST_REMOVE In the case of pre-apply notification, if the notifier returns error, the overlay will be rejected. This patch exports two functions for registering/unregistering notifications: of_overlay_notifier_register(struct notifier_block nb) of_overlay_notifier_unregister(struct notifier_block nb) The of_mutex is held during these notifications. The notification data includes pointers to the overlay target and the overlay: struct of_overlay_notify_data { struct device_node overlay; struct device_node target; }; Signed-off-by: Alan Tull <atull@opensource.altera.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Moritz Fischer <moritz.fischer@ettus.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	s390: char: make slp_ctl explicitly non-modular	Paul Gortmaker
	The Makefile currently controlling compilation of this code is obj-y, meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modular usage, so that when reading the driver there is no doubt it is builtin-only. Since module_misc_device translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	blackfin: make-bf561/coreb.c explicitly non-modular	Paul Gortmaker
	The Kconfig currently controlling compilation of this code is: config BF561_COREB bool "Enable Core B loader" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_misc_device translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. Cc: Bas Vermeulen <bvermeul@blackstar.xs4all.nl> Cc: Steven Miao <realmz6@gmail.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	lightnvm: make core.c explicitly non-modular	Paul Gortmaker
	The Kconfig currently controlling compilation of this code is: drivers/lightnvm/Kconfig:menuconfig NVM drivers/lightnvm/Kconfig: bool "Open-Channel SSD target support" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_misc_driver translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. We replace module.h with moduleparam.h because this file still uses module params to control behaviour. Also note that MODULE_ALIAS is a no-op for non-modular code. Cc: Matias Bjorling <mb@lightnvm.io> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10	MAINTAINERS: auxdisplay: Added myself as maintainer for ht16k33 driver	Robin van der Gracht
	Signed-off-by: Robin van der Gracht <robin@protonic.nl> CC: Miguel Ojeda Sandonis <miguel.ojeda.sandonis@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>