summaryrefslogtreecommitdiff
path: root/mm
AgeCommit message (Collapse)Author
2018-10-21page cache: Finish XArray conversionMatthew Wilcox
With no more radix tree API users left, we can drop the GFP flags and use xa_init() instead of INIT_RADIX_TREE(). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Comment fixupsMatthew Wilcox
Remove the last mentions of radix tree from various comments. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21memfd: Convert memfd_tag_pins to XArrayMatthew Wilcox
Switch to a batch-processing model like memfd_wait_for_pins() and use the xa_state previously set up by memfd_wait_for_pins(). Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
2018-10-21memfd: Convert memfd_wait_for_pins to XArrayMatthew Wilcox
Simplify the locking by taking the spinlock while we walk the tree on the assumption that many acquires and releases of the lock will be worse than holding the lock while we process an entire batch of pages. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
2018-10-21shmem: Convert shmem_partial_swap_usage to XArrayMatthew Wilcox
Simpler code because the xarray takes care of things like the limit and dereferencing the slot. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert shmem_free_swap to XArrayMatthew Wilcox
Since we are conditionally storing NULL in the XArray, we do not need to allocate memory and the GFP flags will be unused. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert shmem_alloc_hugepage to XArrayMatthew Wilcox
xa_find() is a slightly easier API to use than radix_tree_gang_lookup_slot() because it contains its own RCU locking. This commit removes the last user of radix_tree_gang_lookup_slot() so remove the function too. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert shmem_add_to_page_cache to XArrayMatthew Wilcox
We can use xas_find_conflict() instead of radix_tree_gang_lookup_slot() to find any conflicting entry and combine the three paths through this function into one. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert find_swap_entry to XArrayMatthew Wilcox
This is a 1:1 conversion. The major part of this patch is converting the test framework from userspace to kernel space and mirroring the algorithm now used in find_swap_entry(). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert shmem_confirm_swap to XArrayMatthew Wilcox
xa_load has its own RCU locking, so we can eliminate it here. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21shmem: Convert shmem_radix_tree_replace to XArrayMatthew Wilcox
Rename shmem_radix_tree_replace() to shmem_replace_entry() and convert it to use the XArray API. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21pagevec: Use xa_mark_tMatthew Wilcox
Removes sparse warnings. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert is_page_cache_freeable to XArrayMatthew Wilcox
This is just a variable rename and comment change. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert khugepaged_scan_shmem to XArrayMatthew Wilcox
Slightly shorter and easier to read code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert collapse_shmem to XArrayMatthew Wilcox
I found another victim of the radix tree being hard to use. Because there was no call to radix_tree_preload(), khugepaged was allocating radix_tree_nodes using GFP_ATOMIC. I also converted a local_irq_save()/restore() pair to disable()/enable(). Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert huge_memory to XArrayMatthew Wilcox
Quite a straightforward conversion. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert page migration to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert __do_page_cache_readahead to XArrayMatthew Wilcox
This one is trivial. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert delete_from_swap_cache to XArrayMatthew Wilcox
Both callers of __delete_from_swap_cache have the swp_entry_t already, so pass that in to make constructing the XA_STATE easier. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert add_to_swap_cache to XArrayMatthew Wilcox
Combine __add_to_swap_cache and add_to_swap_cache into one function since there is no more need to preload. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert truncate to XArrayMatthew Wilcox
This is essentially xa_cmpxchg() with the locking handled above us, and it doesn't have to handle replacing a NULL entry. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert workingset to XArrayMatthew Wilcox
We construct an XA_STATE and use it to delete the node with xas_store() rather than adding a special function for this unique use case. Includes a test that simulates this usage for the test suite. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21mm: Convert page-writeback to XArrayMatthew Wilcox
Includes moving mapping_tagged() to fs.h as a static inline, and changing it to return bool. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert filemap_range_has_page to XArrayMatthew Wilcox
Instead of calling find_get_pages_range() and putting any reference, use xas_find() to iterate over any entries in the range, skipping the shadow/swap entries. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Remove stray radix commentMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert delete_batch to XArrayMatthew Wilcox
Rename the function from page_cache_tree_delete_batch to just page_cache_delete_batch. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert filemap_map_pages to XArrayMatthew Wilcox
Slight change of strategy here; if we have trouble getting hold of a page for whatever reason (eg a compound page is split underneath us), don't spin to stabilise the page, just continue the iteration, like we would if we failed to trylock the page. Since this is a speculative optimisation, it feels like we should allow the process to take an extra fault if it turns out to need this page instead of spending time to pin down a page it may not need. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_entries_tag to XArrayMatthew Wilcox
Slightly shorter and simpler code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache; Convert find_get_pages_range_tag to XArrayMatthew Wilcox
The 'end' parameter of the xas_for_each iterator avoids a useless iteration at the end of the range. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_pages_contig to XArrayMatthew Wilcox
There's no direct replacement for radix_tree_for_each_contig() in the XArray API as it's an unusual thing to do. Instead, open-code a loop using xas_next(). This removes the only user of radix_tree_for_each_contig() so delete the iterator from the API and the test suite code for it. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_pages_range to XArrayMatthew Wilcox
The 'end' parameter of the xas_for_each iterator avoids a useless iteration at the end of the range. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_entries to XArrayMatthew Wilcox
Slightly shorter and simpler code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert find_get_entry to XArrayMatthew Wilcox
Slightly shorter and simpler code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert page deletion to XArrayMatthew Wilcox
The code is slightly shorter and simpler. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Add and replace pages using the XArrayMatthew Wilcox
Use the XArray APIs to add and replace pages in the page cache. This removes two uses of the radix tree preload API and is significantly shorter code. It also removes the last user of __radix_tree_create() outside radix-tree.c itself, so make it static. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21page cache: Convert hole search to XArrayMatthew Wilcox
The page cache offers the ability to search for a miss in the previous or next N locations. Rather than teach the XArray about the page cache's definition of a miss, use xas_prev() and xas_next() to search the page array. This should be more efficient as it does not have to start the lookup from the top for each index. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21xarray: Define struct xa_nodeMatthew Wilcox
This is a direct replacement for struct radix_tree_node. A couple of struct members have changed name, so convert those. Use a #define so that radix tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-18mremap: properly flush TLB before releasing the pageLinus Torvalds
Jann Horn points out that our TLB flushing was subtly wrong for the mremap() case. What makes mremap() special is that we don't follow the usual "add page to list of pages to be freed, then flush tlb, and then free pages". No, mremap() obviously just _moves_ the page from one page table location to another. That matters, because mremap() thus doesn't directly control the lifetime of the moved page with a freelist: instead, the lifetime of the page is controlled by the page table locking, that serializes access to the entry. As a result, we need to flush the TLB not just before releasing the lock for the source location (to avoid any concurrent accesses to the entry), but also before we release the destination page table lock (to avoid the TLB being flushed after somebody else has already done something to that page). This also makes the whole "need_flush" logic unnecessary, since we now always end up flushing the TLB for every valid entry. Reported-and-tested-by: Jann Horn <jannh@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-16mm: move is_kernel_rodata() to asm-generic/sections.hBartosz Golaszewski
Export this routine so that we can use it later in devm_kstrdup_const() and devm_kfree(). Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13mm/thp: fix call to mmu_notifier in set_pmd_migration_entry() v2Jérôme Glisse
Inside set_pmd_migration_entry() we are holding page table locks and thus we can not sleep so we can not call invalidate_range_start/end() So remove call to mmu_notifier_invalidate_range_start/end() because they are call inside the function calling set_pmd_migration_entry() (see try_to_unmap_one()). Link: http://lkml.kernel.org/r/20181012181056.7864-1-jglisse@redhat.com Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13mm/mmap.c: don't clobber partially overlapping VMA with MAP_FIXED_NOREPLACEJann Horn
Daniel Micay reports that attempting to use MAP_FIXED_NOREPLACE in an application causes that application to randomly crash. The existing check for handling MAP_FIXED_NOREPLACE looks up the first VMA that either overlaps or follows the requested region, and then bails out if that VMA overlaps *the start* of the requested region. It does not bail out if the VMA only overlaps another part of the requested region. Fix it by checking that the found VMA only starts at or after the end of the requested region, in which case there is no overlap. Test case: user@debian:~$ cat mmap_fixed_simple.c #include <sys/mman.h> #include <errno.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #ifndef MAP_FIXED_NOREPLACE #define MAP_FIXED_NOREPLACE 0x100000 #endif int main(void) { char *p; errno = 0; p = mmap((void*)0x10001000, 0x4000, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED_NOREPLACE, -1, 0); printf("p1=%p err=%m\n", p); errno = 0; p = mmap((void*)0x10000000, 0x2000, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED_NOREPLACE, -1, 0); printf("p2=%p err=%m\n", p); char cmd[100]; sprintf(cmd, "cat /proc/%d/maps", getpid()); system(cmd); return 0; } user@debian:~$ gcc -o mmap_fixed_simple mmap_fixed_simple.c user@debian:~$ ./mmap_fixed_simple p1=0x10001000 err=Success p2=0x10000000 err=Success 10000000-10002000 r--p 00000000 00:00 0 10002000-10005000 ---p 00000000 00:00 0 564a9a06f000-564a9a070000 r-xp 00000000 fe:01 264004 /home/user/mmap_fixed_simple 564a9a26f000-564a9a270000 r--p 00000000 fe:01 264004 /home/user/mmap_fixed_simple 564a9a270000-564a9a271000 rw-p 00001000 fe:01 264004 /home/user/mmap_fixed_simple 564a9a54a000-564a9a56b000 rw-p 00000000 00:00 0 [heap] 7f8eba447000-7f8eba5dc000 r-xp 00000000 fe:01 405885 /lib/x86_64-linux-gnu/libc-2.24.so 7f8eba5dc000-7f8eba7dc000 ---p 00195000 fe:01 405885 /lib/x86_64-linux-gnu/libc-2.24.so 7f8eba7dc000-7f8eba7e0000 r--p 00195000 fe:01 405885 /lib/x86_64-linux-gnu/libc-2.24.so 7f8eba7e0000-7f8eba7e2000 rw-p 00199000 fe:01 405885 /lib/x86_64-linux-gnu/libc-2.24.so 7f8eba7e2000-7f8eba7e6000 rw-p 00000000 00:00 0 7f8eba7e6000-7f8eba809000 r-xp 00000000 fe:01 405876 /lib/x86_64-linux-gnu/ld-2.24.so 7f8eba9e9000-7f8eba9eb000 rw-p 00000000 00:00 0 7f8ebaa06000-7f8ebaa09000 rw-p 00000000 00:00 0 7f8ebaa09000-7f8ebaa0a000 r--p 00023000 fe:01 405876 /lib/x86_64-linux-gnu/ld-2.24.so 7f8ebaa0a000-7f8ebaa0b000 rw-p 00024000 fe:01 405876 /lib/x86_64-linux-gnu/ld-2.24.so 7f8ebaa0b000-7f8ebaa0c000 rw-p 00000000 00:00 0 7ffcc99fa000-7ffcc9a1b000 rw-p 00000000 00:00 0 [stack] 7ffcc9b44000-7ffcc9b47000 r--p 00000000 00:00 0 [vvar] 7ffcc9b47000-7ffcc9b49000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] user@debian:~$ uname -a Linux debian 4.19.0-rc6+ #181 SMP Wed Oct 3 23:43:42 CEST 2018 x86_64 GNU/Linux user@debian:~$ As you can see, the first page of the mapping at 0x10001000 was clobbered. Link: http://lkml.kernel.org/r/20181010152736.99475-1-jannh@google.com Fixes: a4ff8e8620d3 ("mm: introduce MAP_FIXED_NOREPLACE") Signed-off-by: Jann Horn <jannh@google.com> Reported-by: Daniel Micay <danielmicay@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11Merge branch 'sched-urgent-for-linus' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Ingo writes: "scheduler fix: Cleanup of dead code left over from the recent sched/numa fixes." * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm, sched/numa: Remove remaining traces of NUMA rate-limiting
2018-10-09x86/mm: Page size aware flush_tlb_mm_range()Peter Zijlstra
Use the new tlb_get_unmap_shift() to determine the stride of the INVLPG loop. Cc: Nick Piggin <npiggin@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2018-10-09Merge branch 'tlb/asm-generic' of ↵Peter Zijlstra
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into x86/mm Pull in the generic mmu_gather changes from the ARM64 tree such that we can put x86 specific things on top as well.
2018-10-09mm, sched/numa: Remove remaining traces of NUMA rate-limitingSrikar Dronamraju
Remove the leftover pglist_data::numabalancing_migrate_lock and its initialization, we stopped using this lock with: efaffc5e40ae ("mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration") [ mingo: Rewrote the changelog. ] Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linux-MM <linux-mm@kvack.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1538824999-31230-1-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-07percpu: stop leaking bitmap metadata blocksMike Rapoport
The commit ca460b3c9627 ("percpu: introduce bitmap metadata blocks") introduced bitmap metadata blocks. These metadata blocks are allocated whenever a new chunk is created, but they are never freed. Fix it. Fixes: ca460b3c9627 ("percpu: introduce bitmap metadata blocks") Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Dennis Zhou <dennis@kernel.org>
2018-10-05Merge branch 'akpm'Greg Kroah-Hartman
* akpm: mm: madvise(MADV_DODUMP): allow hugetlbfs pages ocfs2: fix locking for res->tracking and dlm->tracking_list mm/vmscan.c: fix int overflow in callers of do_shrink_slab() mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly mm/vmstat.c: fix outdated vmstat_text proc: restrict kernel stack dumps to root mm/hugetlb: add mmap() encodings for 32MB and 512MB page sizes mm/migrate.c: split only transparent huge pages when allocation fails ipc/shm.c: use ERR_CAST() for shm_lock() error return mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl mm, thp: fix mlocking THP page with migration enabled ocfs2: fix crash in ocfs2_duplicate_clusters_by_page() hugetlb: take PMD sharing into account when flushing tlb/caches mm: migration: fix migration of huge PMD shared pages
2018-10-05mm: madvise(MADV_DODUMP): allow hugetlbfs pagesDaniel Black
Reproducer, assuming 2M of hugetlbfs available: Hugetlbfs mounted, size=2M and option user=testuser # mount | grep ^hugetlbfs hugetlbfs on /dev/hugepages type hugetlbfs (rw,pagesize=2M,user=dan) # sysctl vm.nr_hugepages=1 vm.nr_hugepages = 1 # grep Huge /proc/meminfo AnonHugePages: 0 kB ShmemHugePages: 0 kB HugePages_Total: 1 HugePages_Free: 1 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB Hugetlb: 2048 kB Code: #include <sys/mman.h> #include <stddef.h> #define SIZE 2*1024*1024 int main() { void *ptr; ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_HUGETLB | MAP_ANONYMOUS, -1, 0); madvise(ptr, SIZE, MADV_DONTDUMP); madvise(ptr, SIZE, MADV_DODUMP); } Compile and strace: mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7ff7c9200000 madvise(0x7ff7c9200000, 2097152, MADV_DONTDUMP) = 0 madvise(0x7ff7c9200000, 2097152, MADV_DODUMP) = -1 EINVAL (Invalid argument) hugetlbfs pages have VM_DONTEXPAND in the VmFlags driver pages based on author testing with analysis from Florian Weimer[1]. The inclusion of VM_DONTEXPAND into the VM_SPECIAL defination was a consequence of the large useage of VM_DONTEXPAND in device drivers. A consequence of [2] is that VM_DONTEXPAND marked pages are unable to be marked DODUMP. A user could quite legitimately madvise(MADV_DONTDUMP) their hugetlbfs memory for a while and later request that madvise(MADV_DODUMP) on the same memory. We correct this omission by allowing madvice(MADV_DODUMP) on hugetlbfs pages. [1] https://stackoverflow.com/questions/52548260/madvisedodump-on-the-same-ptr-size-as-a-successful-madvisedontdump-fails-wit [2] commit 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers") Link: http://lkml.kernel.org/r/20180930054629.29150-1-daniel@linux.ibm.com Link: https://lists.launchpad.net/maria-discuss/msg05245.html Fixes: 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers") Reported-by: Kenneth Penza <kpenza@gmail.com> Signed-off-by: Daniel Black <daniel@linux.ibm.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-05mm/vmscan.c: fix int overflow in callers of do_shrink_slab()Kirill Tkhai
do_shrink_slab() returns unsigned long value, and the placing into int variable cuts high bytes off. Then we compare ret and 0xfffffffe (since SHRINK_EMPTY is converted to ret type). Thus a large number of objects returned by do_shrink_slab() may be interpreted as SHRINK_EMPTY, if low bytes of their value are equal to 0xfffffffe. Fix that by declaration ret as unsigned long in these functions. Link: http://lkml.kernel.org/r/153813407177.17544.14888305435570723973.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-05mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properlyJann Horn
5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside the kernel unconditional to reduce #ifdef soup, but (either to avoid showing dummy zero counters to userspace, or because that code was missed) didn't update the vmstat_array, meaning that all following counters would be shown with incorrect values. This only affects kernel builds with CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n. Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com Fixes: 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Kemi Wang <kemi.wang@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>