summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-12-12mm: prevent double decrease of nr_reserved_highatomicMinchan Kim
There is race between page freeing and unreserved highatomic. CPU 0 CPU 1 free_hot_cold_page mt = get_pfnblock_migratetype set_pcppage_migratetype(page, mt) unreserve_highatomic_pageblock spin_lock_irqsave(&zone->lock) move_freepages_block set_pageblock_migratetype(page) spin_unlock_irqrestore(&zone->lock) free_pcppages_bulk __free_one_page(mt) <- mt is stale By above race, a page on CPU 0 could go non-highorderatomic free list since the pageblock's type is changed. By that, unreserve logic of highorderatomic can decrease reserved count on a same pageblock severak times and then it will make mismatch between nr_reserved_highatomic and the number of reserved pageblock. So, this patch verifies whether the pageblock is highatomic or not and decrease the count only if the pageblock is highatomic. Link: http://lkml.kernel.org/r/1476259429-18279-3-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sangseok Lee <sangseok.lee@lge.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm: don't steal highatomic pageblockMinchan Kim
Patch series "use up highorder free pages before OOM", v3. I got OOM report from production team with v4.4 kernel. It had enough free memory but failed to allocate GFP_KERNEL order-0 page and finally encountered OOM kill. It occured during QA process which launches several apps, switching and so on. It happned rarely. IOW, In normal situation, it was not a problem but if we are unluck so that several apps uses peak memory at the same time, it can happen. If we manage to pass the phase, the system can go working well. I could reproduce it with my test(memory spike easily. Look at below. The reason is free pages(19M) of DMA32 zone are reserved for HIGHORDERATOMIC and doesn't unreserved before the OOM. balloon invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0 balloon cpuset=/ mems_allowed=0 CPU: 1 PID: 8473 Comm: balloon Tainted: G W OE 4.8.0-rc7-00219-g3f74c9559583-dirty #3161 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x63/0x90 dump_header+0x5c/0x1ce oom_kill_process+0x22e/0x400 out_of_memory+0x1ac/0x210 __alloc_pages_nodemask+0x101e/0x1040 handle_mm_fault+0xa0a/0xbf0 __do_page_fault+0x1dd/0x4d0 trace_do_page_fault+0x43/0x130 do_async_page_fault+0x1a/0xa0 async_page_fault+0x28/0x30 Mem-Info: active_anon:383949 inactive_anon:106724 isolated_anon:0 active_file:15 inactive_file:44 isolated_file:0 unevictable:0 dirty:0 writeback:24 unstable:0 slab_reclaimable:2483 slab_unreclaimable:3326 mapped:0 shmem:0 pagetables:1906 bounce:0 free:6898 free_pcp:291 free_cma:0 Node 0 active_anon:1535796kB inactive_anon:426896kB active_file:60kB inactive_file:176kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:96kB shmem:0kB writeback_tmp:0kB unstable:0kB pages_scanned:1418 all_unreclaimable? no DMA free:8188kB min:44kB low:56kB high:68kB active_anon:7648kB inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:20kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 1952 1952 1952 DMA32 free:19404kB min:5628kB low:7624kB high:9620kB active_anon:1528148kB inactive_anon:426896kB active_file:60kB inactive_file:420kB unevictable:0kB writepending:96kB present:2080640kB managed:2030092kB mlocked:0kB slab_reclaimable:9932kB slab_unreclaimable:13284kB kernel_stack:2496kB pagetables:7624kB bounce:0kB free_pcp:900kB local_pcp:112kB free_cma:0kB lowmem_reserve[]: 0 0 0 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB (H) = 8192kB DMA32: 7*4kB (H) 8*8kB (H) 30*16kB (H) 31*32kB (H) 14*64kB (H) 9*128kB (H) 2*256kB (H) 2*512kB (H) 4*1024kB (H) 5*2048kB (H) 0*4096kB = 19484kB 51131 total pagecache pages 50795 pages in swap cache Swap cache stats: add 3532405601, delete 3532354806, find 124289150/1822712228 Free swap = 8kB Total swap = 255996kB 524158 pages RAM 0 pages HighMem/MovableOnly 12658 pages reserved 0 pages cma reserved 0 pages hwpoisoned Another example exceeded the limit by the race is in:imklog: page allocation failure: order:0, mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK) CPU: 0 PID: 476 Comm: in:imklog Tainted: G E 4.8.0-rc7-00217-g266ef83c51e5-dirty #3135 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x63/0x90 warn_alloc_failed+0xdb/0x130 __alloc_pages_nodemask+0x4d6/0xdb0 new_slab+0x339/0x490 ___slab_alloc.constprop.74+0x367/0x480 __slab_alloc.constprop.73+0x20/0x40 __kmalloc+0x1a4/0x1e0 alloc_indirect.isra.14+0x1d/0x50 virtqueue_add_sgs+0x1c4/0x470 __virtblk_add_req+0xae/0x1f0 virtio_queue_rq+0x12d/0x290 __blk_mq_run_hw_queue+0x239/0x370 blk_mq_run_hw_queue+0x8f/0xb0 blk_mq_insert_requests+0x18c/0x1a0 blk_mq_flush_plug_list+0x125/0x140 blk_flush_plug_list+0xc7/0x220 blk_finish_plug+0x2c/0x40 __do_page_cache_readahead+0x196/0x230 filemap_fault+0x448/0x4f0 ext4_filemap_fault+0x36/0x50 __do_fault+0x75/0x140 handle_mm_fault+0x84d/0xbe0 __do_page_fault+0x1dd/0x4d0 trace_do_page_fault+0x43/0x130 do_async_page_fault+0x1a/0xa0 async_page_fault+0x28/0x30 Mem-Info: active_anon:363826 inactive_anon:121283 isolated_anon:32 active_file:65 inactive_file:152 isolated_file:0 unevictable:0 dirty:0 writeback:46 unstable:0 slab_reclaimable:2778 slab_unreclaimable:3070 mapped:112 shmem:0 pagetables:1822 bounce:0 free:9469 free_pcp:231 free_cma:0 Node 0 active_anon:1455304kB inactive_anon:485132kB active_file:260kB inactive_file:608kB unevictable:0kB isolated(anon):128kB isolated(file):0kB mapped:448kB dirty:0kB writeback:184kB shmem:0kB writeback_tmp:0kB unstable:0kB pages_scanned:13641 all_unreclaimable? no DMA free:7748kB min:44kB low:56kB high:68kB active_anon:7944kB inactive_anon:104kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:108kB kernel_stack:0kB pagetables:4kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 1952 1952 1952 DMA32 free:30128kB min:5628kB low:7624kB high:9620kB active_anon:1447360kB inactive_anon:485028kB active_file:260kB inactive_file:608kB unevictable:0kB writepending:184kB present:2080640kB managed:2030132kB mlocked:0kB slab_reclaimable:11112kB slab_unreclaimable:12172kB kernel_stack:2400kB pagetables:7284kB bounce:0kB free_pcp:924kB local_pcp:72kB free_cma:0kB lowmem_reserve[]: 0 0 0 0 DMA: 7*4kB (UE) 3*8kB (UH) 1*16kB (M) 0*32kB 2*64kB (U) 1*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (U) 1*4096kB (H) = 7748kB DMA32: 10*4kB (H) 3*8kB (H) 47*16kB (H) 38*32kB (H) 5*64kB (H) 1*128kB (H) 2*256kB (H) 3*512kB (H) 3*1024kB (H) 3*2048kB (H) 4*4096kB (H) = 30128kB 2775 total pagecache pages 2536 pages in swap cache Swap cache stats: add 206786828, delete 206784292, find 7323106/106686077 Free swap = 108744kB Total swap = 255996kB 524158 pages RAM 0 pages HighMem/MovableOnly 12648 pages reserved 0 pages cma reserved 0 pages hwpoisoned During the investigation, I found some problems with highatomic so this patch aims to solve the problems and the final goal is to unreserve every highatomic free pages before the OOM kill. This patch (of 4): In page freeing path, migratetype is racy so that a highorderatomic page could free into non-highorderatomic free list. If that page is allocated, VM can change the pageblock from higorderatomic to something. In that case, highatomic pageblock accounting is broken so it doesn't work(e.g., VM cannot reserve highorderatomic pageblocks any more although it doesn't reach 1% limit). So, this patch prohibits the changing from highatomic to other type. It's no problem because MIGRATE_HIGHATOMIC is not listed in fallback array so stealing will only happen due to unexpected races which is really rare. Also, such prohibiting keeps highatomic pageblock more longer so it would be better for highorderatomic page allocation. Link: http://lkml.kernel.org/r/1476259429-18279-2-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Sangseok Lee <sangseok.lee@lge.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12kmemleak: fix reference to DocumentationAndreas Platschek
Documentation/kmemleak.txt was moved to Documentation/dev-tools/kmemleak.rst, this fixes the reference to the new location. Link: http://lkml.kernel.org/r/1476544946-18804-1-git-send-email-andreas.platschek@opentech.at Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/hugetlb.c: use huge_pte_lock instead of opencoding the lockAneesh Kumar K.V
No functional change by this patch. Link: http://lkml.kernel.org/r/20161018090234.22574-1-aneesh.kumar@linux.vnet.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/hugetlb.c: use the right pte val for compare in hugetlb_cowAneesh Kumar K.V
We cannot use the pte value used in set_pte_at for pte_same comparison, because archs like ppc64, filter/add new pte flag in set_pte_at. Instead fetch the pte value inside hugetlb_cow. We are comparing pte value to make sure the pte didn't change since we dropped the page table lock. hugetlb_cow get called with page table lock held, and we can take a copy of the pte value before we drop the page table lock. With hugetlbfs, we optimize the MAP_PRIVATE write fault path with no previous mapping (huge_pte_none entries), by forcing a cow in the fault path. This avoid take an addition fault to covert a read-only mapping to read/write. Here we were comparing a recently instantiated pte (via set_pte_at) to the pte values from linux page table. As explained above on ppc64 such pte_same check returned wrong result, resulting in us taking an additional fault on ppc64. Fixes: 6a119eae942c ("powerpc/mm: Add a _PAGE_PTE bit") Link: http://lkml.kernel.org/r/20161018154245.18023-1-aneesh.kumar@linux.vnet.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reported-by: Jan Stancek <jstancek@redhat.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/gup.c: make unnecessarily global vma_permits_fault() staticTobias Klauser
Make vma_permits_fault() static as it is only used in mm/gup.c This fixes a sparse warning. Link: http://lkml.kernel.org/r/20161017122353.31598-1-tklauser@distanz.ch Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/vmscan.c: set correct defer count for shrinkerShaohua Li
Our system uses significantly more slab memory with memcg enabled with the latest kernel. With 3.10 kernel, slab uses 2G memory, while with 4.6 kernel, 6G memory is used. The shrinker has problem. Let's see we have two memcg for one shrinker. In do_shrink_slab: 1. Check cg1. nr_deferred = 0, assume total_scan = 700. batch size is 1024, then no memory is freed. nr_deferred = 700 2. Check cg2. nr_deferred = 700. Assume freeable = 20, then total_scan = 10 or 40. Let's assume it's 10. No memory is freed. nr_deferred = 10. The deferred share of cg1 is lost in this case. kswapd will free no memory even run above steps again and again. The fix makes sure one memcg's deferred share isn't lost. Link: http://lkml.kernel.org/r/2414be961b5d25892060315fbb56bb19d81d0c07.1476227351.git.shli@fb.com Signed-off-by: Shaohua Li <shli@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/mprotect.c: don't touch single threaded PTEs which are on the right nodeAndi Kleen
We had some problems with pages getting unmapped in single threaded affinitized processes. It was tracked down to NUMA scanning. In this case it doesn't make any sense to unmap pages if the process is single threaded and the page is already on the node the process is running on. Add a check for this case into the numa protection code, and skip unmapping if true. In theory the process could be migrated later, but we will eventually rescan and unmap and migrate then. In theory this could be made more fancy: remembering this state per process or even whole mm. However that would need extra tracking and be more complicated, and the simple check seems to work fine so far. [ak@linux.intel.com: v3: Minor updates from Mel. Change code layout] Link: http://lkml.kernel.org/r/1476382117-5440-1-git-send-email-andi@firstfloor.org Link: http://lkml.kernel.org/r/1476288949-20970-1-git-send-email-andi@firstfloor.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm, slab: maintain total slab count instead of active countDavid Rientjes
Rather than tracking the number of active slabs for each node, track the total number of slabs. This is a minor improvement that avoids active slab tracking when a slab goes from free to partial or partial to free. For slab debugging, this also removes an explicit free count since it can easily be inferred by the difference in number of total objects and number of active objects. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1612042020110.115755@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Thelen <gthelen@google.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm, slab: faster active and free statsGreg Thelen
Reading /proc/slabinfo or monitoring slabtop(1) can become very expensive if there are many slab caches and if there are very lengthy per-node partial and/or free lists. Commit 07a63c41fa1f ("mm/slab: improve performance of gathering slabinfo stats") addressed the per-node full lists which showed a significant improvement when no objects were freed. This patch has the same motivation and optimizes the remainder of the usecases where there are very lengthy partial and free lists. This patch maintains per-node active_slabs (full and partial) and free_slabs rather than iterating the lists at runtime when reading /proc/slabinfo. When allocating 100GB of slab from a test cache where every slab page is on the partial list, reading /proc/slabinfo (includes all other slab caches on the system) takes ~247ms on average with 48 samples. As a result of this patch, the same read takes ~0.856ms on average. [rientjes@google.com: changelog] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1611081505240.13403@chino.kir.corp.google.com Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm/slab_common.c: check kmem_create_cache flags are commonThomas Garnier
Verify that kmem_create_cache flags are not allocator specific. It is done before removing flags that are not available with the current configuration. The current kmem_cache_create removes incorrect flags but do not validate the callers are using them right. This change will ensure that callers are not trying to create caches with flags that won't be used because allocator specific. Link: http://lkml.kernel.org/r/1478553075-120242-2-git-send-email-thgarnie@google.com Signed-off-by: Thomas Garnier <thgarnie@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12slub: avoid false-postive warningArnd Bergmann
The slub allocator gives us some incorrect warnings when CONFIG_PROFILE_ANNOTATED_BRANCHES is set, as the unlikely() macro prevents it from seeing that the return code matches what it was before: mm/slub.c: In function `kmem_cache_free_bulk': mm/slub.c:262:23: error: `df.s' may be used uninitialized in this function [-Werror=maybe-uninitialized] mm/slub.c:2943:3: error: `df.cnt' may be used uninitialized in this function [-Werror=maybe-uninitialized] mm/slub.c:2933:4470: error: `df.freelist' may be used uninitialized in this function [-Werror=maybe-uninitialized] mm/slub.c:2943:3: error: `df.tail' may be used uninitialized in this function [-Werror=maybe-uninitialized] I have not been able to come up with a perfect way for dealing with this, the three options I see are: - add a bogus initialization, which would increase the runtime overhead - replace unlikely() with unlikely_notrace() - remove the unlikely() annotation completely I checked the object code for a typical x86 configuration and the last two cases produce the same result, so I went for the last one, which is the simplest. Link: http://lkml.kernel.org/r/20161024155704.3114445-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12slub: move synchronize_sched out of slab_mutex on shrinkVladimir Davydov
synchronize_sched() is a heavy operation and calling it per each cache owned by a memory cgroup being destroyed may take quite some time. What is worse, it's currently called under the slab_mutex, stalling all works doing cache creation/destruction. Actually, there isn't much point in calling synchronize_sched() for each cache - it's enough to call it just once - after setting cpu_partial for all caches and before shrinking them. This way, we can also move it out of the slab_mutex, which we have to hold for iterating over the slab cache list. Link: https://bugzilla.kernel.org/show_bug.cgi?id=172991 Link: http://lkml.kernel.org/r/0a10d71ecae3db00fb4421bcd3f82bcc911f4be4.1475329751.git.vdavydov.dev@gmail.com Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com> Reported-by: Doug Smythies <dsmythies@telus.net> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm: memcontrol: use special workqueue for creating per-memcg cachesVladimir Davydov
Creating a lot of cgroups at the same time might stall all worker threads with kmem cache creation works, because kmem cache creation is done with the slab_mutex held. The problem was amplified by commits 801faf0db894 ("mm/slab: lockless decision to grow cache") in case of SLAB and 81ae6d03952c ("mm/slub.c: replace kick_all_cpus_sync() with synchronize_sched() in kmem_cache_shrink()") in case of SLUB, which increased the maximal time the slab_mutex can be held. To prevent that from happening, let's use a special ordered single threaded workqueue for kmem cache creation. This shouldn't introduce any functional changes regarding how kmem caches are created, as the work function holds the global slab_mutex during its whole runtime anyway, making it impossible to run more than one work at a time. By using a single threaded workqueue, we just avoid creating a thread per each work. Ordering is required to avoid a situation when a cgroup's work is put off indefinitely because there are other cgroups to serve, in other words to guarantee fairness. Link: https://bugzilla.kernel.org/show_bug.cgi?id=172981 Link: http://lkml.kernel.org/r/20161004131417.GC1862@esperanza Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com> Reported-by: Doug Smythies <dsmythies@telus.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: replace CURRENT_TIME macroDeepa Dinamani
CURRENT_TIME is not y2038 safe. Use y2038 safe ktime_get_real_seconds() here for timestamps. struct heartbeat_block's hb_seq and deletetion time are already 64 bits wide and accommodate times beyond y2038. Also use y2038 safe ktime_get_real_ts64() for on disk inode timestamps. These are also wide enough to accommodate time64_t. Link: http://lkml.kernel.org/r/1475365298-29236-1-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: use time64_t to represent orphan scan timesDeepa Dinamani
struct timespec is not y2038 safe. Use time64_t which is y2038 safe to represent orphan scan times. time64_t is sufficient here as only the seconds delta times are relevant. Also use appropriate time functions that return time in time64_t format. Time functions now return monotonic time instead of real time as only delta scan times are relevant and these values are not persistent across reboots. The format string for the debug print is still using long as this is only the time elapsed since the last scan and long is sufficient to represent this value. Link: http://lkml.kernel.org/r/1475365138-20567-1-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: fix double put of recount tree in ocfs2_lock_refcount_tree()Ashish Samant
In ocfs2_lock_refcount_tree, if ocfs2_read_refcount_block() returns an error, we do ocfs2_refcount_tree_put twice (once in ocfs2_unlock_refcount_tree and once outside it), thereby reducing the refcount of the refcount tree twice, but we dont delete the tree in this case. This will make refcnt of the tree = 0 and the ocfs2_refcount_tree_put will eventually call ocfs2_mark_lockres_freeing, setting OCFS2_LOCK_FREEING for the refcount_tree->rf_lockres. The error returned by ocfs2_read_refcount_block is propagated all the way back and for next iteration of write, ocfs2_lock_refcount_tree gets the same tree back from ocfs2_get_refcount_tree because we havent deleted the tree. Now we have the same tree, but OCFS2_LOCK_FREEING is set for rf_lockres and eventually, when _ocfs2_lock_refcount_tree is called in this iteration, BUG_ON( __ocfs2_cluster_lock:1395 ERROR: Cluster lock called on freeing lockres T00000000000000000386019775b08d! flags 0x81) is triggerred. Call stack: (loop16,11155,0):ocfs2_lock_refcount_tree:482 ERROR: status = -5 (loop16,11155,0):ocfs2_refcount_cow_hunk:3497 ERROR: status = -5 (loop16,11155,0):ocfs2_refcount_cow:3560 ERROR: status = -5 (loop16,11155,0):ocfs2_prepare_inode_for_refcount:2111 ERROR: status = -5 (loop16,11155,0):ocfs2_prepare_inode_for_write:2190 ERROR: status = -5 (loop16,11155,0):ocfs2_file_write_iter:2331 ERROR: status = -5 (loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: bug expression: lockres->l_flags & OCFS2_LOCK_FREEING (loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: Cluster lock called on freeing lockres T00000000000000000386019775b08d! flags 0x81 kernel BUG at fs/ocfs2/dlmglue.c:1395! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: tun ocfs2 jbd2 xen_blkback xen_netback xen_gntdev .. sd_mod crc_t10dif ext3 jbd mbcache RIP: __ocfs2_cluster_lock+0x31c/0x740 [ocfs2] RSP: e02b:ffff88017c0138a0 EFLAGS: 00010086 Process loop16 (pid: 11155, threadinfo ffff88017c010000, task ffff8801b5374300) Call Trace: ocfs2_refcount_lock+0xae/0x130 [ocfs2] __ocfs2_lock_refcount_tree+0x29/0xe0 [ocfs2] ocfs2_lock_refcount_tree+0xdd/0x320 [ocfs2] ocfs2_refcount_cow_hunk+0x1cb/0x440 [ocfs2] ocfs2_refcount_cow+0xa9/0x1d0 [ocfs2] ocfs2_prepare_inode_for_refcount+0x115/0x200 [ocfs2] ocfs2_prepare_inode_for_write+0x33b/0x470 [ocfs2] ocfs2_file_write_iter+0x220/0x8c0 [ocfs2] aio_write_iter+0x2e/0x30 Fix this by avoiding the second call to ocfs2_refcount_tree_put() Link: http://lkml.kernel.org/r/1473984404-32011-1-git-send-email-ashish.samant@oracle.com Signed-off-by: Ashish Samant <ashish.samant@oracle.com> Reviewed-by: Eric Ren <zren@suse.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: clean up unused 'page' parameter in ocfs2_write_end_nolock()piaojun
'page' parameter in ocfs2_write_end_nolock() is never used. Link: http://lkml.kernel.org/r/582FD91A.5000902@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2/dlm: clean up deadcode in dlm_master_request_handler()piaojun
When 'dispatch_assert' is set, 'response' must be DLM_MASTER_RESP_YES, and 'res' won't be null, so execution can't reach these two branch. Link: http://lkml.kernel.org/r/58174C91.3040004@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: delete redundant code and set the node bit into maybe_map directlyGuozhonghua
The variable `set_maybe' is redundant when the mle has been found in the map. So it is ok to set the node_idx into mle's maybe_map directly. Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4A3D490DD@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: Guozhonghua <guozhonghua@h3c.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2/dlm: clean up useless BUG_ON default case in dlm_finalize_reco_handler()piaojun
The value of 'stage' must be between 1 and 2, so the switch can't reach the default case. Link: http://lkml.kernel.org/r/57FB5EB2.7050002@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12drivers/pcmcia/m32r_pcc.c: check return from add_pcc_socketSudip Mukherjee
If request_irq() fails it passes the error to the caller. The caller now checks it and jumps to the common error path on failure. Link: http://lkml.kernel.org/r/1474237304-897-3-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12drivers/pcmcia/m32r_pcc.c: use common error pathSudip Mukherjee
Use a common error path for the failure. Link: http://lkml.kernel.org/r/1474237304-897-2-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12drivers/pcmcia/m32r_pcc.c: check return from request_irqSudip Mukherjee
While building m32r allmodconfig we were getting warning: drivers/pcmcia/m32r_pcc.c:331:2: warning: ignoring return value of 'request_irq', declared with attribute warn_unused_result request_irq() can fail and we should always be checking the result from it. Check the result and return it to the caller. Link: http://lkml.kernel.org/r/1474237304-897-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12m32r: fix build warningSudip Mukherjee
While building m32r defconfig we got warnings: arch/m32r/platforms/m32700ut/setup.c:249:24: warning: 'm32700ut_lcdpld_irq_type' defined but not used [-Wunused-variable] m32700ut_lcdpld_irq_type is only used when CONFIG_USB is enabled. Modify the code to declare the related variables and functions only when CONFIG_USB is enabled. Link: http://lkml.kernel.org/r/1479244406-7507-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12m32r: add simple dmaSudip Mukherjee
Some builds of m32r were failing as it tried to build few drivers which needed dma but m32r is not having dma support. Objections were raised when it was tried to make those drivers depend on HAS_DMA. So the next best thing is to add dma support to m32r. dma_noop is a very simple dma with 1:1 memory mapping. Link: http://lkml.kernel.org/r/1475949198-31623-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12scripts/tags.sh: handle OMAP platforms properlySam Protsenko
When SUBARCH is "omap1" or "omap2", plat-omap/ directory must be indexed. Handle this special case properly. While at it, check if mach- directory exists at all. Link: http://lkml.kernel.org/r/20161202122148.15001-1-joe.skb7@gmail.com Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Cc: Michal Marek <mmarek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12scripts/bloat-o-meter: compile .NUMBER regexAlexey Dobriyan
Every often used regex is better be compiled in Python. Speedup is about ~9.8% (whee!) $ perf stat -r 16 taskset -c 15 ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux >/dev/null 7.091202853 seconds time elapsed ( +- 0.15% ) +re.compile 6.397564973 seconds time elapsed ( +- 0.34% ) Link: http://lkml.kernel.org/r/20161119004417.GB1200@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12scripts/bloat-o-meter: don't use readlines()Alexey Dobriyan
readlines() conses whole list before doing anything which is slower for big object files. Use per line iterator. Speed up is ~2% on "allyesconfig" type of kernel. $ perf stat -r 16 taskset -c 15 ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux >/dev/null ... Before: 7.247708646 seconds time elapsed ( +- 0.28% ) After: 7.091202853 seconds time elapsed ( +- 0.15% ) Link: http://lkml.kernel.org/r/20161119004143.GA1200@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12prctl: remove one-shot limitation for changing exe linkStanislav Kinsburskiy
This limitation came with the reason to remove "another way for malicious code to obscure a compromised program and masquerade as a benign process" by allowing "security-concious program can use this prctl once during its early initialization to ensure the prctl cannot later be abused for this purpose": http://marc.info/?l=linux-kernel&m=133160684517468&w=2 This explanation doesn't look sufficient. The only thing "exe" link is indicating is the file, used to execve, which is basically nothing and not reliable immediately after process has returned from execve system call. Moreover, to use this feture, all the mappings to previous exe file have to be unmapped and all the new exe file permissions must be satisfied. Which means, that changing exe link is very similar to calling execve on the binary. The need to remove this limitations comes from migration of NFS mount point, which is not accessible during restore and replaced by other file system. Because of this exe link has to be changed twice. [akpm@linux-foundation.org: fix up comment] Link: http://lkml.kernel.org/r/20160927153755.9337.69650.stgit@localhost.localdomain Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: John Stultz <john.stultz@linaro.org> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12kthread: add __printf attributesNicolas Iooss
When commit fbae2d44aa1d ("kthread: add kthread_create_worker*()") introduced some kthread_create_...() functions which were taking printf-like parametter, it introduced __printf attributes to some functions (e.g. kthread_create_worker()). Nevertheless some new functions were forgotten (they have been detected thanks to -Wmissing-format-attribute warning flag). Add the missing __printf attributes to the newly-introduced functions in order to detect formatting issues at build-time with -Wformat flag. Link: http://lkml.kernel.org/r/20161126193543.22672-1-nicolas.iooss_linux@m4x.org Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-13Merge tag 'drm-vc4-next-2016-12-09' of https://github.com/anholt/linux into ↵Dave Airlie
drm-next This pull request brings in VEC (TV-out) support for vc4, along with a pageflipping race fix. * tag 'drm-vc4-next-2016-12-09' of https://github.com/anholt/linux: drm/vc4: Don't use drm_put_dev drm/vc4: Document VEC DT binding drm/vc4: Add support for the VEC (Video Encoder) IP drm: Add TV connector states to drm_connector_state drm: Turn DRM_MODE_SUBCONNECTOR_xx definitions into an enum drm/vc4: Fix ->clock_select setting for the VEC encoder drm/vc4: Fix race between page flip completion event and clean-up
2016-12-13drm/nouveau/kms/nv50: fix atomic regression on original G80Ben Skeggs
Reported-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/bl: Do not register interface if Apple GMUX detectedPierre Moreau
The Apple GMUX is the one managing the backlight, so there is no need for Nouveau to register its own backlight interface. v2: Do not split information message on two lines as it prevents from grepping it, as pointed out by Lukas Wunner v3: Add a missing end-of-line character to the printed message Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/bl: Assign different names to interfacesPierre Moreau
Currently, every backlight interface created by Nouveau uses the same name, nv_backlight. This leads to a sysfs warning as it tries to create an already existing folder. This patch adds a incremented number to the name, but keeps the initial name as nv_backlight, to avoid possibly breaking userspace; the second interface will be named nv_backlight1, and so on. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86539 v2: * Switch to using ida for generating unique IDs, as suggested by Ilia Mirkin; * Allocate backlight name on the stack, as suggested by Ilia Mirkin; * Move `nouveau_get_backlight_name()` to avoid forward declaration, as suggested by Ilia Mirkin; * Fix reference to bug report formatting, as reported by Nick Tenney. v3: * Define a macro for the size of the backlight name, to avoid defining it multiple times; * Use snprintf in place of sprintf. v4: * Do not create similarly named interfaces when reaching the maximum amount of unique names, but fail instead, as pointed out by Lukas Wunner Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/bios/dp: fix handling of LevelEntryTableIndex on DP table 4.2Ben Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/ltc: protect clearing of comptags with mutexBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Cc: stable@vger.kernel.org
2016-12-13drm/nouveau/gr/gf100-: handle GPC/TPC/MPC trapBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/core: recognise GP106 chipsetBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/ttm: wait for bo fence to signal before unmapping vmasBen Skeggs
TTM was changed a while back to allow for pipelining of buffer moves, and part of this was the removal of waiting for a BO to idle before calling move(), placing the responsibility on the driver to do this if required. That's all well and good, except, we make use of move_notify() to handle mapping/unmapping from the GPU VMM as move() isn't called on all paths. This commit adds a wait before unmapping from a VMM in move_notify(), to prevent GPU page faults where a buffer is still being accessed. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Cc: stable@vger.kernel.org [v4.8+]
2016-12-13drm/nouveau/gr/gf100-: FECS intr handling is not relevant on proprietary ucodeBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/gr/gf100-: properly ack all FECS error interruptsBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-13drm/nouveau/fifo/gf100-: recover from host mmu faultsBen Skeggs
This has been on the TODO list for a while now, recovering from things such as attempting to execute a push buffer or touch a semaphore in an unmapped memory area. The only thing required on the HW side here is that the offending channel is removed from the runlist, and *not* a full reset of PFIFO. This used to be a bit messier to handle before the rework to make use of engine topology info, but is apparently now trivial. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-12-12Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "Two changes: - implement various VMWare guest OS improvements/fixes (Alexey Makhalov) - unexport a spurious export from the intel-mid platform driver (Lukas Wunner)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vmware: Add paravirt sched clock x86/vmware: Add basic paravirt ops support x86/vmware: Use tsc_khz value for calibrate_cpu() x86/platform/intel-mid: Unexport intel_mid_pci_set_power_state() x86/vmware: Read tsc_khz only once at boot time
2016-12-12Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode update from Ingo Molnar: "The biggest change (by Borislav Petkov) is a thorough rewrite of the Intel microcode loader and its interactions with the core code. The biggest conceptual change is the decoupling of the microcode loading on boot and application processors (which load the microcode in different scenarios), so that both parse the input patches with as few assumptions as possible - this also fixes various kernel address space randomization bugs. (The AP side then goes on and caches the result to improve boot performance.) Since the AMD side already did this, this change also opened up the path towards more unification/simplification of the core microcode loading infrastructure: 10 files changed, 647 insertions(+), 940 deletions(-) which speaks for itself" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Bump driver version, update copyrights x86/microcode: Rework microcode loading x86/microcode/intel: Remove intel_lib.c x86/microcode/amd: Move private inlines to .c and mark local functions static x86/microcode: Collect CPU info on resume x86/microcode: Issue the debug printk on resume only on success x86/microcode/amd: Hand down the CPU family x86/microcode: Export the microcode cache linked list x86/microcode: Remove one #ifdef clause x86/microcode/intel: Simplify generic_load_microcode() x86/microcode: Move driver authors to CREDITS x86/microcode: Run the AP-loading routine only on the application processors
2016-12-12Merge branch 'x86-idle-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 idle updates from Ingo Molnar: "There were two bigger changes in this development cycle: - remove idle notifiers: 32 files changed, 74 insertions(+), 803 deletions(-) These notifiers were of questionable value and the main usecase, the i7300 driver, was essentially unmaintained and can be removed, plus modern power management concepts don't need the callback - so use this golden opportunity and get rid of this opaque and fragile callback from a latency sensitive code path. (Len Brown, Thomas Gleixner) - improve the AMD Erratum 400 workaround that used high overhead MSR polling in the idle loop (Borisla Petkov, Thomas Gleixner)" * 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove empty idle.h header x86/amd: Simplify AMD E400 aware idle routine x86/amd: Check for the C1E bug post ACPI subsystem init x86/bugs: Separate AMD E400 erratum and C1E bug x86/cpufeature: Provide helper to set bugs bits x86/idle: Remove enter_idle(), exit_idle() x86: Remove x86_test_and_clear_bit_percpu() x86/idle: Remove is_idle flag x86/idle: Remove idle_notifier i7300_idle: Remove this driver
2016-12-12Merge branch 'x86-headers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header fixlet from Ingo Molnar: "Remove unnecessary module.h inclusion from core code (Paul Gortmaker)" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/percpu: Remove unnecessary include of module.h, add asm/desc.h
2016-12-12Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Ingo Molnar: "The main changes in this cycle were: - do a large round of simplifications after all CPUs do 'eager' FPU context switching in v4.9: remove CR0 twiddling, remove leftover eager/lazy bts, etc (Andy Lutomirski) - more FPU code simplifications: remove struct fpu::counter, clarify nomenclature, remove unnecessary arguments/functions and better structure the code (Rik van Riel)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Remove clts() x86/fpu: Remove stts() x86/fpu: Handle #NM without FPU emulation as an error x86/fpu, lguest: Remove CR0.TS support x86/fpu, kvm: Remove host CR0.TS manipulation x86/fpu: Remove irq_ts_save() and irq_ts_restore() x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs() x86/fpu: Get rid of two redundant clts() calls x86/fpu: Finish excising 'eagerfpu' x86/fpu: Split old_fpu & new_fpu handling into separate functions x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state() x86/fpu: Split old & new FPU code paths x86/fpu: Remove __fpregs_(de)activate() x86/fpu: Rename lazy restore functions to "register state valid" x86/fpu, kvm: Remove KVM vcpu->fpu_counter x86/fpu: Remove struct fpu::counter x86/fpu: Remove use_eager_fpu() x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction x86/fpu: Hard-disable lazy FPU mode x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
2016-12-12Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU updates from Ingo Molnar: "The changes in this development cycle were: - AMD CPU topology enhancements that are cleanups on current CPUs but which enable future Fam17 hardware. (Yazen Ghannam) - unify bugs.c and bugs_64.c (Borislav Petkov) - remove the show_msr= boot option (Borislav Petkov) - simplify a boot message (Borislav Petkov)" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/AMD: Clean up cpu_llc_id assignment per topology feature x86/cpu: Get rid of the show_msr= boot option x86/cpu: Merge bugs.c and bugs_64.c x86/cpu: Remove the printk format specifier in "CPU0: "
2016-12-12Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Two cleanups in the LDT handling code, by Dan Carpenter and Thomas Gleixner" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Make all size computations unsigned x86/ldt: Make a size argument unsigned