summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-10-07mm,oom_reaper: do not attempt to reap a task twiceTetsuo Handa
"mm, oom_reaper: do not attempt to reap a task twice" tried to give the OOM reaper one more chance to retry using MMF_OOM_NOT_REAPABLE flag. But the usefulness of the flag is rather limited and actually never shown in practice. If the flag is set, it means that the holder of mm->mmap_sem cannot call up_write() due to presumably being blocked at unkillable wait waiting for other thread's memory allocation. But since one of threads sharing that mm will queue that mm immediately via task_will_free_mem() shortcut (otherwise, oom_badness() will select the same mm again due to oom_score_adj value unchanged), retrying MMF_OOM_NOT_REAPABLE mm is unlikely helpful. Let's always set MMF_OOM_REAPED. Link: http://lkml.kernel.org/r/1472119394-11342-3-git-send-email-mhocko@kernel.org Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, swap: add swap_cluster_listHuang Ying
This is a code clean up patch without functionality changes. The swap_cluster_list data structure and its operations are introduced to provide some better encapsulation for the free cluster and discard cluster list operations. This avoid some code duplication, improved the code readability, and reduced the total line number. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1472067356-16004-1-git-send-email-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shaohua Li <shli@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm: pagewalk: fix the comment for test_walkJames Morse
Modify the comment describing struct mm_walk->test_walk()s behaviour to match the comment on walk_page_test() and the behaviour of walk_page_vma(). Fixes: fafaa4264eba4 ("pagewalk: improve vma handling") Link: http://lkml.kernel.org/r/1471622518-21980-1-git-send-email-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm/page_owner: don't define fields on struct page_ext by hard-codingJoonsoo Kim
There is a memory waste problem if we define field on struct page_ext by hard-coding. Entry size of struct page_ext includes the size of those fields even if it is disabled at runtime. Now, extra memory request at runtime is possible so page_owner don't need to define it's own fields by hard-coding. This patch removes hard-coded define and uses extra memory for storing page_owner information in page_owner. Most of code are just mechanical changes. Link: http://lkml.kernel.org/r/1471315879-32294-7-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm/page_ext: support extra space allocation by page_ext userJoonsoo Kim
Until now, if some page_ext users want to use it's own field on page_ext, it should be defined in struct page_ext by hard-coding. It has a problem that wastes memory in following situation. struct page_ext { #ifdef CONFIG_A int a; #endif #ifdef CONFIG_B int b; #endif }; Assume that kernel is built with both CONFIG_A and CONFIG_B. Even if we enable feature A and doesn't enable feature B at runtime, each entry of struct page_ext takes two int rather than one int. It's undesirable result so this patch tries to fix it. To solve above problem, this patch implements to support extra space allocation at runtime. When need() callback returns true, it's extra memory requirement is summed to entry size of page_ext. Also, offset for each user's extra memory space is returned. With this offset, user can use this extra space and there is no need to define needed field on page_ext by hard-coding. This patch only implements an infrastructure. Following patch will use it for page_owner which is only user having it's own fields on page_ext. Link: http://lkml.kernel.org/r/1471315879-32294-6-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm/page_owner: move page_owner specific function to page_owner.cJoonsoo Kim
There is no reason that page_owner specific function resides on vmstat.c. Link: http://lkml.kernel.org/r/1471315879-32294-4-git-send-email-iamjoonsoo.kim@lge.com Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, vmscan: get rid of throttle_vm_writeoutMichal Hocko
throttle_vm_writeout() was introduced back in 2005 to fix OOMs caused by excessive pageout activity during the reclaim. Too many pages could be put under writeback therefore LRUs would be full of unreclaimable pages until the IO completes and in turn the OOM killer could be invoked. There have been some important changes introduced since then in the reclaim path though. Writers are throttled by balance_dirty_pages when initiating the buffered IO and later during the memory pressure, the direct reclaim is throttled by wait_iff_congested if the node is considered congested by dirty pages on LRUs and the underlying bdi is congested by the queued IO. The kswapd is throttled as well if it encounters pages marked for immediate reclaim or under writeback which signals that that there are too many pages under writeback already. Finally should_reclaim_retry does congestion_wait if the reclaim cannot make any progress and there are too many dirty/writeback pages. Another important aspect is that we do not issue any IO from the direct reclaim context anymore. In a heavy parallel load this could queue a lot of IO which would be very scattered and thus unefficient which would just make the problem worse. This three mechanisms should throttle and keep the amount of IO in a steady state even under heavy IO and memory pressure so yet another throttling point doesn't really seem helpful. Quite contrary, Mikulas Patocka has reported that swap backed by dm-crypt doesn't work properly because the swapout IO cannot make sufficient progress as the writeout path depends on dm_crypt worker which has to allocate memory to perform the encryption. In order to guarantee a forward progress it relies on the mempool allocator. mempool_alloc(), however, prefers to use the underlying (usually page) allocator before it grabs objects from the pool. Such an allocation can dive into the memory reclaim and consequently to throttle_vm_writeout. If there are too many dirty or pages under writeback it will get throttled even though it is in fact a flusher to clear pending pages. kworker/u4:0 D ffff88003df7f438 10488 6 2 0x00000000 Workqueue: kcryptd kcryptd_crypt [dm_crypt] Call Trace: schedule+0x3c/0x90 schedule_timeout+0x1d8/0x360 io_schedule_timeout+0xa4/0x110 congestion_wait+0x86/0x1f0 throttle_vm_writeout+0x44/0xd0 shrink_zone_memcg+0x613/0x720 shrink_zone+0xe0/0x300 do_try_to_free_pages+0x1ad/0x450 try_to_free_pages+0xef/0x300 __alloc_pages_nodemask+0x879/0x1210 alloc_pages_current+0xa1/0x1f0 new_slab+0x2d7/0x6a0 ___slab_alloc+0x3fb/0x5c0 __slab_alloc+0x51/0x90 kmem_cache_alloc+0x27b/0x310 mempool_alloc_slab+0x1d/0x30 mempool_alloc+0x91/0x230 bio_alloc_bioset+0xbd/0x260 kcryptd_crypt+0x114/0x3b0 [dm_crypt] Let's just drop throttle_vm_writeout altogether. It is not very much helpful anymore. I have tried to test a potential writeback IO runaway similar to the one described in the original patch which has introduced that [1]. Small virtual machine (512MB RAM, 4 CPUs, 2G of swap space and disk image on a rather slow NFS in a sync mode on the host) with 8 parallel writers each writing 1G worth of data. As soon as the pagecache fills up and the direct reclaim hits then I start anon memory consumer in a loop (allocating 300M and exiting after populating it) in the background to make the memory pressure even stronger as well as to disrupt the steady state for the IO. The direct reclaim is throttled because of the congestion as well as kswapd hitting congestion_wait due to nr_immediate but throttle_vm_writeout doesn't ever trigger the sleep throughout the test. Dirty+writeback are close to nr_dirty_threshold with some fluctuations caused by the anon consumer. [1] https://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc1/2.6.9-rc1-mm3/broken-out/vm-pageout-throttling.patch Link: http://lkml.kernel.org/r/1471171473-21418-1-git-send-email-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: NeilBrown <neilb@suse.com> Cc: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, compaction: create compact_gap wrapperVlastimil Babka
Compaction uses a watermark gap of (2UL << order) pages at various places and it's not immediately obvious why. Abstract it through a compact_gap() wrapper to create a single place with a thorough explanation. [vbabka@suse.cz: clarify the comment of compact_gap()] Link: http://lkml.kernel.org/r/7b6aed1f-fdf8-2063-9ff4-bbe4de712d37@suse.cz Link: http://lkml.kernel.org/r/20160810091226.6709-9-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, compaction: add the ultimate direct compaction priorityVlastimil Babka
During reclaim/compaction loop, it's desirable to get a final answer from unsuccessful compaction so we can either fail the allocation or invoke the OOM killer. However, heuristics such as deferred compaction or pageblock skip bits can cause compaction to skip parts or whole zones and lead to premature OOM's, failures or excessive reclaim/compaction retries. To remedy this, we introduce a new direct compaction priority called COMPACT_PRIO_SYNC_FULL, which instructs direct compaction to: - ignore deferred compaction status for a zone - ignore pageblock skip hints - ignore cached scanner positions and scan the whole zone The new priority should get eventually picked up by should_compact_retry() and this should improve success rates for costly allocations using __GFP_REPEAT, such as hugetlbfs allocations, and reduce some corner-case OOM's for non-costly allocations. Link: http://lkml.kernel.org/r/20160810091226.6709-6-vbabka@suse.cz [vbabka@suse.cz: use the MIN_COMPACT_PRIORITY alias] Link: http://lkml.kernel.org/r/d443b884-87e7-1c93-8684-3a3a35759fb1@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, compaction: rename COMPACT_PARTIAL to COMPACT_SUCCESSVlastimil Babka
COMPACT_PARTIAL has historically meant that compaction returned after doing some work without fully compacting a zone. It however didn't distinguish if compaction terminated because it succeeded in creating the requested high-order page. This has changed recently and now we only return COMPACT_PARTIAL when compaction thinks it succeeded, or the high-order watermark check in compaction_suitable() passes and no compaction needs to be done. So at this point we can make the return value clearer by renaming it to COMPACT_SUCCESS. The next patch will remove some redundant tests for success where compaction just returned COMPACT_SUCCESS. Link: http://lkml.kernel.org/r/20160810091226.6709-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm, compaction: cleanup unused functionsVlastimil Babka
Since kswapd compaction moved to kcompactd, compact_pgdat() is not called anymore, so we remove it. The only caller of __compact_pgdat() is compact_node(), so we merge them and remove code that was only reachable from kswapd. Link: http://lkml.kernel.org/r/20160810091226.6709-3-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm/vmalloc.c: fix align value calculation errorzijun_hu
It causes double align requirement for __get_vm_area_node() if parameter size is power of 2 and VM_IOREMAP is set in parameter flags, for example size=0x10000 -> fls_long(0x10000)=17 -> align=0x20000 get_count_order_long() is implemented and can be used instead of fls_long() for fixing the bug, for example size=0x10000 -> get_count_order_long(0x10000)=16 -> align=0x10000 [akpm@linux-foundation.org: s/get_order_long()/get_count_order_long()/] [zijun_hu@zoho.com: fixes] Link: http://lkml.kernel.org/r/57AABC8B.1040409@zoho.com [akpm@linux-foundation.org: locate get_count_order_long() next to get_count_order()] [akpm@linux-foundation.org: move get_count_order[_long] definitions to pick up fls_long()] [zijun_hu@htc.com: move out get_count_order[_long]() from __KERNEL__ scope] Link: http://lkml.kernel.org/r/57B2C4CE.80303@zoho.com Link: http://lkml.kernel.org/r/fc045ecf-20fa-0722-b3ac-9a6140488fad@zoho.com Signed-off-by: zijun_hu <zijun_hu@htc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: zijun_hu <zijun_hu@htc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm: oom: deduplicate victim selection code for memcg and global oomVladimir Davydov
When selecting an oom victim, we use the same heuristic for both memory cgroup and global oom. The only difference is the scope of tasks to select the victim from. So we could just export an iterator over all memcg tasks and keep all oom related logic in oom_kill.c, but instead we duplicate pieces of it in memcontrol.c reusing some initially private functions of oom_kill.c in order to not duplicate all of it. That looks ugly and error prone, because any modification of select_bad_process should also be propagated to mem_cgroup_out_of_memory. Let's rework this as follows: keep all oom heuristic related code private to oom_kill.c and make oom_kill.c use exported memcg functions when it's really necessary (like in case of iterating over memcg tasks). Link: http://lkml.kernel.org/r/1470056933-7505-1-git-send-email-vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07jiffies: add time comparison functions for 64 bit jiffiesJason A. Donenfeld
Though the time_before and time_after family of functions were nicely extended to support jiffies64, so that the interface would be consistent, it was forgotten to also extend the before/after jiffies functions to support jiffies64. This commit brings the interface to parity between jiffies and jiffies64, which is quite convenient. Link: http://lkml.kernel.org/r/20160929033319.12188-1-Jason@zx2c4.com Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07fanotify: use notification_lock instead of access_lockJan Kara
Fanotify code has its own lock (access_lock) to protect a list of events waiting for a response from userspace. However this is somewhat awkward as the same list_head in the event is protected by notification_lock if it is part of the notification queue and by access_lock if it is part of the fanotify private queue which makes it difficult for any reliable checks in the generic code. So make fanotify use the same lock - notification_lock - for protecting its private event list. Link: http://lkml.kernel.org/r/1473797711-14111-6-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07fsnotify: convert notification_mutex to a spinlockJan Kara
notification_mutex is used to protect the list of pending events. As such there's no reason to use a sleeping lock for it. Convert it to a spinlock. [jack@suse.cz: fixed version] Link: http://lkml.kernel.org/r/1474031567-1831-1-git-send-email-jack@suse.cz Link: http://lkml.kernel.org/r/1473797711-14111-5-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07xattr: Add __vfs_{get,set,remove}xattr helpersAndreas Gruenbacher
Right now, various places in the kernel check for the existence of getxattr, setxattr, and removexattr inode operations and directly call those operations. Switch to helper functions and test for the IOP_XATTR flag instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07vfs: Add IOP_XATTR inode operations flagAndreas Gruenbacher
The IOP_XATTR inode operations flag in inode->i_opflags indicates that the inode has xattr support. The flag is automatically set by new_inode() on filesystems with xattr support (where sb->s_xattr is defined), and cleared otherwise. Filesystems can explicitly clear it for inodes that should not have xattr support. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07Merge branch 'for-4.9/dax' into libnvdimm-for-nextDan Williams
2016-10-07Merge branch 'for-4.9/libnvdimm' into libnvdimm-for-nextDan Williams
2016-10-07Merge branch 'work.splice_read' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS splice updates from Al Viro: "There's a bunch of branches this cycle, both mine and from other folks and I'd rather send pull requests separately. This one is the conversion of ->splice_read() to ITER_PIPE iov_iter (and introduction of such). Gets rid of a lot of code in fs/splice.c and elsewhere; there will be followups, but these are for the next cycle... Some pipe/splice-related cleanups from Miklos in the same branch as well" * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pipe: fix comment in pipe_buf_operations pipe: add pipe_buf_steal() helper pipe: add pipe_buf_confirm() helper pipe: add pipe_buf_release() helper pipe: add pipe_buf_get() helper relay: simplify relay_file_read() switch default_file_splice_read() to use of pipe-backed iov_iter switch generic_file_splice_read() to use of ->read_iter() new iov_iter flavour: pipe-backed fuse_dev_splice_read(): switch to add_to_pipe() skb_splice_bits(): get rid of callback new helper: add_to_pipe() splice: lift pipe_lock out of splice_to_pipe() splice: switch get_iovec_page_array() to iov_iter splice_to_pipe(): don't open-code wakeup_pipe_readers() consistent treatment of EFAULT on O_DIRECT read/write
2016-10-07Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Lots of bug fixes and cleanups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits) ext4: remove unused variable ext4: use journal inode to determine journal overhead ext4: create function to read journal inode ext4: unmap metadata when zeroing blocks ext4: remove plugging from ext4_file_write_iter() ext4: allow unlocked direct IO when pages are cached ext4: require encryption feature for EXT4_IOC_SET_ENCRYPTION_POLICY fscrypto: use standard macros to compute length of fname ciphertext ext4: do not unnecessarily null-terminate encrypted symlink data ext4: release bh in make_indexed_dir ext4: Allow parallel DIO reads ext4: allow DAX writeback for hole punch jbd2: fix lockdep annotation in add_transaction_credits() blockgroup_lock.h: simplify definition of NR_BG_LOCKS blockgroup_lock.h: remove debris from bgl_lock_ptr() conversion fscrypto: make filename crypto functions return 0 on success fscrypto: rename completion callbacks to reflect usage fscrypto: remove unnecessary includes fscrypto: improved validation when loading inode encryption metadata ext4: fix memory leak when symlink decryption fails ...
2016-10-07Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer updates from Jens Axboe: "This is the main pull request for block layer changes in 4.9. As mentioned at the last merge window, I've changed things up and now do just one branch for core block layer changes, and driver changes. This avoids dependencies between the two branches. Outside of this main pull request, there are two topical branches coming as well. This pull request contains: - A set of fixes, and a conversion to blk-mq, of nbd. From Josef. - Set of fixes and updates for lightnvm from Matias, Simon, and Arnd. Followup dependency fix from Geert. - General fixes from Bart, Baoyou, Guoqing, and Linus W. - CFQ async write starvation fix from Glauber. - Add supprot for delayed kick of the requeue list, from Mike. - Pull out the scalable bitmap code from blk-mq-tag.c and make it generally available under the name of sbitmap. Only blk-mq-tag uses it for now, but the blk-mq scheduling bits will use it as well. From Omar. - bdev thaw error progagation from Pierre. - Improve the blk polling statistics, and allow the user to clear them. From Stephen. - Set of minor cleanups from Christoph in block/blk-mq. - Set of cleanups and optimizations from me for block/blk-mq. - Various nvme/nvmet/nvmeof fixes from the various folks" * 'for-4.9/block' of git://git.kernel.dk/linux-block: (54 commits) fs/block_dev.c: return the right error in thaw_bdev() nvme: Pass pointers, not dma addresses, to nvme_get/set_features() nvme/scsi: Remove power management support nvmet: Make dsm number of ranges zero based nvmet: Use direct IO for writes admin-cmd: Added smart-log command support. nvme-fabrics: Add host_traddr options field to host infrastructure nvme-fabrics: revise host transport option descriptions nvme-fabrics: rework nvmf_get_address() for variable options nbd: use BLK_MQ_F_BLOCKING blkcg: Annotate blkg_hint correctly cfq: fix starvation of asynchronous writes blk-mq: add flag for drivers wanting blocking ->queue_rq() blk-mq: remove non-blocking pass in blk_mq_map_request blk-mq: get rid of manual run of queue with __blk_mq_run_hw_queue() block: export bio_free_pages to other modules lightnvm: propagate device_add() error code lightnvm: expose device geometry through sysfs lightnvm: control life of nvm_dev in driver blk-mq: register device instead of disk ...
2016-10-07Merge branch 'i2c/for-4.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "Here is the 4.9 pull request from I2C including: - centralized error messages when registering to the core - improved lockdep annotations to prevent false positives - DT support for muxes, gates, and arbitrators - bus speeds can now be obtained from ACPI - i2c-octeon got refactored and now supports ThunderX SoCs, too - i2c-tegra and i2c-designware got a bigger bunch of updates - a couple of standard driver fixes and improvements" * 'i2c/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (71 commits) i2c: axxia: disable clks in case of failure in probe i2c: octeon: thunderx: Limit register access retries i2c: uniphier-f: fix misdetection of incomplete STOP condition gpio: pca953x: variable 'id' was used twice i2c: i801: Add support for Kaby Lake PCH-H gpio: pca953x: fix an incorrect lockdep warning i2c: add a warning to i2c_adapter_depth() lockdep: make MAX_LOCKDEP_SUBCLASSES unconditionally visible i2c: export i2c_adapter_depth() i2c: rk3x: Fix variable 'min_total_ns' unused warning i2c: rk3x: Fix sparse warning i2c / ACPI: Do not touch an I2C device if it belongs to another adapter i2c: octeon: Fix high-level controller status check i2c: octeon: Avoid sending STOP during recovery i2c: octeon: Fix set SCL recovery function i2c: rcar: add support for r8a7796 (R-Car M3-W) i2c: imx: make bus recovery through pinctrl optional i2c: meson: add gxbb compatible string i2c: uniphier-f: set the adapter to master mode when probing i2c: uniphier-f: avoid WARN_ON() of clk_disable() in failure path ...
2016-10-07IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packetsJack Morgenstein
In MLX qp packets, the LRH (built by the driver) has both a VL field and an SL field. When building a QP1 packet, the VL field should reflect the SLtoVL mapping and not arbitrarily contain zero (as is done now). This bug causes credit problems in IB switches at high rates of QP1 packets. The fix is to cache the SL to VL mapping in the driver, and look up the VL mapped to the SL provided in the send request when sending QP1 packets. For FW versions which support generating a port_management_config_change event with subtype sl-to-vl-table-change, the driver uses that event to update its sl-to-vl mapping cache. Otherwise, the driver snoops incoming SMP mads to update the cache. There remains the case where the FW is running in secure-host mode (so no QP0 packets are delivered to the driver), and the FW does not generate the sl2vl mapping change event. To support this case, the driver updates (via querying the FW) its sl2vl mapping cache when running in secure-host mode when it receives either a Port Up event or a client-reregister event (where the port is still up, but there may have been an opensm failover). OpenSM modifies the sl2vl mapping before Port Up and Client-reregister events occur, so if there is a mapping change the driver's cache will be properly updated. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/mthca: Move user vendor structuresLeon Romanovsky
This patch moves mthca vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmthca) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/nes: Move user vendor structuresLeon Romanovsky
This patch moves nes vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmlx4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/ocrdma: Move user vendor structuresLeon Romanovsky
This patch moves ocrdma vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmlx4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. In addition, it changes types to be __uXX instead of uXX. Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-By: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/mlx4: Move user vendor structuresLeon Romanovsky
This patch moves mlx4 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmlx4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/cxgb4: Move user vendor structuresLeon Romanovsky
This patch moves cxgb4 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libcxgb4) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/cxgb3: Move user vendor structuresLeon Romanovsky
This patch moves cxgb3 vendor's specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libcxgb3) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/mlx5: Move and decouple user vendor structuresLeon Romanovsky
This patch decouples and moves vendors specific structures to common UAPI folder which will be visible to all consumers. These structures are used by user-space library driver (libmlx5) and currently manually copied to that library. This move will allow cross-compile against these files and simplify introduction of vendor specific data. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/{core,hw}: Add constant for node_descYuval Shaia
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/core: Add more fields to IPv6 flow specificationMaor Gottlieb
Add the following fields to IPv6 flow filter specification: 1. Traffic Class 2. Flow Label 3. Next Header 4. Hop Limit Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/uverbs: Add more fields to IPv4 flow specificationMaor Gottlieb
Add the following fields to IPv4 flow filter specification: 1. Type of Service 2. Time to Live 3. Flags 4. Protocol Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/uverbs: Add support to extend flow steering specificationsMaor Gottlieb
Flow steering specifications structures were implemented as in an extensible way that allows one to add new filters and new fields to existing filters. These specifications have never been extended, therefore the kernel flow specifications size and the user flow specifications size were must to be equal. In downstream patch, the IPv4 flow specifications type is extended to support TOS and TTL fields. To support an extension we change the flow specifications size condition test to be as following: * If the user flow specifications is bigger than the kernel specifications, we verify that all the bits which not in the kernel specifications are zeros and the flow is added only with the kernel specifications fields. * Otherwise, we add flow rule only with the user specifications fields. User space filters must be aligned with 32bits. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/uverbs: Expose RSS related capabilitiesYishai Hadas
Query RSS related attributes and return them to user-space via the extended query device uverbs command. It includes both direct ones (i.e. struct ib_uverbs_rss_caps) and max_wq_type_rq which may be used in both RSS and non RSS flows. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07IB/core: Expose RSS related capabilitiesYishai Hadas
Expose RSS related capabilities, it includes both direct ones (i.e. struct ib_rss_caps) and max_wq_type_rq which may be used in both RSS and non RSS flows. Specifically, supported_qpts: - QP types that support RSS on the device. max_rwq_indirection_tables: - Max number of receive work queue indirection tables that could be opened on the device. max_rwq_indirection_table_size: - Max size of a receive work queue indirection table. max_wq_type_rq: - Max number of work queues of receive type that could be opened on the device. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "The usual rocket science from the trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: tracing/syscalls: fix multiline in error message text lib/Kconfig.debug: fix DEBUG_SECTION_MISMATCH description doc: vfs: fix fadvise() sycall name x86/entry: spell EBX register correctly in documentation securityfs: fix securityfs_create_dir comment irq: Fix typo in tracepoint.xml
2016-10-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - fix for patching modules that contain .altinstructions or .parainstructions sections, from Jessica Yu - make TAINT_LIVEPATCH a per-module flag (so that it's immediately clear which module caused the taint), from Josh Poimboeuf * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch/module: make TAINT_LIVEPATCH module-specific Documentation: livepatch: add section about arch-specific code livepatch/x86: apply alternatives and paravirt patches after relocations livepatch: use arch_klp_init_object_loaded() to finish arch-specific tasks
2016-10-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - Integrated Sensor Hub support (Cherrytrail+) from Srinivas Pandruvada - Big cleanup of Wacom driver; namely it's now using devres, and the standardized LED API so that libinput doesn't need to have root access any more, with substantial amount of other cleanups piggy-backing on top. All this from Benjamin Tissoires - Report descriptor parsing would now ignore and out-of-range System controls in case of the application actually being System Control. This fixes quite some issues with several devices, and allows us to remove a few ->report_fixup callbacks. From Benjamin Tissoires - ... a lot of other assorted small fixes and device ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (76 commits) HID: add missing \n to end of dev_warn messages HID: alps: fix multitouch cursor issue HID: hid-logitech: Documentation updates/corrections HID: hid-logitech: Improve Wingman Formula Force GP support HID: hid-logitech: Rewrite of descriptor for all DF wheels HID: hid-logitech: Compute combined pedals value HID: hid-logitech: Add combined pedal support Logitech wheels HID: hid-logitech: Introduce control for combined pedals feature HID: sony: Update copyright and add Dualshock 4 rate control note HID: sony: Defer the initial USB Sixaxis output report HID: sony: Relax duplicate checking for USB-only devices Revert "HID: microsoft: fix invalid rdesc for 3k kbd" HID: alps: fix error return code in alps_input_configured() HID: alps: fix stick device not working after resume HID: support for keyboard - Corsair STRAFE HID: alps: Fix memory leak HID: uclogic: Add support for UC-Logic TWHA60 v3 HID: uclogic: Override constant descriptors HID: uclogic: Support UGTizer GP0610 partially HID: uclogic: Add support for several more tablets ...
2016-10-07Merge tag 'pci-v4.9-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Summary of PCI changes for the v4.9 merge window: Enumeration: - microblaze: Add multidomain support for procfs (Bharat Kumar Gogada) Resource management: - Ignore requested alignment for PROBE_ONLY and fixed resources (Yongji Xie) - Ignore requested alignment for VF BARs (Yongji Xie) PCI device hotplug: - Make core explicitly non-modular (Paul Gortmaker) PCIe native device hotplug: - Rename pcie_isr() locals for clarity (Bjorn Helgaas) - Return IRQ_NONE when we can't read interrupt status (Bjorn Helgaas) - Remove unnecessary guard (Bjorn Helgaas) - Clean up dmesg "Slot(%s)" messages (Bjorn Helgaas) - Remove useless pciehp_get_latch_status() calls (Bjorn Helgaas) - Clear attention LED on device add (Keith Busch) - Allow exclusive userspace control of indicators (Keith Busch) - Process all hotplug events before looking for new ones (Mayurkumar Patel) - Don't re-read Slot Status when queuing hotplug event (Mayurkumar Patel) - Don't re-read Slot Status when handling surprise event (Mayurkumar Patel) - Make explicitly non-modular (Paul Gortmaker) Power management: - Afford direct-complete to devices with non-standard PM (Lukas Wunner) - Query platform firmware for device power state (Lukas Wunner) - Recognize D3cold in pci_update_current_state() (Lukas Wunner) - Avoid unnecessary resume after direct-complete (Lukas Wunner) - Make explicitly non-modular (Paul Gortmaker) Virtualization: - Mark Atheros AR9580 to avoid bus reset (Maik Broemme) - Check for pci_setup_device() failure in pci_iov_add_virtfn() (Po Liu) MSI: - Enable PCI_MSI_IRQ_DOMAIN support for ARC (Joao Pinto) AER: - Remove aerdriver.nosourceid kernel parameter (Bjorn Helgaas) - Remove aerdriver.forceload kernel parameter (Bjorn Helgaas) - Fix aer_probe() kernel-doc comment (Cao jin) - Add bus flag to skip source ID matching (Jon Derrick) - Avoid memory allocation in interrupt handling path (Jon Derrick) - Cache capability position (Keith Busch) - Make explicitly non-modular (Paul Gortmaker) - Remove duplicate AER severity translation (Tyler Baicar) - Send correct severity to calculate AER severity (Tyler Baicar) Precision Time Measurement: - Add Precision Time Measurement (PTM) support (Jonathan Yong) - Add PTM clock granularity information (Bjorn Helgaas) - Add pci_enable_ptm() for drivers to enable PTM on endpoints (Bjorn Helgaas) Generic host bridge driver: - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) - Make explicitly non-modular (Paul Gortmaker) Altera host bridge driver: - Remove redundant platform_get_resource() return value check (Bjorn Helgaas) - Poll for link training status after retraining the link (Ley Foon Tan) - Rework config accessors for use without a struct pci_bus (Ley Foon Tan) - Move retrain from fixup to altera_pcie_host_init() (Ley Foon Tan) - Make MSI explicitly non-modular (Paul Gortmaker) - Make explicitly non-modular (Paul Gortmaker) - Relax device number checking to allow SR-IOV (Po Liu) ARM Versatile host bridge driver: - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) Axis ARTPEC-6 host bridge driver: - Drop __init from artpec6_add_pcie_port() (Niklas Cassel) Freescale i.MX6 host bridge driver: - Make explicitly non-modular (Paul Gortmaker) Intel VMD host bridge driver: - Add quirk for AER to ignore source ID (Jon Derrick) - Allocate IRQ lists with correct MSI-X count (Jon Derrick) - Convert to use pci_alloc_irq_vectors() API (Jon Derrick) - Eliminate vmd_vector member from list type (Jon Derrick) - Eliminate index member from IRQ list (Jon Derrick) - Synchronize with RCU freeing MSI IRQ descs (Keith Busch) - Request userspace control of PCIe hotplug indicators (Keith Busch) - Move VMD driver to drivers/pci/host (Keith Busch) Marvell Aardvark host bridge driver: - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) - Remove redundant dev_err call in advk_pcie_probe() (Wei Yongjun) Microsoft Hyper-V host bridge driver: - Use zero-length array in struct pci_packet (Dexuan Cui) - Use pci_function_description[0] in struct definitions (Dexuan Cui) - Remove the unused 'wrk' in struct hv_pcibus_device (Dexuan Cui) - Handle vmbus_sendpacket() failure in hv_compose_msi_msg() (Dexuan Cui) - Handle hv_pci_generic_compl() error case (Dexuan Cui) - Use list_move_tail() instead of list_del() + list_add_tail() (Wei Yongjun) NVIDIA Tegra host bridge driver: - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) - Remove redundant _data suffix (Thierry Reding) - Use of_device_get_match_data() (Thierry Reding) Qualcomm host bridge driver: - Make explicitly non-modular (Paul Gortmaker) Renesas R-Car host bridge driver: - Consolidate register space lookup and ioremap (Bjorn Helgaas) - Don't disable/unprepare clocks on prepare/enable failure (Geert Uytterhoeven) - Add multi-MSI support (Grigory Kletsko) - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) - Fix some checkpatch warnings (Sergei Shtylyov) - Try increasing PCIe link speed to 5 GT/s at boot (Sergei Shtylyov) Rockchip host bridge driver: - Add DT bindings for Rockchip PCIe controller (Shawn Lin) - Add Rockchip PCIe controller support (Shawn Lin) - Improve the deassert sequence of four reset pins (Shawn Lin) - Fix wrong transmitted FTS count (Shawn Lin) - Increase the Max Credit update interval (Rajat Jain) Samsung Exynos host bridge driver: - Make explicitly non-modular (Paul Gortmaker) ST Microelectronics SPEAr13xx host bridge driver: - Make explicitly non-modular (Paul Gortmaker) Synopsys DesignWare host bridge driver: - Return data directly from dw_pcie_readl_rc() (Bjorn Helgaas) - Exchange viewport of `MEMORYs' and `CFGs/IOs' (Dong Bo) - Check LTSSM training bit before deciding link is up (Jisheng Zhang) - Move link wait definitions to .c file (Joao Pinto) - Wait for iATU enable (Joao Pinto) - Add iATU Unroll feature (Joao Pinto) - Fix pci_remap_iospace() failure path (Lorenzo Pieralisi) - Make explicitly non-modular (Paul Gortmaker) - Relax device number checking to allow SR-IOV (Po Liu) - Keep viewport fixed for IO transaction if num_viewport > 2 (Pratyush Anand) - Remove redundant platform_get_resource() return value check (Wei Yongjun) TI DRA7xx host bridge driver: - Make explicitly non-modular (Paul Gortmaker) TI Keystone host bridge driver: - Propagate request_irq() failure (Wei Yongjun) Xilinx AXI host bridge driver: - Keep both legacy and MSI interrupt domain references (Bharat Kumar Gogada) - Clear interrupt register for invalid interrupt (Bharat Kumar Gogada) - Clear correct MSI set bit (Bharat Kumar Gogada) - Dispose of MSI virtual IRQ (Bharat Kumar Gogada) - Make explicitly non-modular (Paul Gortmaker) - Relax device number checking to allow SR-IOV (Po Liu) Xilinx NWL host bridge driver: - Expand error logging (Bharat Kumar Gogada) - Enable all MSI interrupts using MSI mask (Bharat Kumar Gogada) - Make explicitly non-modular (Paul Gortmaker) Miscellaneous: - Drop CONFIG_KEXEC_CORE ifdeffery (Lukas Wunner) - portdrv: Make explicitly non-modular (Paul Gortmaker) - Make DPC explicitly non-modular (Paul Gortmaker)" * tag 'pci-v4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (105 commits) x86/PCI: VMD: Move VMD driver to drivers/pci/host PCI: rockchip: Fix wrong transmitted FTS count PCI: rockchip: Improve the deassert sequence of four reset pins PCI: rockchip: Increase the Max Credit update interval PCI: rcar: Try increasing PCIe link speed to 5 GT/s at boot PCI/AER: Fix aer_probe() kernel-doc comment PCI: Ignore requested alignment for VF BARs PCI: Ignore requested alignment for PROBE_ONLY and fixed resources PCI: Avoid unnecessary resume after direct-complete PCI: Recognize D3cold in pci_update_current_state() PCI: Query platform firmware for device power state PCI: Afford direct-complete to devices with non-standard PM PCI/AER: Cache capability position PCI/AER: Avoid memory allocation in interrupt handling path x86/PCI: VMD: Request userspace control of PCIe hotplug indicators PCI: pciehp: Allow exclusive userspace control of indicators ACPI / APEI: Send correct severity to calculate AER severity PCI/AER: Remove duplicate AER severity translation x86/PCI: VMD: Synchronize with RCU freeing MSI IRQ descs x86/PCI: VMD: Eliminate index member from IRQ list ...
2016-10-07Merge tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/mdLinus Torvalds
Pull MD updates from Shaohua Li: "This update includes: - new AVX512 instruction based raid6 gen/recovery algorithm - a couple of md-cluster related bug fixes - fix a potential deadlock - set nonrotational bit for raid array with SSD - set correct max_hw_sectors for raid5/6, which hopefuly can improve performance a little bit - other minor fixes" * tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: md: set rotational bit raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays raid5: handle register_shrinker failure raid5: fix to detect failure of register_shrinker md: fix a potential deadlock md/bitmap: fix wrong cleanup raid5: allow arbitrary max_hw_sectors lib/raid6: Add AVX512 optimized xor_syndrome functions lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions lib/raid6: Add AVX512 optimized recovery functions lib/raid6: Add AVX512 optimized gen_syndrome functions md-cluster: make resync lock also could be interruptted md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang md-cluster: convert the completion to wait queue md-cluster: protect md_find_rdev_nr_rcu with rcu lock md-cluster: clean related infos of cluster md: changes for MD_STILL_CLOSED flag md-cluster: remove some unnecessary dlm_unlock_sync md-cluster: use FORCEUNLOCK in lockres_free md-cluster: call md_kick_rdev_from_array once ack failed
2016-10-07Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hpsa, be2iscsi, hisi_sas, zfcp, cxlflash). There's a new incarnation of hpsa called smartpqi for which a driver is added, there's some cleanup work of the ibm vscsi target and updates to libfc, plus a whole host of minor fixes and updates and finally the removal of several ISA drivers which seem not to have been used for years" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (173 commits) scsi: mvsas: Mark symbols static where possible scsi: pm8001: Mark symbols static where possible scsi: arcmsr: Simplify user_len checking scsi: fcoe: fix off by one in eth2fc_speed() scsi: dtc: remove from tree scsi: t128: remove from tree scsi: pas16: remove from tree scsi: u14-34f: remove from tree scsi: ultrastor: remove from tree scsi: in2000: remove from tree scsi: wd7000: remove from tree scsi: scsi_dh_alua: Fix memory leak in alua_rtpg() scsi: lpfc: Mark symbols static where possible scsi: hpsa: correct call to hpsa_do_reset scsi: ufs: Get a TM service response from the correct offset scsi: ibmvfc: Fix I/O hang when port is not mapped scsi: megaraid_sas: clean function declarations in megaraid_sas_base.c up scsi: ipr: Remove redundant messages at adapter init time scsi: ipr: Don't log unnecessary 9084 error details scsi: smartpqi: raid bypass lba calculation fix ...
2016-10-07libnvdimm, namespace: sort namespaces by dpa at initDan Williams
Add more determinism to initial namespace device-name assignments by sorting the namespaces by starting dpa. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-10-07libnvdimm, namespace: allow multiple pmem-namespaces per region at scan timeDan Williams
If label scanning finds multiple valid pmem namespaces allow them to be surfaced rather than fail namespace scanning. Support for creating multiple namespaces per region is saved for a later patch. Note that this adds some new error messages to clarify which of the pmem namespaces in the set are potentially impacted by invalid labels. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-10-07Merge tag 'mfd-for-linus-4.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "Core framework: - Add the MFD bindings doc to MAINTAINERS New drivers: - X-Powers AC100 Audio CODEC and RTC - TI LP873x PMIC - Rockchip RK808 PMIC - Samsung Exynos Low Power Audio New device support: - Add support for STMPE1600 variant to stmpe - Add support for PM8018 PMIC to pm8921-core - Add support for AXP806 PMIC in axp20x - Add support for AXP209 GPIO in axp20x New functionality: - Add support for Reset to all STMPE variants - Add support for MKBP event support to cros_ec - Add support for USB to intel_soc_pmic_bxtwc - Add support for IRQs and Power Button to tps65217 Fix-ups: - Clean-up defunct author emails (da9063, max14577) - Kconfig fixups (wm8350-i2c, as37220 - Constify (altera-a10sr, sm501) - Supply PCI IDs (intel-lpss-pci) - Improve clocking (qcom_rpm) - Fix IRQ probing (ucb1x00-core) - Ensure fault log is cleared (da9052) - Remove NO_IRQ check (ucb1x00-core) - Supply I2C properties (intel-lpss-acpi, intel-lpss-pci) - Non standard declaration (tps65217, max8997-irq) - Remove unused code (lp873x, db8500-prcmu, ab8500-debugfs, cros_ec_spi) - Make non-modular (altera-a10sr, intel_msic, smsc-ece1099, sun6i-prcm, twl-core) - OF bindings (ac100, stmpe, qcom-pm8xxx, qcom-rpm, rk808, axp20x, lp873x, exynos5433-lpass, act8945a, aspeed-scu, twl6040, arizona) Bugfixes: - Release OF pointer (qcom_rpm) - Avoid double shifting in suspend/resume (88pm80x) - Fix 'defined but not used' error (exynos-lpass) - Fix 'sleeping whilst attomic' (atmel-hlcdc)" * tag 'mfd-for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (69 commits) mfd: arizona: Handle probe deferral for reset GPIO mfd: arizona: Remove arizona_of_get_named_gpio helper function mfd: arizona: Add DT options for max_channels_clocked and PDM speaker config mfd: twl6040: Register child device for twl6040-pdmclk mfd: cros_ec_spi: Remove unused variable 'request' mfd: omap-usb-host: Return value is not 'const int' mfd: ab8500-debugfs: Remove 'weak' function suspend_test_wake_cause_interrupt_is_mine() mfd: ab8500-debugfs: Remove ab8500_dump_all_banks_to_mem() mfd: db8500-prcmu: Remove unused *prcmu_set_ddr_opp() calls mfd: ab8500-debugfs: Prevent initialised field from being over-written mfd: max8997-irq: 'inline' should be at the beginning of the declaration mfd: rk808: Fix RK818_IRQ_DISCHG_ILIM initializer mfd: tps65217: Fix nonstandard declaration mfd: lp873x: Remove unused mutex lock from struct lp873x mfd: atmel-hlcdc: Do not sleep in atomic context mfd: exynos-lpass: Mark PM functions as __maybe_unused mfd: intel-lpss: Add default I2C device properties for Apollo Lake mfd: twl-core: Make it explicitly non-modular mfd: sun6i-prcm: Make it explicitly non-modular mfd: smsc-ece1099: Make it explicitly non-modular ...
2016-10-07Merge branches 'for-4.8/upstream-fixes', 'for-4.9/alps', ↵Jiri Kosina
'for-4.9/hid-input', 'for-4.9/intel-ish', 'for-4.9/kye-uclogic-waltop-fixes', 'for-4.9/logitech', 'for-4.9/sony', 'for-4.9/upstream' and 'for-4.9/wacom' into for-linus
2016-10-06sockfs: Get rid of getxattr iopAndreas Gruenbacher
If we allow pseudo-filesystems created with mount_pseudo to have xattr handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler to use the xattr handler name parsing. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06Merge tag 'hsi-for-4.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi Pull HSI fix from Sebastian Reichel: "Fix hsi userspace header" * tag 'hsi-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi: HSI: hsi_char.h: use __u32 from linux/types.h