summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-13mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio()Wen Yang
Patch series "use div64_ul() instead of div_u64() if the divisor is unsigned long". We were first inspired by commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom () calculation"), then refer to the recently analyzed mm code, we found this suspicious place. 201 if (min) { 202 min *= this_bw; 203 do_div(min, tot_bw); 204 } And we also disassembled and confirmed it: /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 201 0xffffffff811c37da <__wb_calc_thresh+234>: xor %r10d,%r10d 0xffffffff811c37dd <__wb_calc_thresh+237>: test %rax,%rax 0xffffffff811c37e0 <__wb_calc_thresh+240>: je 0xffffffff811c3800 <__wb_calc_thresh+272> /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 202 0xffffffff811c37e2 <__wb_calc_thresh+242>: imul %r8,%rax /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 203 0xffffffff811c37e6 <__wb_calc_thresh+246>: mov %r9d,%r10d ---> truncates it to 32 bits here 0xffffffff811c37e9 <__wb_calc_thresh+249>: xor %edx,%edx 0xffffffff811c37eb <__wb_calc_thresh+251>: div %r10 0xffffffff811c37ee <__wb_calc_thresh+254>: imul %rbx,%rax 0xffffffff811c37f2 <__wb_calc_thresh+258>: shr $0x2,%rax 0xffffffff811c37f6 <__wb_calc_thresh+262>: mul %rcx 0xffffffff811c37f9 <__wb_calc_thresh+265>: shr $0x2,%rdx 0xffffffff811c37fd <__wb_calc_thresh+269>: mov %rdx,%r10 This series uses div64_ul() instead of div_u64() if the divisor is unsigned long, to avoid truncation to 32-bit on 64-bit platforms. This patch (of 3): The variables 'min' and 'max' are unsigned long and do_div truncates them to 32 bits, which means it can test non-zero and be truncated to zero for division. Fix this issue by using div64_ul() instead. Link: http://lkml.kernel.org/r/20200102081442.8273-2-wenyang@linux.alibaba.com Fixes: 693108a8a667 ("writeback: make bdi->min/max_ratio handling cgroup writeback aware") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Qian Cai <cai@lca.pw> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm, debug_pagealloc: don't rely on static keys too earlyVlastimil Babka
Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") has introduced a static key to reduce overhead when debug_pagealloc is compiled in but not enabled. It relied on the assumption that jump_label_init() is called before parse_early_param() as in start_kernel(), so when the "debug_pagealloc=on" option is parsed, it is safe to enable the static key. However, it turns out multiple architectures call parse_early_param() earlier from their setup_arch(). x86 also calls jump_label_init() even earlier, so no issue was found while testing the commit, but same is not true for e.g. ppc64 and s390 where the kernel would not boot with debug_pagealloc=on as found by our QA. To fix this without tricky changes to init code of multiple architectures, this patch partially reverts the static key conversion from 96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) of debug_pagealloc_enabled() will again test a simple bool variable. Fastpath mm code is converted to a new debug_pagealloc_enabled_static() variant that relies on the static key, which is enabled in a well-defined point in mm_init() where it's guaranteed that jump_label_init() has been called, regardless of architecture. [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early] Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz Fixes: 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm: memcg/slab: fix percpu slab vmstats flushingRoman Gushchin
Currently slab percpu vmstats are flushed twice: during the memcg offlining and just before freeing the memcg structure. Each time percpu counters are summed, added to the atomic counterparts and propagated up by the cgroup tree. The second flushing is required due to how recursive vmstats are implemented: counters are batched in percpu variables on a local level, and once a percpu value is crossing some predefined threshold, it spills over to atomic values on the local and each ascendant levels. It means that without flushing some numbers cached in percpu variables will be dropped on floor each time a cgroup is destroyed. And with uptime the error on upper levels might become noticeable. The first flushing aims to make counters on ancestor levels more precise. Dying cgroups may resume in the dying state for a long time. After kmem_cache reparenting which is performed during the offlining slab counters of the dying cgroup don't have any chances to be updated, because any slab operations will be performed on the parent level. It means that the inaccuracy caused by percpu batching will not decrease up to the final destruction of the cgroup. By the original idea flushing slab counters during the offlining should minimize the visible inaccuracy of slab counters on the parent level. The problem is that percpu counters are not zeroed after the first flushing. So every cached percpu value is summed twice. It creates a small error (up to 32 pages per cpu, but usually less) which accumulates on parent cgroup level. After creating and destroying of thousands of child cgroups, slab counter on parent level can be way off the real value. For now, let's just stop flushing slab counters on memcg offlining. It can't be done correctly without scheduling a work on each cpu: reading and zeroing it during css offlining can race with an asynchronous update, which doesn't expect values to be changed underneath. With this change, slab counters on parent level will become eventually consistent. Once all dying children are gone, values are correct. And if not, the error is capped by 32 * NR_CPUS pages per dying cgroup. It's not perfect, as slab are reparented, so any updates after the reparenting will happen on the parent level. It means that if a slab page was allocated, a counter on child level was bumped, then the page was reparented and freed, the annihilation of positive and negative counter values will not happen until the child cgroup is released. It makes slab counters different from others, and it might want us to implement flushing in a correct form again. But it's also a question of performance: scheduling a work on each cpu isn't free, and it's an open question if the benefit of having more accurate counters is worth it. We might also consider flushing all counters on offlining, not only slab counters. So let's fix the main problem now: make the slab counters eventually consistent, so at least the error won't grow with uptime (or more precisely the number of created and destroyed cgroups). And think about the accuracy of counters separately. Link: http://lkml.kernel.org/r/20191220042728.1045881-1-guro@fb.com Fixes: bee07b33db78 ("mm: memcontrol: flush percpu slab vmstats on kmem offlining") Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD ↵Kirill A. Shutemov
alignment Shmem/tmpfs tries to provide THP-friendly mappings if huge pages are enabled. But it doesn't work well with above-47bit hint address. Normally, the kernel doesn't create userspace mappings above 47-bit, even if the machine allows this (such as with 5-level paging on x86-64). Not all user space is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. Userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 47-bits. If the application doesn't need a particular address, but wants to allocate from whole address space it can specify -1 as a hint address. Unfortunately, this trick breaks THP alignment in shmem/tmp: shmem_get_unmapped_area() would not try to allocate PMD-aligned area if *any* hint address specified. This can be fixed by requesting the aligned area if the we failed to allocated at user-specified hint address. The request with inflated length will also take the user-specified hint address. This way we will not lose an allocation request from the full address space. [kirill@shutemov.name: fold in a fixup] Link: http://lkml.kernel.org/r/20191223231309.t6bh5hkbmokihpfu@box Link: http://lkml.kernel.org/r/20191220142548.7118-3-kirill.shutemov@linux.intel.com Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Willhalm, Thomas" <thomas.willhalm@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Bruggeman, Otto G" <otto.g.bruggeman@intel.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD ↵Kirill A. Shutemov
alignment Patch series "Fix two above-47bit hint address vs. THP bugs". The two get_unmapped_area() implementations have to be fixed to provide THP-friendly mappings if above-47bit hint address is specified. This patch (of 2): Filesystems use thp_get_unmapped_area() to provide THP-friendly mappings. For DAX in particular. Normally, the kernel doesn't create userspace mappings above 47-bit, even if the machine allows this (such as with 5-level paging on x86-64). Not all user space is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. Userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 47-bits. If the application doesn't need a particular address, but wants to allocate from whole address space it can specify -1 as a hint address. Unfortunately, this trick breaks thp_get_unmapped_area(): the function would not try to allocate PMD-aligned area if *any* hint address specified. Modify the routine to handle it correctly: - Try to allocate the space at the specified hint address with length padding required for PMD alignment. - If failed, retry without length padding (but with the same hint address); - If the returned address matches the hint address return it. - Otherwise, align the address as required for THP and return. The user specified hint address is passed down to get_unmapped_area() so above-47bit hint address will be taken into account without breaking alignment requirements. Link: http://lkml.kernel.org/r/20191220142548.7118-2-kirill.shutemov@linux.intel.com Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Thomas Willhalm <thomas.willhalm@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Bruggeman, Otto G" <otto.g.bruggeman@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/memory_hotplug: don't free usage map when removing a re-added early sectionDavid Hildenbrand
When we remove an early section, we don't free the usage map, as the usage maps of other sections are placed into the same page. Once the section is removed, it is no longer an early section (especially, the memmap is freed). When we re-add that section, the usage map is reused, however, it is no longer an early section. When removing that section again, we try to kfree() a usage map that was allocated during early boot - bad. Let's check against PageReserved() to see if we are dealing with an usage map that was allocated during boot. We could also check against !(PageSlab(usage_page) || PageCompound(usage_page)), but PageReserved() is cleaner. Can be triggered using memtrace under ppc64/powernv: $ mount -t debugfs none /sys/kernel/debug/ $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable ------------[ cut here ]------------ kernel BUG at mm/slub.c:3969! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=3D64K MMU=3DHash SMP NR_CPUS=3D2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 154 Comm: sh Not tainted 5.5.0-rc2-next-20191216-00005-g0be1dba7b7c0 #61 NIP kfree+0x338/0x3b0 LR section_deactivate+0x138/0x200 Call Trace: section_deactivate+0x138/0x200 __remove_pages+0x114/0x150 arch_remove_memory+0x3c/0x160 try_remove_memory+0x114/0x1a0 __remove_memory+0x20/0x40 memtrace_enable_set+0x254/0x850 simple_attr_write+0x138/0x160 full_proxy_write+0x8c/0x110 __vfs_write+0x38/0x70 vfs_write+0x11c/0x2a0 ksys_write+0x84/0x140 system_call+0x5c/0x68 ---[ end trace 4b053cbd84e0db62 ]--- The first invocation will offline+remove memory blocks. The second invocation will first add+online them again, in order to offline+remove them again (usually we are lucky and the exact same memory blocks will get "reallocated"). Tested on powernv with boot memory: The usage map will not get freed. Tested on x86-64 with DIMMs: The usage map will get freed. Using Dynamic Memory under a Power DLAPR can trigger it easily. Triggering removal (I assume after previously removed+re-added) of memory from the HMC GUI can crash the kernel with the same call trace and is fixed by this patch. Link: http://lkml.kernel.org/r/20191217104637.5509-1-david@redhat.com Fixes: 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY flag") Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Pingfan Liu <piliu@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm, thp: tweak reclaim/compaction effort of local-only and all-node allocationsVlastimil Babka
THP page faults now attempt a __GFP_THISNODE allocation first, which should only compact existing free memory, followed by another attempt that can allocate from any node using reclaim/compaction effort specified by global defrag setting and madvise. This patch makes the following changes to the scheme: - Before the patch, the first allocation relies on a check for pageblock order and __GFP_IO to prevent excessive reclaim. This however affects also the second attempt, which is not limited to single node. Instead of that, reuse the existing check for costly order __GFP_NORETRY allocations, and make sure the first THP attempt uses __GFP_NORETRY. As a side-effect, all costly order __GFP_NORETRY allocations will bail out if compaction needs reclaim, while previously they only bailed out when compaction was deferred due to previous failures. This should be still acceptable within the __GFP_NORETRY semantics. - Before the patch, the second allocation attempt (on all nodes) was passing __GFP_NORETRY. This is redundant as the check for pageblock order (discussed above) was stronger. It's also contrary to madvise(MADV_HUGEPAGE) which means some effort to allocate THP is requested. After this patch, the second attempt doesn't pass __GFP_THISNODE nor __GFP_NORETRY. To sum up, THP page faults now try the following attempts: 1. local node only THP allocation with no reclaim, just compaction. 2. for madvised VMA's or when synchronous compaction is enabled always - THP allocation from any node with effort determined by global defrag setting and VMA madvise 3. fallback to base pages on any node Link: http://lkml.kernel.org/r/08a3f4dd-c3ce-0009-86c5-9ee51aba8557@suse.cz Fixes: b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13Merge branch 'md-next' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.6/drivers Pull MD changes from Song. * 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid1: introduce wait_for_serialization md/raid1: use bucket based mechanism for IO serialization md: introduce a new struct for IO serialization md: don't destroy serial_info_pool if serialize_policy is true raid1: serialize the overlap write md: reorgnize mddev_create/destroy_serial_pool md: add serialize_policy sysfs node for raid1 md: prepare for enable raid1 io serialization md: fix a typo s/creat/create md: rename wb stuffs raid5: remove worker_cnt_per_group argument from alloc_thread_groups md/raid6: fix algorithm choice under larger PAGE_SIZE raid6/test: fix a compilation warning raid6/test: fix a compilation error md-bitmap: small cleanups
2020-01-13btrfs: relocation: fix reloc_root lifespan and accessQu Wenruo
[BUG] There are several different KASAN reports for balance + snapshot workloads. Involved call paths include: should_ignore_root+0x54/0xb0 [btrfs] build_backref_tree+0x11af/0x2280 [btrfs] relocate_tree_blocks+0x391/0xb80 [btrfs] relocate_block_group+0x3e5/0xa00 [btrfs] btrfs_relocate_block_group+0x240/0x4d0 [btrfs] btrfs_relocate_chunk+0x53/0xf0 [btrfs] btrfs_balance+0xc91/0x1840 [btrfs] btrfs_ioctl_balance+0x416/0x4e0 [btrfs] btrfs_ioctl+0x8af/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 create_reloc_root+0x9f/0x460 [btrfs] btrfs_reloc_post_snapshot+0xff/0x6c0 [btrfs] create_pending_snapshot+0xa9b/0x15f0 [btrfs] create_pending_snapshots+0x111/0x140 [btrfs] btrfs_commit_transaction+0x7a6/0x1360 [btrfs] btrfs_mksubvol+0x915/0x960 [btrfs] btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs] btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs] btrfs_ioctl+0x241b/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 btrfs_reloc_pre_snapshot+0x85/0xc0 [btrfs] create_pending_snapshot+0x209/0x15f0 [btrfs] create_pending_snapshots+0x111/0x140 [btrfs] btrfs_commit_transaction+0x7a6/0x1360 [btrfs] btrfs_mksubvol+0x915/0x960 [btrfs] btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs] btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs] btrfs_ioctl+0x241b/0x3e60 [btrfs] do_vfs_ioctl+0x831/0xb10 [CAUSE] All these call sites are only relying on root->reloc_root, which can undergo btrfs_drop_snapshot(), and since we don't have real refcount based protection to reloc roots, we can reach already dropped reloc root, triggering KASAN. [FIX] To avoid such access to unstable root->reloc_root, we should check BTRFS_ROOT_DEAD_RELOC_TREE bit first. This patch introduces wrappers that provide the correct way to check the bit with memory barriers protection. Most callers don't distinguish merged reloc tree and no reloc tree. The only exception is should_ignore_root(), as merged reloc tree can be ignored, while no reloc tree shouldn't. [CRITICAL SECTION ANALYSIS] Although test_bit()/set_bit()/clear_bit() doesn't imply a barrier, the DEAD_RELOC_TREE bit has extra help from transaction as a higher level barrier, the lifespan of root::reloc_root and DEAD_RELOC_TREE bit are: NULL: reloc_root is NULL PTR: reloc_root is not NULL 0: DEAD_RELOC_ROOT bit not set DEAD: DEAD_RELOC_ROOT bit set (NULL, 0) Initial state __ | /\ Section A btrfs_init_reloc_root() \/ | __ (PTR, 0) reloc_root initialized /\ | | btrfs_update_reloc_root() | Section B | | (PTR, DEAD) reloc_root has been merged \/ | __ === btrfs_commit_transaction() ==================== | /\ clean_dirty_subvols() | | | Section C (NULL, DEAD) reloc_root cleanup starts \/ | __ btrfs_drop_snapshot() /\ | | Section D (NULL, 0) Back to initial state \/ Every have_reloc_root() or test_bit(DEAD_RELOC_ROOT) caller holds transaction handle, so none of such caller can cross transaction boundary. In Section A, every caller just found no DEAD bit, and grab reloc_root. In the cross section A-B, caller may get no DEAD bit, but since reloc_root is still completely valid thus accessing reloc_root is completely safe. No test_bit() caller can cross the boundary of Section B and Section C. In Section C, every caller found the DEAD bit, so no one will access reloc_root. In the cross section C-D, either caller gets the DEAD bit set, avoiding access reloc_root no matter if it's safe or not. Or caller get the DEAD bit cleared, then access reloc_root, which is already NULL, nothing will be wrong. The memory write barriers are between the reloc_root updates and bit set/clear, the pairing read side is before test_bit. Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ barriers ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-13md/raid1: introduce wait_for_serializationGuoqing Jiang
Previously, we call check_and_add_serial when serialization is enabled for write IO, but it could allocate and free memory back and forth. Now, let's just get an element from memory pool with the new function, then insert node to rb tree if no collision happens. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md/raid1: use bucket based mechanism for IO serializationGuoqing Jiang
Since raid1 had already used bucket based mechanism to reduce the conflict between write IO and resync IO, it is possible to speed up performance for io serialization with refer to the same mechanism. To align with the barrier bucket mechanism, we created arrays (with the same number of BARRIER_BUCKETS_NR) for spinlock, rb tree and waitqueue. Then we can reduce lock competition with multiple spinlocks, boost search performance with multiple rb trees and also reduce thundering herd problem with multiple waitqueues. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: introduce a new struct for IO serializationGuoqing Jiang
Obviously, IO serialization could cause the degradation of performance a lot. In order to reduce the degradation, so a rb interval tree is added in raid1 to speed up the check of collision. So, a rb root is needed in md_rdev, then abstract all the serialize related members to a new struct (serial_in_rdev), embed it into md_rdev. Of course, we need to free the struct if it is not needed anymore, so rdev/rdevs_uninit_serial are added accordingly. And they should be called when destroty memory pool or can't alloc memory. And we need to consider to call mddev_destroy_serial_pool in case serialize_policy/write-behind is disabled, bitmap is destroyed or in __md_stop_writes. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: don't destroy serial_info_pool if serialize_policy is trueGuoqing Jiang
The serial_info_pool is needed if array sets serialize_policy to true, so don't destroy it. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid1: serialize the overlap writeGuoqing Jiang
Before dispatch write bio, raid1 array which enables serialize_policy need to check if overlap exists between this bio and previous on-flying bios. If there is overlap, then it has to wait until the collision is disappeared. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: reorgnize mddev_create/destroy_serial_poolGuoqing Jiang
So far, IO serialization is used for two scenarios: 1. raid1 which enables write-behind mode, and there is rdev in the array which is multi-queue device and flaged with writemostly. 2. IO serialization is enabled or disabled by change serialize_policy. So introduce rdev_need_serial to check the first scenario. And for 1, IO serialization is enabled automatically while 2 is controlled manually. And it is possible that both scenarios are true, so for create serial pool, rdev/rdevs_init_serial should be separate from check if the pool existed or not. Then for destroy pool, we need to check if the pool is needed by other rdevs due to the first scenario. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: add serialize_policy sysfs node for raid1Guoqing Jiang
With the new sysfs node, we can use it to control if raid1 array wants io serialization or not. So mddev_create_serial_pool and mddev_destroy_serial_pool are called in serialize_policy_store to enable or disable the serialization. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: prepare for enable raid1 io serializationGuoqing Jiang
1. The related resources (spin_lock, list and waitqueue) are needed for address raid1 reorder overlap issue too, in this case, rdev is set to NULL for mddev_create/destroy_serial_pool which implies all rdevs need to handle these resources. And also add "is_suspend" to mddev_destroy_serial_pool since it will be called under suspended situation, which also makes both create and destroy pool have same arguments. 2. Introduce rdevs_init_serial which is called if raid1 io serialization is enabled since all rdevs need to init related stuffs. 3. rdev_init_serial and clear_bit(CollisionCheck, &rdev->flags) should be called between suspend and resume. No need to export mddev_create_serial_pool since it is only called in md-mod module. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: fix a typo s/creat/createGuoqing Jiang
It actually means create here, so fix the typo. Reported-by: Song Liu <liu.song.a23@gmail.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md: rename wb stuffsGuoqing Jiang
Previously, wb_info_pool and wb_list stuffs are introduced to address potential data inconsistence issue for write behind device. Now rename them to serial related name, since the same mechanism will be used to address reorder overlap write issue for raid1. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid5: remove worker_cnt_per_group argument from alloc_thread_groupsGuoqing Jiang
We can use "cnt" directly to update conf->worker_cnt_per_group if alloc_thread_groups returns 0. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md/raid6: fix algorithm choice under larger PAGE_SIZEZhengyuan Liu
There are several algorithms available for raid6 to generate xor and syndrome parity, including basic int1, int2 ... int32 and SIMD optimized implementation like sse and neon. To test and choose the best algorithms at the initial stage, we need provide enough disk data to feed the algorithms. However, the disk number we provided depends on page size and gfmul table, seeing bellow: const int disks = (65536/PAGE_SIZE) + 2; So when come to 64K PAGE_SIZE, there is only one data disk plus 2 parity disk, as a result the chosed algorithm is not reliable. For example, on my arm64 machine with 64K page enabled, it will choose intx32 as the best one, although the NEON implementation is better. This patch tries to fix the problem by defining a constant raid6 disk number to supporting arbitrary page size. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid6/test: fix a compilation warningZhengyuan Liu
The compilation warning is redefination showed as following: In file included from tables.c:2: ../../../include/linux/export.h:180: warning: "EXPORT_SYMBOL" redefined #define EXPORT_SYMBOL(sym) __EXPORT_SYMBOL(sym, "") In file included from tables.c:1: ../../../include/linux/raid/pq.h:61: note: this is the location of the previous definition #define EXPORT_SYMBOL(sym) Fixes: 69a94abb82ee ("export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13raid6/test: fix a compilation errorZhengyuan Liu
The compilation error is redeclaration showed as following: In file included from ../../../include/linux/limits.h:6, from /usr/include/x86_64-linux-gnu/bits/local_lim.h:38, from /usr/include/x86_64-linux-gnu/bits/posix1_lim.h:161, from /usr/include/limits.h:183, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h:194, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/syslimits.h:7, from /usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h:34, from ../../../include/linux/raid/pq.h:30, from algos.c:14: ../../../include/linux/types.h:114:15: error: conflicting types for ‘int64_t’ typedef s64 int64_t; ^~~~~~~ In file included from /usr/include/stdint.h:34, from /usr/lib/gcc/x86_64-linux-gnu/8/include/stdint.h:9, from /usr/include/inttypes.h:27, from ../../../include/linux/raid/pq.h:29, from algos.c:14: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous \ declaration of ‘int64_t’ was here typedef __int64_t int64_t; Fixes: 54d50897d544 ("linux/kernel.h: split *_MAX and *_MIN macros into <linux/limits.h>") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13md-bitmap: small cleanupsZhiqiang Liu
In md_bitmap_unplug, bitmap->storage.filemap is double checked. In md_bitmap_daemon_work, bitmap->storage.filemap should be checked before reference. Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2020-01-13Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add missed "cpld4_version" attribute. Fixes: 52675da1d087 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs ↵Vadim Pasternak
interfaces Fix attribute name from "jtag_enable", which described twice to "cpld3_version", which is expected to be instead of second appearance of "jtag_enable". Fixes: 2752e34442b5 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for next generation systemsVadim Pasternak
Add support for new Mellanox system types of basic class VMOD0010, containing new Mellanox systems equipped with new switch device Spectrum 3 (32x400GbE/64x200G/128x100G Ethernet switch). These are the Top of the Rack 1U/2U/4U systems, equipped with Mellanox Comex card and with the switch board with Mellanox Spectrum-3 device. This class of devices can be equipped with two PS units for 1U/2U or with four PS units for 4U systems. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/mellanox: mlxreg-hotplug: Add support for new capability registerVadim Pasternak
Add support for capability register, which is used for detection of the actual number of interrupt capable components within the particular group, supported by the specific system. Such components could be for example the number of power units and interrupts related to these units. The motivation is to avoid adding a new code in the future in order to distinct between the systems type supported different number of the components like power supplies, FANs, ASICs, line cards. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for new capability registerVadim Pasternak
Add support for capability register, which contains information about the number of PS units equipped on the system and about minimum I2C frequency supported by the all system's I2C devices. Utilization of this register allows to avoid necessity of providing new system description, in case it differs in number of PS units or in minimal I2C frequency. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add support for new system typeVadim Pasternak
Add support for new Mellanox system types of basic class VMOD0009, containing Mellanox systems equipped with the switch devices Spectrum 1 (32x100GbE Ethernet switch) and Switch-IB/Switch-IB2 (36x100Gbe InfiniBand switch). These are the Top of the Rack system, equipped with Mellanox Comex card. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Set system mux configuration based on system typeVadim Pasternak
Separate assignment for systems mux configuration based on system type, instead of setting the same configuration for the all. The motivation is to allow introduction of new systems types with the different mux topology. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Add new attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add documentation for the new attributes for: - Exposing reset causes types asserted by: platform reset, SoC reset, AC power failure, software power off request. - Setting and removing system VPD (EEPROM) hardware write protection. - Voltage regulator devices configuration update status and firmware version. - Setting PCIe ASIC reset to disable or enable state during PCIe root complex reset. - System static topology identification, like system's static I2C topology, number and type of FPGA devices within the system and so on. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Add more definitions for system attributesVadim Pasternak
Add new attributes for "next-generation" type systems: - Reset cause indication, when system reset has been caused by the platform reset request through CPLD, by AC power failure, by software power off request through CPLD. by signal asserted by SOC through ACPI register. It introduces more reset causes, which can be monitored after the reboots. - Setting and removing system VPD (EEPROM) hardware write protection. It allows to access VPD during production cycle and prevents VPD corruption on system in field. - Voltage regulator devices configuration update status and version. It allows to monitor configuration update status and current active version for such sort of devices. - PCIe ASIC reset disable - when set ASIC will go down upon PCIe root complex reset, otherwise ASIC will stay up during PCIe root complex reset. - System configuration Ids to provide system static topology identification, like system's static I2C topology, number and type of FPGA devices within the system and so on. Add the existing attribute for "iio" bank selection for system type "msn21xx". All the above attributes are exposed through "sysfs" by "mlxreg-io" driver. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Style changesVadim Pasternak
Remove blank lines between "What" and "Date" keywords. Start each section with "What" keyword. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add missed "cpld4_version" attribute. Fixes: 52675da1d087 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs ↵Vadim Pasternak
interfaces Fix attribute name from "jtag_enable", which described twice to "cpld3_version", which is expected to be instead of second appearance of "jtag_enable". Fixes: 2752e34442b5 ("Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: mlx-platform: Cosmetic changesVadim Pasternak
Remove redundant semicolons at the end of few functions. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13MAINTAINERS: Update for the intel uncore frequency controlSrinivas Pandruvada
Add an entry for drivers/platform/x86/intel-uncore-frequency.c. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13platform/x86: Add support for Uncore frequency controlSrinivas Pandruvada
Some server users set limits on the uncore frequency using MSR 620H, while running latency sensitive workloads. Here uncore frequency controls RING/LLC(last-level cache) clocks. But MSR control is not always possible from the user space, so this driver provides a sysfs interface to set max and min frequency limits. This MSR 620H is a die scoped in multi-die system or package scoped in non multi-die systems. When this driver is loaded, a new directory is created under /sys/devices/system/cpu. For example on a two package Skylake server: $cd /sys/devices/system/cpu/intel_uncore_frequency $ls package_00_die_00 package_01_die_00 $ls package_00_die_00 max_freq_khz min_freq_khz initial_max_freq_khz initial_min_freq_khz $grep . * max_freq_khz:2400000 min_freq_khz:1200000 initial_max_freq_khz:2400000 initial_min_freq_khz:1200000 Here, initial_max_freq_khz and initial_min_freq_khz are read only attributes to show power up or initial values of max and min frequencies respectively. Other attributes are read-write, so that users can modify. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-01-13netfilter: arp_tables: init netns pointer in xt_tgdtor_param structFlorian Westphal
An earlier commit (1b789577f655060d98d20e, "netfilter: arp_tables: init netns pointer in xt_tgchk_param struct") fixed missing net initialization for arptables, but turns out it was incomplete. We can get a very similar struct net NULL deref during error unwinding: general protection fault: 0000 [#1] PREEMPT SMP KASAN RIP: 0010:xt_rateest_put+0xa1/0x440 net/netfilter/xt_RATEEST.c:77 xt_rateest_tg_destroy+0x72/0xa0 net/netfilter/xt_RATEEST.c:175 cleanup_entry net/ipv4/netfilter/arp_tables.c:509 [inline] translate_table+0x11f4/0x1d80 net/ipv4/netfilter/arp_tables.c:587 do_replace net/ipv4/netfilter/arp_tables.c:981 [inline] do_arpt_set_ctl+0x317/0x650 net/ipv4/netfilter/arp_tables.c:1461 Also init the netns pointer in xt_tgdtor_param struct. Fixes: add67461240c1d ("netfilter: add struct net * to target parameters") Reported-by: syzbot+91bdd8eece0f6629ec8b@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-13Merge tag 'sunxi-clk-fixes-for-5.5' of ↵Stephen Boyd
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes Pull Allwinner clk fixes from Maxime Ripard: Our usual set of fixes for Allwinner, to fix the number of reported clocks on the v3s, fixing the external clock on the R40, and some fixes for the AR100 co-processor clocks. * tag 'sunxi-clk-fixes-for-5.5' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: clk: sunxi-ng: h6-r: Fix AR100/R_APB2 parent order clk: sunxi-ng: h6-r: Simplify R_APB1 clock definition clk: sunxi-ng: sun8i-r: Fix divider on APB0 clock clk: sunxi-ng: r40: Allow setting parent rate for external clock outputs clk: sunxi-ng: v3s: Fix incorrect number of hw_clks.
2020-01-13netfilter: fix a use-after-free in mtype_destroy()Cong Wang
map->members is freed by ip_set_free() right before using it in mtype_ext_cleanup() again. So we just have to move it down. Reported-by: syzbot+4c3cc6dbe7259dbf9054@syzkaller.appspotmail.com Fixes: 40cd63bf33b2 ("netfilter: ipset: Support extensions which need a per data destroy function") Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-13ARM: dts: am335x-boneblack-common: fix memory sizeMatwey V. Kornilov
BeagleBone Black series is equipped with 512MB RAM whereas only 256MB is included from am335x-bone-common.dtsi This leads to an issue with unusual setups when devicetree is loaded by GRUB2 directly. Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru> Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-01-13irqchip/ingenic: Get rid of the legacy IRQ domainPaul Cercueil
Get rid of the legacy IRQ domain and hardcoded IRQ base, since all the Ingenic drivers and platform code have been updated to use devicetree. This also fixes the kernel being flooded with messages like: irq: interrupt-controller@10001000 didn't like hwirq-0x0 to VIRQ8 mapping (rc=-19) Fixes: 8bc7464b5140 ("irqchip: ingenic: Alloc generic chips from IRQ domain"). Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200113163329.34282-2-paul@crapouillou.net
2020-01-13USB: serial: option: Add support for Quectel RM500QKristian Evensen
RM500Q is a 5G module from Quectel, supporting both standalone and non-standalone modes. Unlike other recent Quectel modems, it is possible to identify the diagnostic interface (bInterfaceProtocol is unique). Thus, there is no need to check for the number of endpoints or reserve interfaces. The interface number is still dynamic though, so matching on interface number is not possible and two entries have to be added to the table. Output from usb-devices with all interfaces enabled (order is diag, nmea, at_port, modem, rmnet and adb): Bus 004 Device 007: ID 2c7c:0800 Quectel Wireless Solutions Co., Ltd. Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 3.20 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 9 idVendor 0x2c7c Quectel Wireless Solutions Co., Ltd. idProduct 0x0800 bcdDevice 4.14 iManufacturer 1 Quectel iProduct 2 LTE-A Module iSerial 3 40046d60 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 328 bNumInterfaces 6 bConfigurationValue 1 iConfiguration 4 DIAG_SER_RMNET bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 224mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 48 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x85 EP 5 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x84 EP 4 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x03 EP 3 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 3 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 ** UNRECOGNIZED: 05 24 00 10 01 ** UNRECOGNIZED: 05 24 01 00 00 ** UNRECOGNIZED: 04 24 02 02 ** UNRECOGNIZED: 05 24 06 00 00 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x87 EP 7 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x86 EP 6 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x04 EP 4 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 4 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 255 Vendor Specific Protocol iInterface 5 CDEV Serial Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x88 EP 8 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 9 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x8e EP 14 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 6 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x0f EP 15 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 2 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 5 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 66 bInterfaceProtocol 1 iInterface 6 ADB Interface Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x05 EP 5 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x89 EP 9 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 0 Binary Object Store Descriptor: bLength 5 bDescriptorType 15 wTotalLength 42 bNumDeviceCaps 3 USB 2.0 Extension Device Capability: bLength 7 bDescriptorType 16 bDevCapabilityType 2 bmAttributes 0x00000006 Link Power Management (LPM) Supported SuperSpeed USB Device Capability: bLength 10 bDescriptorType 16 bDevCapabilityType 3 bmAttributes 0x00 wSpeedsSupported 0x000f Device can operate at Low Speed (1Mbps) Device can operate at Full Speed (12Mbps) Device can operate at High Speed (480Mbps) Device can operate at SuperSpeed (5Gbps) bFunctionalitySupport 1 Lowest fully-functional device speed is Full Speed (12Mbps) bU1DevExitLat 1 micro seconds bU2DevExitLat 500 micro seconds ** UNRECOGNIZED: 14 10 0a 00 01 00 00 00 00 11 00 00 30 40 0a 00 b0 40 0a 00 Device Status: 0x0000 (Bus Powered) Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2020-01-13ASoC: msm8916-wcd-digital: Reset RX interpolation path after useStephan Gerhold
For some reason, attempting to route audio through QDSP6 on MSM8916 causes the RX interpolation path to get "stuck" after playing audio a few times. In this situation, the analog codec part is still working, but the RX path in the digital codec stops working, so you only hear the analog parts powering up. After a reboot everything works again. So far I was not able to reproduce the problem when using lpass-cpu. The downstream kernel driver avoids this by resetting the RX interpolation path after use. In mainline we do something similar for the TX decimator (LPASS_CDC_CLK_TX_RESET_B1_CTL), but the interpolator reset (LPASS_CDC_CLK_RX_RESET_CTL) got lost when the msm8916-wcd driver was split into analog and digital. Fix this problem by adding the reset to msm8916_wcd_digital_enable_interpolator(). Fixes: 150db8c5afa1 ("ASoC: codecs: Add msm8916-wcd digital codec") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200105102753.83108-1-stephan@gerhold.net Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-13ASoC: msm8916-wcd-analog: Fix MIC BIAS Internal1Stephan Gerhold
MIC BIAS Internal1 is broken at the moment because we always enable the internal rbias resistor to the TX2 line (connected to the headset microphone), rather than enabling the resistor connected to TX1. Move the RBIAS code to pm8916_wcd_analog_enable_micbias_int1/2() to fix this. Fixes: 585e881e5b9e ("ASoC: codecs: Add msm8916-wcd analog codec") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200111164006.43074-3-stephan@gerhold.net Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-13ASoC: cros_ec_codec: Make the device acpi compatibleYu-Hsuan Hsu
Add ACPI entry for cros_ec_codec. Signed-off-by: Yu-Hsuan Hsu <yuhsuan@chromium.org> Link: https://lore.kernel.org/r/20200112054900.236576-1-yuhsuan@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-13ASoC: sti: fix possible sleep-in-atomicArnaud Pouliquen
Change mutex and spinlock management to avoid sleep in atomic issue. Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@st.com> Link: https://lore.kernel.org/r/20200113100400.30472-1-arnaud.pouliquen@st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-13ASoC: msm8916-wcd-analog: Fix selected events for MIC BIAS External1Stephan Gerhold
MIC BIAS External1 sets pm8916_wcd_analog_enable_micbias_ext1() as event handler, which ends up in pm8916_wcd_analog_enable_micbias_ext(). But pm8916_wcd_analog_enable_micbias_ext() only handles the POST_PMU event, which is not specified in the event flags for MIC BIAS External1. This means that the code in the event handler is never actually run. Set SND_SOC_DAPM_POST_PMU as the only event for the handler to fix this. Fixes: 585e881e5b9e ("ASoC: codecs: Add msm8916-wcd analog codec") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200111164006.43074-2-stephan@gerhold.net Signed-off-by: Mark Brown <broonie@kernel.org>