summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-17nvme-pci: refresh visible attrs for cmb attributesKeith Busch
The sysfs group containing the cmb attributes is registered before the driver knows if they need to be visible or not. Update the group when cmb attributes are known to exist so the visibility setting is correct. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-16mm: memcontrol: rename memcg_kmem_enabled()Roman Gushchin
Currently there are two kmem-related helper functions with a confusing semantics: memcg_kmem_enabled() and mem_cgroup_kmem_disabled(). The problem is that an obvious expectation memcg_kmem_enabled() == !mem_cgroup_kmem_disabled(), can be false. mem_cgroup_kmem_disabled() is similar to mem_cgroup_disabled(): it returns true only if CONFIG_MEMCG_KMEM is not set or the kmem accounting is disabled using a boot time kernel option "cgroup.memory=nokmem". It never changes the value dynamically. memcg_kmem_enabled() is different: it always returns false until the first non-root memory cgroup will get online (assuming the kernel memory accounting is enabled). It's goal is to improve the performance on systems without the cgroupfs mounted/memory controller enabled or on the systems with only the root memory cgroup. To make things more obvious and avoid potential bugs, let's rename memcg_kmem_enabled() to memcg_kmem_online(). Link: https://lkml.kernel.org/r/20230213192922.1146370-1-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Dennis Zhou <dennis@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16sh: initialize max_mapnrMike Rapoport (IBM)
sh never initializes max_mapnr which is used by the generic implementation of pfn_valid(). Initialize max_mapnr with set_max_mapnr() in sh::paging_init(). Link: https://lkml.kernel.org/r/20230214140729.1649961-3-rppt@kernel.org Fixes: e5080a967785 ("mm, arch: add generic implementation of pfn_valid() for FLATMEM") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16m68k/nommu: add missing definition of ARCH_PFN_OFFSETMike Rapoport (IBM)
Patch series "fixups for generic implementation of pfn_valid()". Guenter reported boot failures on m68k-nommu and sh caused by the switch to the generic implementation of pfn_valid(): https://lore.kernel.org/all/20230212173513.GA4052259@roeck-us.net https://lore.kernel.org/all/20230212161320.GA3784076@roeck-us.net These are small fixups that address the issues. This patch (of 2): On m68k/nommu RAM does not necessarily start at 0x0 and when it does not pfn_valid() uses a wrong offset into the memory map which causes silent boot failures. Define ARCH_PFN_OFFSET to make pfn_valid() use the correct offset. Link: https://lkml.kernel.org/r/20230214140729.1649961-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20230214140729.1649961-2-rppt@kernel.org Fixes: d82f07f06cf8 ("m68k: use asm-generic/memory_model.h for both MMU and !MMU") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm: percpu: fix incorrect size in pcpu_obj_full_size()Yafang Shao
The extra space which is used to store the obj_cgroup membership is only valid when kmemcg is enabled. The kmemcg can be disabled via the kernel parameter "cgroup.memory=nokmem" at boot time. This helper is also used in non-memcg code, for example the tracepoint, so we should fix it. It was found by code review when I was implementing bpf memory usage[1]. No real issue happens in production environment. [1]. https://lwn.net/Articles/921991/ Link: https://lkml.kernel.org/r/20230214153549.12291-1-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Vasily Averin <vvs@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16maple_tree: reduce stack usage with gcc-9 and earlierArnd Bergmann
gcc-10 changed the way inlining works to be less aggressive, but older versions run into an oversized stack frame warning whenever CONFIG_KASAN_STACK is enabled, as that forces variables from inlined callees to be non-overlapping: lib/maple_tree.c: In function 'mas_wr_bnode': lib/maple_tree.c:4320:1: error: the frame size of 1424 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Change the annotations on mas_store_b_node() and mas_commit_b_node() to explicitly forbid inlining in this configuration, which is the same behavior that newer versions already have. Link: https://lkml.kernel.org/r/20230214103030.1051950-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm: page_alloc: call panic() when memoryless node allocation failsQi Zheng
In free_area_init(), we will continue to run after allocation of memoryless node pgdat fails. However, in the subsequent process (such as when initializing zonelist), the case that NODE_DATA(nid) is NULL is not handled, which will cause panic. Instead of this, it's better to call panic() directly when the memory allocation fails during system boot. Link: https://lkml.kernel.org/r/20230212111027.95520-1-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm: multi-gen LRU: avoid futile retriesYu Zhao
Recall that the per-node memcg LRU has two generations and they alternate when the last memcg (of a given node) is moved from one to the other. Each generation is also sharded into multiple bins to improve scalability. A reclaimer starts with a random bin (in the old generation) and, if it fails, it will retry, i.e., to try the rest of the bins. If a reclaimer fails with the last memcg, it should move this memcg to the young generation first, which causes the generations to alternate, and then retry. Otherwise, the retries will be futile because all other bins are empty. Link: https://lkml.kernel.org/r/20230213075322.1416966-1-yuzhao@google.com Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: T.J. Mercier <tjmercier@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: move THP/hugetlb migration support check to simplify codeHuang Ying
This is a code cleanup patch, no functionality change is expected. After the change, the line number reduces especially in the long migrate_pages_batch(). Link: https://lkml.kernel.org/r/20230213123444.155149-10-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Suggested-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: batch flushing TLBHuang Ying
The TLB flushing will cost quite some CPU cycles during the folio migration in some situations. For example, when migrate a folio of a process with multiple active threads that run on multiple CPUs. After batching the _unmap and _move in migrate_pages(), the TLB flushing can be batched easily with the existing TLB flush batching mechanism. This patch implements that. We use the following test case to test the patch. On a 2-socket Intel server, - Run pmbench memory accessing benchmark - Run `migratepages` to migrate pages of pmbench between node 0 and node 1 back and forth. With the patch, the TLB flushing IPI reduces 99.1% during the test and the number of pages migrated successfully per second increases 291.7%. Haoxin helped to test the patchset on an ARM64 server with 128 cores, 2 NUMA nodes. Test results show that the page migration performance increases up to 78%. NOTE: TLB flushing is batched only for normal folios, not for THP folios. Because the overhead of TLB flushing for THP folios is much lower than that for normal folios (about 1/512 on x86 platform). Link: https://lkml.kernel.org/r/20230213123444.155149-9-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Xin Hao <xhao@linux.alibaba.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: share more code between _unmap and _moveHuang Ying
This is a code cleanup patch to reduce the duplicated code between the _unmap and _move stages of migrate_pages(). No functionality change is expected. Link: https://lkml.kernel.org/r/20230213123444.155149-8-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: move migrate_folio_unmap()Huang Ying
Just move the position of the functions. There's no any functionality change. This is to make it easier to review the next patch via putting code near its position in the next patch. Link: https://lkml.kernel.org/r/20230213123444.155149-7-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: batch _unmap and _moveHuang Ying
In this patch the _unmap and _move stage of the folio migration is batched. That for, previously, it is, for each folio _unmap() _move() Now, it is, for each folio _unmap() for each folio _move() Based on this, we can batch the TLB flushing and use some hardware accelerator to copy folios between batched _unmap and batched _move stages. Link: https://lkml.kernel.org/r/20230213123444.155149-6-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: split unmap_and_move() to _unmap() and _move()Huang Ying
This is a preparation patch to batch the folio unmapping and moving. In this patch, unmap_and_move() is split to migrate_folio_unmap() and migrate_folio_move(). So, we can batch _unmap() and _move() in different loops later. To pass some information between unmap and move, the original unused dst->mapping and dst->private are used. Link: https://lkml.kernel.org/r/20230213123444.155149-5-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Xin Hao <xhao@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: restrict number of pages to migrate in batchHuang Ying
This is a preparation patch to batch the folio unmapping and moving for non-hugetlb folios. If we had batched the folio unmapping, all folios to be migrated would be unmapped before copying the contents and flags of the folios. If the folios that were passed to migrate_pages() were too many in unit of pages, the execution of the processes would be stopped for too long time, thus too long latency. For example, migrate_pages() syscall will call migrate_pages() with all folios of a process. To avoid this possible issue, in this patch, we restrict the number of pages to be migrated to be no more than HPAGE_PMD_NR. That is, the influence is at the same level of THP migration. Link: https://lkml.kernel.org/r/20230213123444.155149-4-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Xin Hao <xhao@linux.alibaba.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: separate hugetlb folios migrationHuang Ying
This is a preparation patch to batch the folio unmapping and moving for the non-hugetlb folios. Based on that we can batch the TLB shootdown during the folio migration and make it possible to use some hardware accelerator for the folio copying. In this patch the hugetlb folios and non-hugetlb folios migration is separated in migrate_pages() to make it easy to change the non-hugetlb folios migration implementation. Link: https://lkml.kernel.org/r/20230213123444.155149-3-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Xin Hao <xhao@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16migrate_pages: organize stats with struct migrate_pages_statsHuang Ying
Patch series "migrate_pages(): batch TLB flushing", v5. Now, migrate_pages() migrates folios one by one, like the fake code as follows, for each folio unmap flush TLB copy restore map If multiple folios are passed to migrate_pages(), there are opportunities to batch the TLB flushing and copying. That is, we can change the code to something as follows, for each folio unmap for each folio flush TLB for each folio copy for each folio restore map The total number of TLB flushing IPI can be reduced considerably. And we may use some hardware accelerator such as DSA to accelerate the folio copying. So in this patch, we refactor the migrate_pages() implementation and implement the TLB flushing batching. Base on this, hardware accelerated folio copying can be implemented. If too many folios are passed to migrate_pages(), in the naive batched implementation, we may unmap too many folios at the same time. The possibility for a task to wait for the migrated folios to be mapped again increases. So the latency may be hurt. To deal with this issue, the max number of folios be unmapped in batch is restricted to no more than HPAGE_PMD_NR in the unit of page. That is, the influence is at the same level of THP migration. We use the following test to measure the performance impact of the patchset, On a 2-socket Intel server, - Run pmbench memory accessing benchmark - Run `migratepages` to migrate pages of pmbench between node 0 and node 1 back and forth. With the patch, the TLB flushing IPI reduces 99.1% during the test and the number of pages migrated successfully per second increases 291.7%. Xin Hao helped to test the patchset on an ARM64 server with 128 cores, 2 NUMA nodes. Test results show that the page migration performance increases up to 78%. This patch (of 9): Define struct migrate_pages_stats to organize the various statistics in migrate_pages(). This makes it easier to collect and consume the statistics in multiple functions. This will be needed in the following patches in the series. Link: https://lkml.kernel.org/r/20230213123444.155149-1-ying.huang@intel.com Link: https://lkml.kernel.org/r/20230213123444.155149-2-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Xin Hao <xhao@linux.alibaba.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: move documentation comments to stackdepot.hAndrey Konovalov
Move all interface- and usage-related documentation comments to include/linux/stackdepot.h. It makes sense to have them in the header where they are available to the interface users. [akpm@linux-foundation.org: grammar fix, per Alexander] Link: https://lkml.kernel.org/r/fbfee41495b306dd8881f9b1c1b80999c885e82f.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: various comments clean-upsAndrey Konovalov
Clean up comments in include/linux/stackdepot.h and lib/stackdepot.c: 1. Rework the initialization comment in stackdepot.h. 2. Rework the header comment in stackdepot.c. 3. Various clean-ups for other comments. Also adjust whitespaces for find_stack and depot_alloc_stack call sites. No functional changes. Link: https://lkml.kernel.org/r/5836231b7954355e2311fc9b5870f697ea8e1f7d.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: annotate racy pool_index accessesAndrey Konovalov
Accesses to pool_index are protected by pool_lock everywhere except in a sanity check in stack_depot_fetch. The read access there can race with the write access in depot_alloc_stack. Use WRITE/READ_ONCE() to annotate the racy accesses. As the sanity check is only used to print a warning in case of a violation of the stack depot interface usage, it does not make a lot of sense to use proper synchronization. [andreyknvl@google.com: s/pool_index/pool_index_cached/ in stack_depot_fetch()] Link: https://lkml.kernel.org/r/95cf53f0da2c112aa2cc54456cbcd6975c3ff343.1676129911.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/359ac9c13cd0869c56740fb2029f505e41593830.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stacktrace, kasan, kmsan: rework extra_bits interfaceAndrey Konovalov
The current implementation of the extra_bits interface is confusing: passing extra_bits to __stack_depot_save makes it seem that the extra bits are somehow stored in stack depot. In reality, they are only embedded into a stack depot handle and are not used within stack depot. Drop the extra_bits argument from __stack_depot_save and instead provide a new stack_depot_set_extra_bits function (similar to the exsiting stack_depot_get_extra_bits) that saves extra bits into a stack depot handle. Update the callers of __stack_depot_save to use the new interace. This change also fixes a minor issue in the old code: __stack_depot_save does not return NULL if saving stack trace fails and extra_bits is used. Link: https://lkml.kernel.org/r/317123b5c05e2f82854fc55d8b285e0869d3cb77.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename next_pool_inited to next_pool_requiredAndrey Konovalov
Stack depot uses next_pool_inited to mark that either the next pool is initialized or the limit on the number of pools is reached. However, the flag name only reflects the former part of its purpose, which is confusing. Rename next_pool_inited to next_pool_required and invert its value. Also annotate usages of next_pool_required with comments. Link: https://lkml.kernel.org/r/484fd2695dff7a9bdc437a32f8a6ee228535aa02.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: annotate depot_init_pool and depot_alloc_stackAndrey Konovalov
Clean up the exisiting comments and add new ones to depot_init_pool and depot_alloc_stack. As a part of the clean-up, remove mentions of which variable is accessed by smp_store_release and smp_load_acquire: it is clear as is from the code. Link: https://lkml.kernel.org/r/f80b02951364e6b40deda965b4003de0cd1a532d.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stacktrace: drop impossible WARN_ON for depot_init_poolAndrey Konovalov
depot_init_pool has two call sites: 1. In depot_alloc_stack with a potentially NULL prealloc. 2. In __stack_depot_save with a non-NULL prealloc. At the same time depot_init_pool can only return false when prealloc is NULL. As the second call site makes sure that prealloc is not NULL, the WARN_ON there can never trigger. Thus, drop the WARN_ON and also move the prealloc check from depot_init_pool to its first call site. Also change the return type of depot_init_pool to void as it now always returns true. Link: https://lkml.kernel.org/r/ce149f9bdcbc80a92549b54da67eafb27f846b7b.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename init_stack_poolAndrey Konovalov
Rename init_stack_pool to depot_init_pool to align the name with depot_alloc_stack. No functional changes. Link: https://lkml.kernel.org/r/23106a3e291d8df0aba33c0e2fe86dc596286479.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename handle and pool constantsAndrey Konovalov
Change the "STACK_ALLOC_" prefix to "DEPOT_" for the constants that define the number of bits in stack depot handles and the maximum number of pools. The old prefix is unclear and makes wonder about how these constants are related to stack allocations. The new prefix is also shorter. Also simplify the comment for DEPOT_POOL_ORDER. No functional changes. Link: https://lkml.kernel.org/r/84fcceb0acc261a356a0ad4bdfab9ff04bea2445.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename slab to poolAndrey Konovalov
Use "pool" instead of "slab" for naming memory regions stack depot uses to store stack traces. Using "slab" is confusing, as stack depot pools have nothing to do with the slab allocator. Also give better names to pool-related global variables: change "depot_" prefix to "pool_" to point out that these variables are related to stack depot pools. Also rename the slabindex (poolindex) field in handle_parts to pool_index to align its name with the pool_index global variable. No functional changes. Link: https://lkml.kernel.org/r/923c507edb350c3b6ef85860f36be489dfc0ad21.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename hash table constants and variablesAndrey Konovalov
Give more meaningful names to hash table-related constants and variables: 1. Rename STACK_HASH_SCALE to STACK_HASH_TABLE_SCALE to point out that it is related to scaling the hash table. 2. Rename STACK_HASH_ORDER_MIN/MAX to STACK_BUCKET_NUMBER_ORDER_MIN/MAX to point out that it is related to the number of hash table buckets. 3. Rename stack_hash_order to stack_bucket_number_order for the same reason as #2. No functional changes. Link: https://lkml.kernel.org/r/f166dd6f3cb2378aea78600714393dd568c33ee9.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: reorder and annotate global variablesAndrey Konovalov
Group stack depot global variables by their purpose: 1. Hash table-related variables, 2. Slab-related variables, and add comments. Also clean up comments for hash table-related constants. Link: https://lkml.kernel.org/r/5606a6c70659065a25bee59cd10e57fc60bb4110.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: lower the indentation in stack_depot_initAndrey Konovalov
stack_depot_init does most things inside an if check. Move them out and use a goto statement instead. No functional changes. Link: https://lkml.kernel.org/r/8e382f1f0c352e4b2ad47326fec7782af961fe8e.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: annotate init and early init functionsAndrey Konovalov
Add comments to stack_depot_early_init and stack_depot_init to explain certain parts of their implementation. Also add a pr_info message to stack_depot_early_init similar to the one in stack_depot_init. Also move the scale variable in stack_depot_init to the scope where it is being used. Link: https://lkml.kernel.org/r/d17fbfbd4d73f38686c5e3d4824a6d62047213a1.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: rename stack_depot_disableAndrey Konovalov
Rename stack_depot_disable to stack_depot_disabled to make its name look similar to the names of other stack depot flags. Also put stack_depot_disabled's definition together with the other flags. Also rename is_stack_depot_disabled to disable_stack_depot: this name looks more conventional for a function that processes a boot parameter. No functional changes. Link: https://lkml.kernel.org/r/d78a07d222e689926e5ead229e4a2e3d87dc9aa7.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot, mm: rename stack_depot_want_early_initAndrey Konovalov
Rename stack_depot_want_early_init to stack_depot_request_early_init. The old name is confusing, as it hints at returning some kind of intention of stack depot. The new name reflects that this function requests an action from stack depot instead. No functional changes. [akpm@linux-foundation.org: update mm/kmemleak.c] Link: https://lkml.kernel.org/r/359f31bf67429a06e630b4395816a967214ef753.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: use pr_fmt to define message formatAndrey Konovalov
Use pr_fmt to define the format for printing stack depot messages instead of duplicating the "Stack Depot" prefix in each message. Link: https://lkml.kernel.org/r/3d09db0171a0e92ff3eb0ee74de74558bc9b56c4.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16lib/stackdepot: put functions in logical orderAndrey Konovalov
Patch series "lib/stackdepot: fixes and clean-ups", v2. A set of fixes, comments, and clean-ups I came up with while reading the stack depot code. This patch (of 18): Put stack depot functions' declarations and definitions in a more logical order: 1. Functions that save stack traces into stack depot. 2. Functions that fetch and print stack traces. 3. stack_depot_get_extra_bits that operates on stack depot handles and does not interact with the stack depot storage. No functional changes. Link: https://lkml.kernel.org/r/cover.1676063693.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/daca1319b665d826b94c596b992a8d8117846147.1676063693.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Marco Elver <elver@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Just a final collection of misc fixes, the biggest disables the recently added dynamic debugging support, it has a regression that needs some bigger fixes. Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx mostly, with some smaller i915 and ast fixes. drm: - dynamic debug disable for now fbdev: - deferred i/o device close fix amdgpu: - Fix GC11.x suspend warning - Fix display warning vc4: - YUV planes fix - hdmi display fix - crtc reduced blanking fix ast: - fix start address computation vmwgfx: - fix bo/handle races i915: - gen11 WA fix" * tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Fail atomic_check early on normalize_zpos error drm/amd/amdgpu: fix warning during suspend drm/vmwgfx: Do not drop the reference to the handle too soon drm/vmwgfx: Stop accessing buffer objects which failed init drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list drm: Disable dynamic debug as broken drm/ast: Fix start address computation fbdev: Fix invalid page access after closing deferred I/O devices drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking drm/vc4: hdmi: Always enable GCP with AVMUTE cleared drm/vc4: Fix YUV plane handling when planes are in different buffers
2023-02-16block: use proper return value from bio_failfast()Jens Axboe
kernel test robot complains about a type mismatch: block/blk-merge.c:984:42: sparse: expected restricted blk_opf_t const [usertype] ff block/blk-merge.c:984:42: sparse: got unsigned int block/blk-merge.c:1010:42: sparse: sparse: incorrect type in initializer (different base types) @@ expected restricted blk_opf_t const [usertype] ff @@ got unsigned int @@ block/blk-merge.c:1010:42: sparse: expected restricted blk_opf_t const [usertype] ff block/blk-merge.c:1010:42: sparse: got unsigned int because bio_failfast() is return an unsigned int rather than the appropriate blk_opt_f type. Fix it up. Fixes: 3ce6a115980c ("block: sync mixed merged request's failfast with 1st bio's") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302170743.GXypM9Rt-lkp@intel.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-17Merge remote-tracking branch 'regmap/for-6.3' into regmap-nextMark Brown
2023-02-16MAINTAINERS: update FPU EMULATOR web pageRandy Dunlap
The web page entry for the FPU EMULATOR no longer works. I notified Bill of this and he asked me to update it to this new entry. Link: https://lkml.kernel.org/r/20230214170208.17287-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Bill Metzenthen <billm@melbpc.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcountZach O'Keefe
During collapse, in a few places we check to see if a given small page has any unaccounted references. If the refcount on the page doesn't match our expectations, it must be there is an unknown user concurrently interested in the page, and so it's not safe to move the contents elsewhere. However, the unaccounted pins are likely an ephemeral state. In this situation, MADV_COLLAPSE returns -EINVAL when it should return -EAGAIN. This could cause userspace to conclude that the syscall failed, when it in fact could succeed by retrying. Link: https://lkml.kernel.org/r/20230125015738.912924-1-zokeefe@google.com Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") Signed-off-by: Zach O'Keefe <zokeefe@google.com> Reported-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm/filemap: fix page end in filemap_get_read_batchQian Yingjin
I was running traces of the read code against an RAID storage system to understand why read requests were being misaligned against the underlying RAID strips. I found that the page end offset calculation in filemap_get_read_batch() was off by one. When a read is submitted with end offset 1048575, then it calculates the end page for read of 256 when it should be 255. "last_index" is the index of the page beyond the end of the read and it should be skipped when get a batch of pages for read in @filemap_get_read_batch(). The below simple patch fixes the problem. This code was introduced in kernel 5.12. Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by: Qian Yingjin <qian@ddn.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-17powerpc/64s: Prevent fallthrough to hash TLB flush when using radixBenjamin Gray
In the fix reconnecting hash__tlb_flush() to tlb_flush() the void return on radix__tlb_flush() was not restored and subsequently falls through to the restored hash__tlb_flush(). Guard hash__tlb_flush() under an else to prevent this. Fixes: 1665c027afb2 ("powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()") Reported-by: "Erhard F." <erhard_f@mailbox.org> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230217011434.115554-1-bgray@linux.ibm.com
2023-02-16Fix typos in selftest/bpf filesTaichi Nishimura
Run spell checker on files in selftest/bpf and fixed typos. Signed-off-by: Taichi Nishimura <awkrail01@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/bpf/20230216085537.519062-1-awkrail01@gmail.com
2023-02-16Documentation/watchdog/hpwdt: Fix FormatJerry Hoemann
Add list format so that compiled documentation looks like it was intended to. Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230210184247.221134-3-jerry.hoemann@hpe.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-02-16Documentation/watchdog/hpwdt: Fix ReferenceJerry Hoemann
The IPMI documentation moved to Documentation/driver-api/ipmi.rst. Update reference to reflect new location. Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230210184247.221134-2-jerry.hoemann@hpe.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-02-17regmap-irq: Remove unused mask_invert flagAidan MacDonald
mask_invert is deprecated and no longer used; it can now be removed. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20230216223200.150679-2-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-17regmap-irq: Remove unused type_invert flagAidan MacDonald
type_invert is deprecated and no longer used; it can now be removed. Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Link: https://lore.kernel.org/r/20230216223200.150679-1-aidanmacdonald.0x0@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-16Documentation: core-api: padata: correct spellingRandy Dunlap
Correct spelling problems for Documentation/core-api/padata.rst as reported by codespell. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: netdev@vger.kernel.org Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: linux-crypto@vger.kernel.org Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Link: https://lore.kernel.org/r/20230215053744.11716-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-02-16docs/mm: Physical Memory: correct spelling in reference to CONFIG_PAGE_EXTENSIONLukas Bulwahn
Commit 5d8c5e430a63 ("docs/mm: Physical Memory: add structure, introduction and nodes description") slips in a minor spelling mistake for the config PAGE_EXTENSION. Correct the config name in the physical-memory documentation. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://lore.kernel.org/r/20230215100808.9613-1-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-02-17Merge tag 'drm-intel-fixes-2023-02-16' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Moving gen11 hw wa to the right place. (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y+47eUvwbafER35/@intel.com