Non-blocking allocation with __GFP_NOFAIL is not supported and may still
result in a NULL pointer being returned (if we didn't return NULL, we
would end up busy-looping in non-sleepable contexts):
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct alloc_context *ac)
{
...
/*
* Make sure that __GFP_NOFAIL request doesn't leak out and make sure
* we always retry
*/
if (gfp_mask & __GFP_NOFAIL) {
/*
* All existing users of the __GFP_NOFAIL are blockable, so warn
* of any new users that actually require GFP_NOWAIT
*/
if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
goto fail;
...
}
...
fail:
warn_alloc(gfp_mask, ac->nodemask,
"page allocation failure: order:%u", order);
got_pg:
return page;
}
Highlight this in the documentation of __GFP_NOFAIL so that non-mm
subsystems can reject any illegal usage of __GFP_NOFAIL with GFP_ATOMIC,
GFP_NOWAIT, etc.
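As a minimal sketch, a non-mm subsystem could reject such usage early with
a check like the one below; gfpflags_allow_blocking() is the existing
helper that tests __GFP_DIRECT_RECLAIM, while the function name itself is
only illustrative:
#include <linux/gfp.h>

/* Hypothetical helper: refuse a non-blockable NOFAIL request up front. */
static inline bool my_subsys_gfp_is_valid(gfp_t gfp)
{
	/* __GFP_NOFAIL only makes sense when direct reclaim is allowed. */
	if ((gfp & __GFP_NOFAIL) && !gfpflags_allow_blocking(gfp))
		return false;
	return true;
}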
Link: https://lkml.kernel.org/r/20240830202823.21478-3-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Hailong.Liu <hailong.liu@oppo.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xie Yongji <xieyongji@bytedance.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL
and improve related doc and warn", v4.
__GFP_NOFAIL carries the semantics of never failing, so its callers do not
check the return value:
%__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
cannot handle allocation failures. The allocation could block
indefinitely but will never return with failure. Testing for
failure is pointless.
However, __GFP_NOFAIL can sometimes fail if it exceeds size limits or is
used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context. This patchset
handles the illegal use of __GFP_NOFAIL together with GFP_ATOMIC, which
lacks __GFP_DIRECT_RECLAIM (without it, we cannot reclaim memory to
satisfy the nofail requirement), and improves the related documentation
and warnings.
The proper size limits for __GFP_NOFAIL will be handled separately after
more discussions.
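For illustration, a hedged sketch contrasting the two cases (the size and
function here are arbitrary):
#include <linux/slab.h>

static void *nofail_examples(size_t size)
{
	void *bad, *good;

	/* Illegal: GFP_ATOMIC lacks __GFP_DIRECT_RECLAIM, so the slowpath
	 * warns and may still return NULL despite __GFP_NOFAIL. */
	bad = kmalloc(size, GFP_ATOMIC | __GFP_NOFAIL);

	/* Legal: a blockable request; the allocator retries indefinitely
	 * and never returns NULL. */
	good = kmalloc(size, GFP_KERNEL | __GFP_NOFAIL);

	kfree(bad);
	return good;
}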
This patch (of 3):
mm doesn't support non-blockable __GFP_NOFAIL allocation, because
persisting in providing __GFP_NOFAIL service to non-blocking users who
cannot perform direct memory reclaim would only result in an endless busy
loop.
Therefore, in such cases, the current mm core may directly return a NULL
pointer:
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct alloc_context *ac)
{
...
if (gfp_mask & __GFP_NOFAIL) {
/*
* All existing users of the __GFP_NOFAIL are blockable, so warn
* of any new users that actually require GFP_NOWAIT
*/
if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
goto fail;
...
}
...
fail:
warn_alloc(gfp_mask, ac->nodemask,
"page allocation failure: order:%u", order);
got_pg:
return page;
}
Unfortunately, vdpa does such a nofail allocation under a non-sleepable
lock.  A possible way to fix that is to move the page allocation out of
the lock into the caller, but having to allocate a huge number of pages
and an auxiliary page array seems to be problematic as well, per Tetsuo:
"You should implement proper error handling instead of using __GFP_NOFAIL
if count can become large."
So I chose another way, which does not release kernel bounce pages when
the user tries to register userspace bounce pages.  Then we can avoid
allocating in paths where failure is not expected (e.g. in the release
path).  We pay for this with higher memory usage, as we don't release the
kernel bounce pages, but further optimizations could be done on top.
[v-songbaohua@oppo.com: Refine the changelog]
Link: https://lkml.kernel.org/r/20240830202823.21478-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240830202823.21478-2-21cnbao@gmail.com
Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer")
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Tested-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hailong.Liu <hailong.liu@oppo.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The mutex array pointer shares a cacheline with the spinlock:
ffffffff84187480 B hugetlb_fault_mutex_table
ffffffff84187488 B hugetlb_lock
This is because the former is annotated with a macro forcing cacheline
alignment.  I suspect it was meant to be the variant which, on top of
that, makes sure the object does not share its cacheline with anyone
else.
Since the array pointer itself is de facto read-only, such an annotation
does not make sense there anyway.  Instead, mark it __ro_after_init along
with the size variable.
Do, however, move the spinlock out of the way.
[akpm@linux-foundation.org: move section directives to the end of the definitions, per convention]
[akpm@linux-foundation.org: DEFINE_SPINLOCK doesn't permit section modifiers at end-of-definition]
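A minimal sketch of the resulting annotations, assuming the layout below
(the size variable's exact name and type are assumptions):
#include <linux/cache.h>
#include <linux/mutex.h>
#include <linux/spinlock.h>

/* Read-mostly after boot: no forced cacheline alignment needed. */
struct mutex *hugetlb_fault_mutex_table __ro_after_init;
unsigned int num_fault_mutexes __ro_after_init;

/* Written at runtime; DEFINE_SPINLOCK() doesn't take a trailing section
 * modifier, hence the note above about its placement. */
static DEFINE_SPINLOCK(hugetlb_lock);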
Link: https://lkml.kernel.org/r/20240828160704.1425767-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Although shmem_get_folio_gfp() is correctly putting inodes on the
shrinklist according to the folio size, shmem_unused_huge_shrink() was
still dealing with that shrinklist in terms of HPAGE_PMD_SIZE.
Generalize that; and to handle the mixture of sizes more sensibly, have
shmem_alloc_and_add_folio() give it a number of pages to be freed
(approximate: no need to minimize that with an exact calculation) instead
of a number of inodes to split.
[akpm@linux-foundation.org: comment tweak, per David]
Link: https://lkml.kernel.org/r/d8c40850-6774-7a93-1e2c-8d941683b260@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There has been a long-standing and very minor off-by-one, where
shmem_get_folio_gfp() decides if a large folio extends beyond i_size far
enough to leave a page or more for freeing later under pressure.
This is not something needed for stable: but it will be proportionately
more significant as support for smaller large folios is added, and is best
fixed before duplicating the check in other places.
Link: https://lkml.kernel.org/r/d8e75079-af2d-8519-56df-6be1dccc247a@google.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Just do what mt_dump_range64() does: dump the error message based on the
format.
Link: https://lkml.kernel.org/r/20240826012422.29935-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
mt_dump_arange64() only applies to an entry whose type is maple_arange_64,
in which mte_is_leaf() must return false.
Since mte_is_leaf() here is always false, we can remove this condition
check.
Link: https://lkml.kernel.org/r/20240826012422.29935-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We added a public Google calendar for easy sharing of DAMON bi-weekly
meetups[1].  Add it to the official document for better visibility.
[1] https://lore.kernel.org/all/20240717235812.53087-1-sj@kernel.org/
Link: https://lkml.kernel.org/r/20240826015741.80707-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
maintainer-profile.rst for DAMON separates the links and their target
definitions.  That is not really necessary, and it only makes readability
worse.  At a minimum, the definitions would need a section title (say,
"References").  Just add the links in place in the doc.
Link: https://lkml.kernel.org/r/20240826015741.80707-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Docs/damon: update GitHub repo URLs and maintainer-profile".
Replace GitHub URLs in DAMON documents for the non-kernel parts of the
DAMON repos with new ones[1] via the first patch.  With the following two
patches, wordsmith maintainer-profile for better readability, and
document the Google calendar for bi-weekly meetups, respectively.
[1] https://lore.kernel.org/20240813232158.83903-1-sj@kernel.org
This patch (of 3):
GitHub repos for non-kernel parts of DAMON project including 'damo',
'damon-tests' and 'damoos' will be moved[1] from 'awslabs' org to
'damonitor', by 2024-09-05. Update related URLs in kernel tree.
[1] https://lore.kernel.org/20240813232158.83903-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240826015741.80707-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240826015741.80707-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit 0742cadf5e4c ("mm/damon/lru_sort: adjust local
variable to dynamic allocation").
The commit was introduced to avoid unnecessary usage of stack memory for
the per-scheme region priorities histogram buffer.  The fix is nice, but
its point is not very clear unless the commit message is read together
with the code.  That's mainly because the buffer is a private field,
which means it is hidden from DAMON API users.  That's not the fault of
the fix but of the underlying data structure.
Now the per-scheme histogram buffer is gone, so the problem that the
commit was fixing is also gone.  The use of kmemdup() no longer has a
point and only makes the code a bit more difficult to understand.  Revert
the fix.
Link: https://lkml.kernel.org/r/20240826042323.87025-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Nobody is reading from or writing to the per-scheme region priorities
histogram buffer. It is only wasting memory. Remove it.
Link: https://lkml.kernel.org/r/20240826042323.87025-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Replace the usage of per-quota region priorities histogram buffer with the
per-context one. After this change, the per-quota histogram is not used
by anyone, and hence it is ready to be removed.
Link: https://lkml.kernel.org/r/20240826042323.87025-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "replace per-quota region priorities histogram buffer with
per-context one".
Each DAMOS quota (struct damos_quota) maintains a histogram of the total
size of regions per prioritization score.  DAMOS calculates the minimum
prioritization score of regions that are ok to apply the DAMOS action to
while respecting the quota.  The histogram is constructed only for the
calculation of the minimum score in damos_adjust_quota() for each quota,
which is called by kdamond_fn().
Hence, there is no real reason to have a per-quota histogram.  Only a
per-kdamond histogram is needed, since parallel kdamonds could have races
otherwise.  The current implementation only wastes memory, and can easily
cause unintended stack usage[1].
So, introducing a per-kdamond histogram and replacing the per-quota one
with it would be the right solution for the issue.  However, supporting
multiple DAMON contexts per kdamond is still ongoing work[2] without a
clear estimated time of arrival.  Meanwhile, a per-context histogram
could be an effective and straightforward solution with no blocker.
Let's fix the problem in that way first.
This patch (of 4):
Introduce a per-context buffer for the region priority score / total size
histogram.  Same as the per-quota one (->histogram of struct
damos_quota), the new buffer is hidden from DAMON API users by being
defined as a private field of the DAMON context structure.  It is
dynamically allocated and de-allocated at the beginning and end of the
execution of the kdamond by kdamond_fn() itself.
[1] commit 0742cadf5e4c ("mm/damon/lru_sort: adjust local variable to dynamic allocation")
[2] https://lore.kernel.org/20240531122320.909060-1-yorha.op@gmail.com
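A minimal sketch of the lifetime described above; the field name
regions_score_histogram is an assumption, while DAMOS_MAX_SCORE is the
existing score ceiling:
#include <linux/damon.h>
#include <linux/errno.h>
#include <linux/slab.h>

/* Hypothetical helpers: the buffer lives in the context and is
 * allocated/freed by the kdamond itself around its main loop. */
static int damos_alloc_histogram(struct damon_ctx *ctx)
{
	ctx->regions_score_histogram =
		kmalloc_array(DAMOS_MAX_SCORE + 1,
			      sizeof(*ctx->regions_score_histogram),
			      GFP_KERNEL);
	return ctx->regions_score_histogram ? 0 : -ENOMEM;
}

static void damos_free_histogram(struct damon_ctx *ctx)
{
	kfree(ctx->regions_score_histogram);
	ctx->regions_score_histogram = NULL;
}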
Link: https://lkml.kernel.org/r/20240826042323.87025-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240826042323.87025-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There are no more callers of putback_lru_page(), remove it.
Link: https://lkml.kernel.org/r/20240826065814.1336616-7-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There are no more callers of isolate_lru_page(), remove it.
[wangkefeng.wang@huawei.com: convert page to folio in comment and document, per Matthew]
Link: https://lkml.kernel.org/r/20240826144114.1928071-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20240826065814.1336616-6-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Saves a couple of calls to compound_head() and removes the last two
callers of putback_lru_page().
Link: https://lkml.kernel.org/r/20240826065814.1336616-5-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The page for migrate_device_unmap() already has a reference, so it is
safe to convert the page to a folio to save a few calls to
compound_head(); this also removes the last isolate_lru_page() call.
Link: https://lkml.kernel.org/r/20240826065814.1336616-4-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Save two calls to compound_head() and use folio throughout.
Link: https://lkml.kernel.org/r/20240826065814.1336616-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: finish isolate/putback_lru_page()".
Convert migrate_device.c to use more folios, so that we can then remove
isolate_lru_page() and putback_lru_page().
This patch (of 6):
Save a few calls to compound_head() and use folio throughout.
Link: https://lkml.kernel.org/r/20240826065814.1336616-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20240826065814.1336616-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Retrieve a folio from the page cache rather than a page. Saves a couple
of conversions between page & folio.
Link: https://lkml.kernel.org/r/20240826202138.3804238-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When a THP is added to the deferred_list because it is partially mapped,
the unmapped portion of it is unused, leading to wasted memory and
potentially increasing memory reclamation pressure.
Detailing the specifics of how unmapping occurs is quite difficult and not
that useful, so we adopt a simple approach: each time a THP enters the
deferred_list, we increment the count by 1; whenever it leaves for any
reason, we decrement the count by 1.
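A hedged sketch of that counting scheme, assuming it is wired through the
per-size mTHP statistics (mod_mthp_stat() and the stat item name are
assumptions here):
#include <linux/huge_mm.h>
#include <linux/mm.h>

/* +1 when a folio enters the deferred_list, -1 when it leaves for any
 * reason (split, free, shrinker, ...). */
static void account_partially_mapped(struct folio *folio, bool enter)
{
	mod_mthp_stat(folio_order(folio),
		      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, enter ? 1 : -1);
}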
Link: https://lkml.kernel.org/r/20240824010441.21308-3-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Chuanhua Han <hanchuanhua@oppo.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuai Yuan <yuanshuai@oppo.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: count the number of anonymous THPs per size", v4.
Knowing the number of transparent anon THPs in the system is crucial
for performance analysis. It helps in understanding the ratio and
distribution of THPs versus small folios throughout the system.
Additionally, partial unmapping by userspace can lead to significant waste
of THPs over time and increase memory reclamation pressure. We need this
information for comprehensive system tuning.
This patch (of 2):
Let's track for each anonymous THP size, how many of them are currently
allocated. We'll track the complete lifespan of an anon THP, starting
when it becomes an anon THP ("large anon folio") (->mapping gets set),
until it gets freed (->mapping gets cleared).
Introduce a new "nr_anon" counter per THP size and adjust the
corresponding counter in the following cases:
* We allocate a new THP and call folio_add_new_anon_rmap() to map
it the first time and turn it into an anon THP.
* We split an anon THP into multiple smaller ones.
* We migrate an anon THP, when we prepare the destination.
* We free an anon THP back to the buddy.
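A minimal sketch of the adjustment made at the points listed above,
assuming the counter is exposed through the per-size mTHP stat machinery
(mod_mthp_stat() and MTHP_STAT_NR_ANON are assumed names):
#include <linux/huge_mm.h>
#include <linux/mm.h>

/* +1 when a large folio becomes an anon THP (->mapping set), -1 when it
 * is freed back to the buddy; splits shift the count to smaller orders. */
static void account_nr_anon(struct folio *folio, int delta)
{
	if (folio_test_anon(folio) && folio_test_large(folio))
		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, delta);
}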
Note that AnonPages in /proc/meminfo currently tracks the total number of
*mapped* anonymous *pages*, and therefore has slightly different
semantics. In the future, we might also want to track "nr_anon_mapped"
for each THP size, which might be helpful when comparing it to the number
of allocated anon THPs (long-term pinning, stuck in swapcache, memory
leaks, ...).
Further note that for now, we only track anon THPs after they got their
->mapping set, for example via folio_add_new_anon_rmap().  If we were to
allocate some in the swapcache, for now they will only show up in the
statistics after they have been mapped to user space for the first time,
where we call folio_add_new_anon_rmap().
[akpm@linux-foundation.org: documentation fixups, per David]
Link: https://lkml.kernel.org/r/3e8add35-e26b-443b-8a04-1078f4bc78f6@redhat.com
Link: https://lkml.kernel.org/r/20240824010441.21308-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240824010441.21308-2-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Chuanhua Han <hanchuanhua@oppo.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuai Yuan <yuanshuai@oppo.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Previously we had a situation where shmem mTHP controls and stats were not
exposed for some supported sizes and were exposed for some unsupported
sizes. So let's clean that up.
Anon mTHP can support all large orders [2, PMD_ORDER]. But shmem can
support all large orders [1, MAX_PAGECACHE_ORDER]. However, per-size
shmem controls and stats were previously being exposed for all the anon
mTHP orders, meaning order-1 was not present, and for arm64 64K base
pages, orders 12 and 13 were exposed but were not supported internally.
Tidy this all up by defining ctrl and stats attribute groups for anon and
file separately. Anon ctrl and stats groups are populated for all orders
in THP_ORDERS_ALL_ANON and file ctrl and stats groups are populated for
all orders in THP_ORDERS_ALL_FILE_DEFAULT.
Additionally, create "any" ctrl and stats attribute groups which are
populated for all orders in (THP_ORDERS_ALL_ANON |
THP_ORDERS_ALL_FILE_DEFAULT). swpout stats use this since they apply to
anon and shmem.
The side-effect of all this is that different hugepage-*kB directories
contain different sets of controls and stats, depending on which memory
types support that size. This approach is preferred over the alternative,
which is to populate dummy controls and stats for memory types that do not
support a given size.
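A hedged sketch of the selection logic implied above; THP_ORDERS_ALL_ANON
and THP_ORDERS_ALL_FILE_DEFAULT are the existing order bitmaps, while the
attribute-group names and the helper are assumptions:
#include <linux/bits.h>
#include <linux/huge_mm.h>
#include <linux/sysfs.h>

/* Assumed group names, standing in for the real per-type groups. */
extern const struct attribute_group anon_ctrl_attr_grp;
extern const struct attribute_group file_ctrl_attr_grp;
extern const struct attribute_group any_stats_attr_grp;

/* Register only the groups whose memory type supports this order. */
static int register_order_groups(struct kobject *kobj, int order)
{
	int err = 0;

	if (BIT(order) & THP_ORDERS_ALL_ANON)
		err = sysfs_create_group(kobj, &anon_ctrl_attr_grp);
	if (!err && (BIT(order) & THP_ORDERS_ALL_FILE_DEFAULT))
		err = sysfs_create_group(kobj, &file_ctrl_attr_grp);
	if (!err)	/* "any" group, e.g. swpout stats, applies to both. */
		err = sysfs_create_group(kobj, &any_stats_attr_grp);
	return err;
}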
[ryan.roberts@arm.com: file pages and shmem can also be split]
Link: https://lkml.kernel.org/r/f7ced14c-8bc5-405f-bee7-94f63980f525@arm.com
Link: https://lkml.kernel.org/r/20240808111849.651867-3-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Barry Song <baohua@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Shmem mTHP controls and stats improvements", v3.
This is a small series to tidy up the way the shmem controls and stats are
exposed. These patches were previously part of the series at [2], but I
decided to split them out since they can go in independently.
This patch (of 2):
Let's move count_mthp_stat() so that it's always defined, even when THP
is disabled.  Previously, uses of the function in files such as shmem.c,
which are compiled even when THP is disabled, required ugly THP
ifdeffery.  With this cleanup, we can remove those ifdefs and the
function resolves to a no-op when THP is disabled.
I shortly plan to call count_mthp_stat() from more THP-invariant source
files.
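A minimal sketch of the usual stub-when-disabled idiom this relies on
(the exact header placement is not shown):
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void count_mthp_stat(int order, enum mthp_stat_item item);
#else
static inline void count_mthp_stat(int order, enum mthp_stat_item item)
{
	/* No-op when THP is disabled, so callers need no ifdefs. */
}
#endif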
Link: https://lkml.kernel.org/r/20240808111849.651867-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240808111849.651867-2-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
scx_dump_data is only used inside ext.c but isn't marked static. Add the
static qualifier.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409070218.RB5WsQ07-lkp@intel.com/
|
|
Block comments should align the * on each line, as checkpatch rightfully
pointed out, so fix that style issue on the newly added rk3576 headers.
Fixes: 49c04453db81 ("dt-bindings: clock, reset: Add support for rk3576")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20240909223149.85364-1-heiko@sntech.de
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Fix some spelling errors in the code comments of libbpf:
betwen -> between
paremeters -> parameters
knowning -> knowing
definiton -> definition
compatiblity -> compatibility
overriden -> overridden
occured -> occurred
proccess -> process
managment -> management
nessary -> necessary
Signed-off-by: Yusheng Zheng <yunwei356@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240909225952.30324-1-yunwei356@gmail.com
|
|
The previous e-mail address from Synopsys is not available anymore.
Signed-off-by: Shahab Vahedi <list+bpf@vahedi.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240909184754.27634-1-list+bpf@vahedi.org
|
|
When "arg#%d expected pointer to ctx, but got %s" error is printed, both
template parts actually point to the type of the argument, therefore, it
will also say "but got PTR", regardless of what was the actual register
type.
Fix the message to print the register type in the second part of the
template, change the existing test to adapt to the new format, and add a
new test to test the case when arg is a pointer to context, but reg is a
scalar.
Fixes: 00b85860feb8 ("bpf: Rewrite kfunc argument handling")
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20240909133909.1315460-1-maxim@isovalent.com
|
|
Fix typos in documentation.
Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240909092452.4293-1-algonell@gmail.com
|
|
Replace shifts of '1' with '1U' in bitwise operations within
__show_dev_tc_bpf() to prevent undefined behavior caused by shifting
into the sign bit of a signed integer. By using '1U', the operations
are explicitly performed on unsigned integers, avoiding potential
integer overflow or sign-related issues.
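A brief illustration of the issue (a generic sketch, not the exact
bpftool code):
#include <stdint.h>

/* With a plain '1' (signed int), shifting into bit 31 is undefined
 * behaviour; '1U' keeps the shift in unsigned arithmetic. */
static inline uint32_t bit_mask(unsigned int bit)
{
	return 1U << bit;	/* was: 1 << bit */
}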
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240908140009.3149781-1-visitorckw@gmail.com
|
|
ARM64 has a separate lr register to store the return address, so here we
only need to read the lr register to get the return address; there is no
need to dereference it again.
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1725787433-77262-1-git-send-email-chengshuyi@linux.alibaba.com
|
|
Group all cxl related kernel headers into include/cxl/ directory.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
On chipsets with a second 'Integrated Device Function' SMBus controller,
use a different adapter name for the second IDF adapter.
This allows platform glue code which is looking for the primary i801
adapter, in order to manually instantiate i2c_clients on it, to
differentiate between the two.
This allows such code to find the primary i801 adapter by name, without
needing to duplicate the PCI-ids to feature-flags mapping from i2c-i801.c.
Reviewed-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
Platform glue code, which is not built into the kernel and thus cannot
use i2c_register_board_info(), may want to use bus_register_notifier()
to listen for i2c adapters showing up on which the platform code needs
to manually instantiate platform-specific i2c_clients.
This results in calling i2c_new_client_device() from the bus notifier,
which happens near the device_add() call.
If the i2c-core has not yet setup runtime-pm (specifically the
no-callbacks and ignore-children flags) for the device embedded
inside struct i2c_adapter and the driver for the i2c_client
calls pm_runtime_set_active() this will trigger the following
error inside __pm_runtime_set_status():
"runtime PM trying to activate child device %s but parent (%s) is not active\n"
and the i2c_client's runtime-status will not be updated.
Split the device_register() call for the adapter into device_initialize()
and device_add() and move the pm-runtime init calls in between these two
calls, so that the runtime status can be correctly set when a driver
binds from the bus notifier.
Note the moved pm-runtime init calls just override the initial value of
some flags in struct device set by device_initialize(), so calling them
before device_add() is safe.
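A minimal sketch of the resulting ordering in the adapter registration
path (a simplification; the real i2c_register_adapter() does much more):
#include <linux/device.h>
#include <linux/i2c.h>
#include <linux/pm_runtime.h>

static int example_register_adapter(struct i2c_adapter *adap)
{
	/* Make the device known to the driver core, but not visible yet. */
	device_initialize(&adap->dev);

	/* Set the runtime-PM flags before the device becomes visible, so a
	 * client driver binding from a bus notifier sees a consistent state. */
	pm_runtime_no_callbacks(&adap->dev);
	pm_suspend_ignore_children(&adap->dev, true);

	/* Now make it visible; bus notifiers fire from here on. */
	return device_add(&adap->dev);
}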
Reviewed-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
Convert the Spreadtrum SC9860 I2C controller bindings to DT schema.
Adjust filename to match compatible.
Signed-off-by: Stanislav Jakubek <stano.jakubek@gmail.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
On the Intel Denverton SoC, the ismt controller may enter a weird state
when a transaction gets stuck.  It times out in the driver, but unless
the transaction is explicitly killed in the controller, the controller
won't be able to perform new transactions anymore.
The issue is extremely difficult to reproduce and may take weeks of non-
stop SMBus traffic.
Numerous hours with a logic analyzer didn't yield any useful results; it
looks like the controller stops toggling the SCK line, i.e. the issue is
likely in the controller, since the device doesn't do clock stretching,
so nothing is driving SCK except the host.
Explicitly kill transaction on timeout to recover the controller from
this state.
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
There are quite a few drivers and options for the DesignWare
I2C adapter in the Kconfig. Group all of them under
I2C_DESIGNWARE_CORE. That makes the menuconfig a bit
easier to understand.
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the platform and PCI buses depend on
I2C_DESIGNWARE_CORE. Right now this driver prevents that
update because it selects I2C_DESIGNWARE_PLATFORM.
To make the dependency on I2C_DESIGNWARE_PLATFORM consistent
with the other drivers in kernel that depend on it, and
allow the dependency handling of the Synopsys DesignWare I2C
drivers to be updated, change the "select" into "depends on".
Cc: Jiawen Wu <jiawenwu@trustnetic.com>
Cc: Mengyuan Lou <mengyuanlou@net-swift.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: UNGLinuxDriver@microchip.com
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
'struct i2c_algorithm' and 'struct virtio_device_id' are not modified in
this driver.
Constifying these structures moves some data to a read-only section,
which increases overall security, especially when a structure holds
function pointers, as is the case for struct i2c_algorithm.
On x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
6663 568 16 7247 1c4f drivers/i2c/busses/i2c-virtio.o
After:
=====
text data bss dec hex filename
6735 472 16 7223 1c37 drivers/i2c/busses/i2c-virtio.o
--
Compile tested only
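A hedged sketch of the kind of change involved (the callback names are
placeholders, not the actual symbols in i2c-virtio.c):
#include <linux/i2c.h>

static int example_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
{
	return num;	/* stub */
}

static u32 example_func(struct i2c_adapter *adap)
{
	return I2C_FUNC_I2C;	/* stub */
}

/* Adding 'const' moves the ops table from .data to .rodata, matching the
 * size output above. */
static const struct i2c_algorithm example_algorithm = {
	.master_xfer	= example_xfer,
	.functionality	= example_func,
};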
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
rcar_i2c_probe() checks priv->devtype in two places; handling (A) and (C)
in the same place is more understandable, since (A) and (B) are
independent.
(A) if (priv->devtype < I2C_RCAR_GEN3) {
...
}
(B) ...
(C) if (priv->devtype >= I2C_RCAR_GEN3) {
...
}
Let's merge them into an if-else, as sketched below.
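A hedged sketch of the merged shape (keeping (B) after the merged block
is an assumption about the final ordering):
if (priv->devtype < I2C_RCAR_GEN3) {
	/* (A) */
} else {
	/* (C) */
}
/* (B): independent of (A), so it can stay outside the merged block. */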
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
To ensure code clarity and prevent potential errors, it's advisable
to use ';' as the statement separator, except where ',' is
intentionally used for a specific purpose.
Signed-off-by: Shen Lichuan <shenlichuan@vivo.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The devm_clk_get_enabled() helpers:
- call devm_clk_get()
- call clk_prepare_enable() and register what is needed in order to
call clk_disable_unprepare() when needed, as a managed resource.
This simplifies the code and avoids the calls to clk_disable_unprepare().
While at it, since no special handling is needed here anymore, remove the
goto label "err:".
Signed-off-by: Rong Qianfeng <rongqianfeng@vivo.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The devm_clk_get_enabled() helpers:
- call devm_clk_get()
- call clk_prepare_enable() and register what is needed in order to
call clk_disable_unprepare() when needed, as a managed resource.
This simplifies the code and avoids the calls to clk_disable_unprepare().
While at it, there is no need to save the clk pointer, so drop sclk from
struct em_i2c_device.
Signed-off-by: Rong Qianfeng <rongqianfeng@vivo.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|