Age | Commit message (Collapse) | Author |
|
When committing new scheme parameters from the sysfs, the target_nid field
of the damos struct would not be copied. This would result in the
target_nid field to retain its original value, despite being updated in
the sysfs interface.
This patch fixes this issue by copying target_nid in damos_commit().
Link: https://lkml.kernel.org/r/20250709004729.17252-1-bijan311@gmail.com
Fixes: 83dc7bbaecae ("mm/damon/sysfs: use damon_commit_ctx()")
Signed-off-by: Bijan Tabatabai <bijantabatab@micron.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ravi Shankar Jonnalagadda <ravis.opensrc@micron.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch adds a new knob `detect_node_addresses`, which determines
whether the physical address range is set manually using the existing
knobs or automatically by the mtier module. When `detect_node_addresses`
set to 'Y', mtier automatically converts node0 and node1 to their physical
addresses. If set to 'N', it uses the existing 'node#_start_addr' and
'node#_end_addr' to define regions as before.
Link: https://lkml.kernel.org/r/20250707235919.513-1-yunjeong.mun@sk.com
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
Suggested-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The damon_{lru_sort,reclaim,stat} kernel modules use "enabled" parameter
knobs as follows.
/sys/module/damon_lru_sort/parameters/enabled
/sys/module/damon_reclaim/parameters/enabled
/sys/module/damon_stat/parameters/enabled
However, other sample modules of damon use "enable" parameter knobs so
it'd be better to rename them from "enable" to "enabled" to keep the
consistency with other damon modules.
Before:
/sys/module/damon_sample_wsse/parameters/enable
/sys/module/damon_sample_prcl/parameters/enable
/sys/module/damon_sample_mtier/parameters/enable
After:
/sys/module/damon_sample_wsse/parameters/enabled
/sys/module/damon_sample_prcl/parameters/enabled
/sys/module/damon_sample_mtier/parameters/enabled
There is no functional changes in this patch.
Link: https://lkml.kernel.org/r/20250707024548.1964-1-honggyu.kim@sk.com
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The only user of the counter (FUSE) was removed in commit 0c58a97f919c
("fuse: remove tmp folio for writebacks and internal rb tree") so follow
the established pattern of removing the counter and hardcoding 0 in
meminfo output, as done recently with NR_BOUNCE. Update documentation for
procfs, including for the value for Bounce that was missed when removing
its counter.
Also remove the mention of NR_WRITEBACK_TEMP implications from a comment
in wb_position_ratio(). The rest of the comment there about fuse setting
bdi->max_ratio to 1% is still correct.
[vbabka@suse.cz: v2]
Link: https://lkml.kernel.org/r/5a848e15-6a57-4ecb-a015-d4f358b8a5d3@suse.cz
Link: https://lkml.kernel.org/r/20250625-nr_writeback_removal-v1-1-7f2a0df70faa@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joanne Koong <joannelkoong@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Maxim Patlasov <mpatlasov@parallels.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The vmstat_text array defines labels for counters displayed in
/proc/vmstat. The current definition of the array implies a specific
order of the counters in their enums, making it fragile.
To make it clear which counter the label is for, use designated
initializers.
Link: https://lkml.kernel.org/r/20250603084556.113975-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
/proc/vmstat displays counters from various sources. It is easy to forget
to add or remove a label when a new counter is added or removed.
There is a BUILD_BUG_ON() function that catches missing labels. However,
for some reason, it ignores extra labels.
Let's make the check strict. This would help to catch issues when a
counter is removed.
Link: https://lkml.kernel.org/r/20250529110541.2960330-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The vmstat_text array contains labels for counters displayed in
/proc/vmstat. It is important to keep the labels in sync with the
counters.
There is a BUILD_BUG_ON() check in vmstat_start() that ensures the size of
the vmstat_text is not smaller than VM_EVENT_COUNTERS. This helps to
catch cases where a new counter is added but the label is not. However,
it does not help if a counter is removed but the label remains.
It would be nice to make the BUILD_BUG_ON() check more strict to catch
such cases. However, when compiling with MEMCG enabled but
VM_EVENT_COUNTERS disabled, the vmstat_text array is larger than
NR_VMSTAT_ITEMS.
This issue arises because some elements of the vmstat_text array are
present when either MEMCG or VM_EVENT_COUNTERS is enabled, but
NR_VMSTAT_ITEMS only accounts for these elements if VM_EVENT_COUNTERS is
enabled.
Instead of adjusting the NR_VMSTAT_ITEMS definition to account for MEMCG,
make MEMCG select VM_EVENT_COUNTERS. VM_EVENT_COUNTERS is enabled in most
configurations anyway.
Link: https://lkml.kernel.org/r/20250604095111.533783-1-kirill.shutemov@linux.intel.com
Fixes: ebc5d83d0443 ("mm/memcontrol: use vmstat names for printing statistics")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Instead, let's just allow for specifying through flags whether we want to
have bits merged into the original PTE.
For the madvise() case, simplify by having only a single parameter for
merging young+dirty. For madvise_cold_or_pageout_pte_range() merging the
dirty bit is not required, but also not harmful. This code is not that
performance critical after all to really force all micro-optimizations.
As we now have two pte_t * parameters, use PageTable() to make sure we are
actually given a pointer at a copy of the PTE, not a pointer into an
actual page table.
Link: https://lkml.kernel.org/r/20250702104926.212243-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Many users (including upcoming ones) don't really need the flags etc, and
can live with the possible overhead of a function call.
So let's provide a basic, non-inlined folio_pte_batch(), to avoid code
bloat while still providing a variant that optimizes out all flag checks
at runtime. folio_pte_batch_flags() will get inlined into
folio_pte_batch(), optimizing out any conditionals that depend on input
flags.
folio_pte_batch() will behave like folio_pte_batch_flags() when no flags
are specified. It's okay to add new users of folio_pte_batch_flags(), but
using folio_pte_batch() if applicable is preferred.
So, before this change, folio_pte_batch() was inlined into the C file
optimized by propagating constants within the resulting object file.
With this change, we now also have a folio_pte_batch() that is optimized
by propagating all constants. But instead of having one instance per
object file, we have a single shared one.
In zap_present_ptes(), where we care about performance, the compiler
already seem to generate a call to a common inlined folio_pte_batch()
variant, shared with fork() code. So calling the new non-inlined variant
should not make a difference.
While at it, drop the "addr" parameter that is unused.
Link: https://lkml.kernel.org/r/20250702104926.212243-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/linux-mm/20250503182858.5a02729fcffd6d4723afcfc2@linux-foundation.org/
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Let's clean up a bit:
(1) No need for start_ptep vs. ptep anymore, we can simply use ptep.
(2) Let's switch to "unsigned int" for everything. Negative values do
not make sense.
(3) We can simplify the code by leaving the pte unchanged after the
pte_same() check.
(4) Clarify that we should never exceed a single VMA; it indicates a
problem in the caller.
No functional change intended.
Link: https://lkml.kernel.org/r/20250702104926.212243-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: folio_pte_batch() improvements", v2.
Ever since we added folio_pte_batch() for fork() + munmap() purposes, a
lot more users appeared (and more are being proposed), and more
functionality was added.
Most of the users only need basic functionality, and could benefit from a
non-inlined version.
So let's clean up folio_pte_batch() and split it into a basic
folio_pte_batch() (no flags) and a more advanced folio_pte_batch_ext().
Using either variant will now look much cleaner.
This series will likely conflict with some changes in some (old+new)
folio_pte_batch() users, but conflicts should be trivial to resolve.
This patch (of 4):
Respecting these PTE bits is the exception, so let's invert the meaning.
With this change, most callers don't have to pass any flags. This is a
preparation for splitting folio_pte_batch() into a non-inlined variant
that doesn't consume any flags.
Long-term, we want folio_pte_batch() to probably ignore most common PTE
bits (e.g., write/dirty/young/soft-dirty) that are not relevant for most
page table walkers: uffd-wp and protnone might be bits to consider in the
future. Only walkers that care about them can opt-in to respect them.
No functional change intended.
Link: https://lkml.kernel.org/r/20250702104926.212243-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The -EEXIST conversion is introduced in commit 65462462ffb2 ("mm/gup:
follow_pfn_pte(): -EEXIST cleanup"), since follow_page() may call
follow_pfn_pte() which may return -EEXIST.
But after commit 7dff875c9436 ("mm/migrate: convert
add_page_for_migration() from follow_page() to folio_walk"), it use
folio_walk instead. This limit the error code and won't return -EEXIST.
Remove the error code conversion here.
Link: https://lkml.kernel.org/r/20250707065711.18056-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add test cases to test the correctness of PFN ZERO flag of pagemap_scan
ioctl. Test with normal pages backed memory and huge pages backed memory.
Link: https://lkml.kernel.org/r/20250707073321.106431-1-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Recently a new mm tree for new patches, namely mm-new, has been added.
Update DAMON maintainer's profile doc for DAMON patches life cycle, which
depend on those of mm trees.
Link: https://lkml.kernel.org/r/20250705175000.56259-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
damon_sysfs_before_terminate() is a DAMON callback that is executed from
the kdamond's context. Hence it is safe to access DAMON context internal
data. But the function is unnecessarily holding kdamond_lock of the
context. It is just unnecessary. Remove the locking code.
Link: https://lkml.kernel.org/r/20250705175000.56259-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
DAMON core implements a static function to see if a given DAMON context is
running. DAMON sysfs interface is implementing the same one on its own.
Make the core function non-static and reuse it from the DAMON sysfs
interface.
Link: https://lkml.kernel.org/r/20250705175000.56259-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
DAMON sample module, mtier has its name 'mtier'. It could conflict with
future modules, and not very easy to identify it by name. Use a prefix,
"damon_sample_" for the name.
Note that this could break users if they depend on the old name. But it
is just a sample, so no such usage is expected, or known. Even if such
usage exists, updating it for the new name should be straightforward.
Link: https://lkml.kernel.org/r/20250705175000.56259-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
DAMON sample module, prcl has its name 'prcl'. It could conflict with
future modules, and not very easy to identify it by name. Use a prefix,
"damon_sample_" for the name.
Note that this could break users if they depend on the old name. But it
is just a sample, so no such usage is expected, or known. Even if such
usage exists, updating it for the new name should be straightforward.
Link: https://lkml.kernel.org/r/20250705175000.56259-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/damon: misc cleanups".
Yet another round of miscellaneous DAMON cleanups.
This patch (of 6):
DAMON sample module, wsse has its name 'wsse'. It could conflict with
future modules, and not very easy to identify it by name. Use a prefix,
"damon_sample_" for the name.
Note that this could break users if they depend on the old name. But it
is just a sample, so no such usage is expected, or known. Even if such
usage exists, updating it for the new name should be straightforward.
Link: https://lkml.kernel.org/r/20250705175000.56259-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250705175000.56259-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
After commit acd7ccb284b8 ("mm: shmem: add large folio support for
tmpfs"), tmpfs can also support large folio allocation (not just PMD-sized
large folios).
However, when accessing tmpfs via mmap(), although tmpfs supports large
folios, we still establish mappings at the base page granularity, which is
unreasonable.
We can map multiple consecutive pages of tmpfs folios at once according to
the size of the large folio. On one hand, this can reduce the overhead of
page faults; on the other hand, it can leverage hardware architecture
optimizations to reduce TLB misses, such as contiguous PTEs on the ARM
architecture.
Moreover, tmpfs mount will use the 'huge=' option to control large folio
allocation explicitly. So it can be understood that the process's RSS
statistics might increase, and I think this will not cause any obvious
effects for users.
Performance test:
I created a 1G tmpfs file, populated with 64K large folios, and write-accessed it
sequentially via mmap(). I observed a significant performance improvement:
Before the patch:
real 0m0.158s
user 0m0.008s
sys 0m0.150s
After the patch:
real 0m0.021s
user 0m0.004s
sys 0m0.017s
Link: https://lkml.kernel.org/r/440940e78aeb7430c5cc8b6d2088ae98265b9809.1751599072.git.baolin.wang@linux.alibaba.com
Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
- Fix potential memory leak reported by kmemleak
* tag 'efi-fixes-for-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efivarfs: Fix memory leak of efivarfs_fs_info in fs_context error paths
|
|
In light of bindgen being unable to generate bindings for macros, and
owing to the widespread use of these macros in drivers, manually define
the bit and genmask C macros in Rust.
The *_checked version of the functions provide runtime checking while
the const version performs compile-time assertions on the arguments via
the build_assert!() macro.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250714-topics-tyr-genmask2-v9-1-9e6422cbadb6@collabora.com
[ `expect`ed Clippy warning in doctests, hid single `use`, grouped
examples. Reworded title. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Replace `ListLinksSelfPtr::LIST_LINKS_SELF_PTR_OFFSET` with `unsafe fn
raw_get_self_ptr` which returns a pointer to the field rather than
requiring the caller to do pointer arithmetic.
Implement `HasListLinks::raw_get_list_links` in `impl_has_list_links!`,
narrowing the interface of `HasListLinks` and replacing pointer
arithmetic with `container_of!`.
Modify `impl_list_item` to also invoke `impl_has_list_links!` or
`impl_has_list_links_self_ptr!`. This is necessary to allow
`impl_list_item` to see more of the tokens used by
`impl_has_list_links{,_self_ptr}!`.
A similar API change was discussed on the hrtimer series[1].
Link: https://lore.kernel.org/all/20250224-hrtimer-v3-v6-12-rc2-v9-1-5bd3bf0ce6cc@kernel.org/ [1]
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-6-a429e75840a9@gmail.com
[ Fixed broken intra-doc links. Used the renamed
`Opaque::cast_into`. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
There's a comprehensive example in `rust/kernel/list.rs` but it doesn't
exercise the `using ListLinksSelfPtr` variant nor the generic cases. Add
that here. Generalize `impl_has_list_links_self_ptr` to handle nested
fields in the same manner as `impl_has_list_links`.
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-5-a429e75840a9@gmail.com
[ Fixed Rust < 1.82 build by enabling the `offset_of_nested`
feature. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Use a fully qualified path rooted at `$crate` rather than relying on
imports in the invoking scope.
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-4-a429e75840a9@gmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Refer to the self parameter of `impl_list_item!` by the same name used
in `impl_has_list_links{,_self_ptr}!`.
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-3-a429e75840a9@gmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Refer to the type parameters of `impl_has_list_links{,_self_ptr}!` by
the same name used in `impl_list_item!`. Capture type parameters of
`impl_list_item!` as `tt` using `{}` to match the style of all other
macros that work with generics.
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-2-a429e75840a9@gmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Avoid manually capturing generics; use `ty` to capture the whole type
instead.
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250709-list-no-offset-v4-1-a429e75840a9@gmail.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
When we renamed `Opaque::raw_get` to `cast_into`, there was one
replacement that was not supposed to be there.
It does not cause an issue so far because it is inside a macro rule (the
`ListLinksSelfPtr` one) that is unused so far. However, it will start
to be used soon.
Thus fix it now.
Fixes: 64fb810bce03 ("rust: types: rename Opaque::raw_get to cast_into")
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250719183649.596051-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
When a module is loaded, it adds trace events defined by the module. It
may also need to modify the modules trace printk formats to replace enum
names with their values.
If two modules are loaded at the same time, the adding of the event to the
ftrace_events list can corrupt the walking of the list in the code that is
modifying the printk format strings and crash the kernel.
The addition of the event should take the trace_event_sem for write while
it adds the new event.
Also add a lockdep_assert_held() on that semaphore in
__trace_add_event_dirs() as it iterates the list.
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/20250718223158.799bfc0c@batman.local.home
Reported-by: Fusheng Huang(黄富生) <Fusheng.Huang@luxshare-ict.com>
Closes: https://lore.kernel.org/all/20250717105007.46ccd18f@batman.local.home/
Fixes: 110bf2b764eb6 ("tracing: add protection around module events unload")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix handling of migration disabled tasks in default idle selection
- update_locked_rq() called __this_cpu_write() spuriously with NULL
when @rq was not locked. As the writes were spurious, it didn't break
anything directly. However, the function could be called in a
preemptible leading to a context warning in __this_cpu_write(). Skip
the spurious NULL writes.
- Selftest fix on UP
* tag 'sched_ext-for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: idle: Handle migration-disabled tasks in idle selection
sched/ext: Prevent update_locked_rq() calls with NULL rq
selftests/sched_ext: Fix exit selftest hang on UP
|
|
Set a DMA mask for the `pci::Device` in the Rust DMA sample driver.
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250716150354.51081-6-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
The platform bus is potentially capable of performing DMA, hence implement
the `dma:Device` trait for `platform::Device`.
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250716150354.51081-5-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
The PCI bus is potentially capable of performing DMA, hence implement
the `dma:Device` trait for `pci::Device`.
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250716150354.51081-4-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Implement `dma_set_mask()`, `dma_set_coherent_mask()` and
`dma_set_mask_and_coherent()` in the `dma::Device` trait.
Those methods are used to set up the device's DMA addressing
capabilities.
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250716150354.51081-3-dakr@kernel.org
[ Add DmaMask::try_new(). - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux into locking/core
Locking changes for v6.17:
- General
- Mark devm_mutex_init() as __must_check
- Add #[must_use] to Lock::try_lock()
- Remove OWNER_SPINNABLE in rwsem
- Remove redundant #ifdefs in mutex
- Lockdep
- Avoid returning struct in lock_stats()
- Change `static const` into enum for LOCKF_*_IRQ_*
- Temporarily use synchronize_rcu_expedited() in
lockdep_unregister_key() to speed things up.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
|
|
Add a trait that defines the DMA specific methods of devices.
The `dma::Device` trait is to be implemented by bus device
representations, where the underlying bus is capable of DMA, such as
`pci::Device` or `platform::Device`.
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250716150354.51081-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
"An earlier commit to suppress a warning introduced a race condition
where tasks can escape cgroup1 freezer. Revert the commit and simply
remove the warning which was spurious to begin with"
* tag 'cgroup-for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
Revert "cgroup_freezer: cgroup_freezing: Check if not frozen"
sched,freezer: Remove unnecessary warning in __thaw_task
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- corsair-cpro: Validate the size of the received input buffer
- ina238: Report energy in microjoules as expected by the ABI
- pmbus/ucd9000: Fixed GPIO functionality
* tag 'hwmon-for-v6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus/ucd9000) Fix error in ucd9000_gpio_set
hwmon: (ina238) Report energy in microjoules
hwmon: (corsair-cpro) Validate the size of the received input buffer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Fix build and modpost confusion for the upcoming Rust 1.89.0
release
- Clean objtool warning for the upcoming Rust 1.89.0 release by
adding one more noreturn function
'kernel' crate:
- Fix build error when using generics in the 'try_{,pin_}init!'
macros"
* tag 'rust-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: use `#[used(compiler)]` to fix build and `modpost` with Rust >= 1.89.0
objtool/rust: add one more `noreturn` Rust function for Rust 1.89.0
rust: init: Fix generics in *_init! macros
|
|
Currently, the mac80211 layer handles construction and parsing
of 802.11 headers during packet transmission and reception.
Offloading encapsulation and decapsulation to hardware can
significantly enhance performance. Check the service bit to determine
if the firmware supports Ethernet offload. If supported, advertise the
capability to mac80211 to bypass software-based 802.11 header
processing.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00217-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718025513.32982-4-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Currently, the ath12k driver supports only the native Wi-Fi
frame format. In this mode, the ieee80211_tx_status() function
works correctly to report transmission status to mac80211, as
it retrieves station information using sta_info_get_by_addrs().
However, this method is not applicable for Ethernet-converted
packets, since sta_info_get_by_addrs() cannot extract station
information from such formats.
Retrieve station information using ath12k_peer_find_by_id() to
support all frame formats, including native Wi-Fi, raw, and Ethernet.
Report transmission status using ieee80211_tx_status_ext(), and
include rate information as part of the datapath TX status report.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00217-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718025513.32982-3-nithyanantham.paramasivam@oss.qualcomm.com
[changed instances of { 0 } to {}]
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
/proc/cgroups lists only v1 controllers by default, however, this is
only enforced since the commit af000ce85293b ("cgroup: Do not report
unavailable v1 controllers in /proc/cgroups") and there is software in
the wild that uses content of /proc/cgroups to decide on availability of
v2 (sic) controllers.
Add a boottime param that can bring back the previous behavior for
setups where the check in the software cannot be changed and it causes
e.g. unintended OOMs.
Also, this patch takes out cgrp_v1_visible from cgroup1_subsys_absent()
guard since it's only important to check which hierarchy (v1 vs v2) the
subsys is attached to. This has no effect on the printed message but
the code is cleaner since cgrp_v1_visible is really about mounted
hierarchies, not the content of /proc/cgroups.
Link: https://lore.kernel.org/r/b26b60b7d0d2a5ecfd2f3c45f95f32922ed24686.camel@decadent.org.uk
Fixes: af000ce85293b ("cgroup: Do not report unavailable v1 controllers in /proc/cgroups")
Fixes: a0ab1453226d8 ("cgroup: Print message when /proc/cgroups is read on v2-only system")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently, in the transmit (TX) direction, EAPOL, QoS NULL,
and multicast frames are sent in native Wi-Fi (802.11) format.
However, when the virtual interface is configured in Ethernet
mode, transmission fails for packets enqueued in native Wi-fi format.
To address this issue, the firmware should be instructed to
treat these packets as RAW type packets, ensuring proper
handling even when the interface operates in Ethernet mode.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00217-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718025513.32982-2-nithyanantham.paramasivam@oss.qualcomm.com
[fix indentation]
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
The test for "if (!desc_data)" does not work correctly because the list
iterator in a list_for_each_entry() loop is always non-NULL. If we don't
exit via a break, then it points to invalid memory. Instead, use a tmp
variable for the list iterator and only set the "desc_data" when we have
found a match.
Fixes: 1cd53df733c2 ("gpio: sysfs: don't look up exported lines as class devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/747545bf-05f0-4f89-ba77-cb96bf9041f1@sabinyo.mountain
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Instead of passing individual fields of struct pinfunction to
pinmux_generic_add_function(), use pinmux_generic_add_pinfunction() and
pass the entire structure directly.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/20250709-pinctrl-gpio-pinfuncs-v2-7-b6135149c0d9@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Instead of passing individual fields of struct pinfunction to
pinmux_generic_add_function(), use pinmux_generic_add_pinfunction() and
pass the entire structure directly.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250709-pinctrl-gpio-pinfuncs-v2-6-b6135149c0d9@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Instead of passing individual fields of struct pinfunction to
pinmux_generic_add_function(), use pinmux_generic_add_pinfunction() and
pass the entire structure directly.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250709-pinctrl-gpio-pinfuncs-v2-5-b6135149c0d9@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Instead of passing individual fields of struct pinfunction to
pinmux_generic_add_function(), use pinmux_generic_add_pinfunction() and
pass the entire structure directly.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250709-pinctrl-gpio-pinfuncs-v2-4-b6135149c0d9@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Instead of passing individual fields of struct pinfunction to
pinmux_generic_add_function(), use pinmux_generic_add_pinfunction() and
pass the entire structure directly.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250709-pinctrl-gpio-pinfuncs-v2-3-b6135149c0d9@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|