summaryrefslogtreecommitdiff
path: root/Documentation/mm
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/mm')
-rw-r--r--Documentation/mm/arch_pgtable_helpers.rst14
-rw-r--r--Documentation/mm/damon/design.rst4
-rw-r--r--Documentation/mm/damon/maintainer-profile.rst35
-rw-r--r--Documentation/mm/page_migration.rst39
-rw-r--r--Documentation/mm/physical_memory.rst2
-rw-r--r--Documentation/mm/process_addrs.rst54
6 files changed, 101 insertions, 47 deletions
diff --git a/Documentation/mm/arch_pgtable_helpers.rst b/Documentation/mm/arch_pgtable_helpers.rst
index af245161d8e7..ba2f658bc241 100644
--- a/Documentation/mm/arch_pgtable_helpers.rst
+++ b/Documentation/mm/arch_pgtable_helpers.rst
@@ -30,8 +30,6 @@ PTE Page Table Helpers
+---------------------------+--------------------------------------------------+
| pte_protnone | Tests a PROT_NONE PTE |
+---------------------------+--------------------------------------------------+
-| pte_devmap | Tests a ZONE_DEVICE mapped PTE |
-+---------------------------+--------------------------------------------------+
| pte_soft_dirty | Tests a soft dirty PTE |
+---------------------------+--------------------------------------------------+
| pte_swp_soft_dirty | Tests a soft dirty swapped PTE |
@@ -104,8 +102,6 @@ PMD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pmd_protnone | Tests a PROT_NONE PMD |
+---------------------------+--------------------------------------------------+
-| pmd_devmap | Tests a ZONE_DEVICE mapped PMD |
-+---------------------------+--------------------------------------------------+
| pmd_soft_dirty | Tests a soft dirty PMD |
+---------------------------+--------------------------------------------------+
| pmd_swp_soft_dirty | Tests a soft dirty swapped PMD |
@@ -177,8 +173,6 @@ PUD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pud_write | Tests a writable PUD |
+---------------------------+--------------------------------------------------+
-| pud_devmap | Tests a ZONE_DEVICE mapped PUD |
-+---------------------------+--------------------------------------------------+
| pud_mkyoung | Creates a young PUD |
+---------------------------+--------------------------------------------------+
| pud_mkold | Creates an old PUD |
@@ -242,13 +236,13 @@ SWAP Page Table Helpers
========================
+---------------------------+--------------------------------------------------+
-| __pte_to_swp_entry | Creates a swapped entry (arch) from a mapped PTE |
+| __pte_to_swp_entry | Creates a swp_entry_t (arch) from a swap PTE |
+---------------------------+--------------------------------------------------+
-| __swp_to_pte_entry | Creates a mapped PTE from a swapped entry (arch) |
+| __swp_entry_to_pte | Creates a swap PTE from a swp_entry_t (arch) |
+---------------------------+--------------------------------------------------+
-| __pmd_to_swp_entry | Creates a swapped entry (arch) from a mapped PMD |
+| __pmd_to_swp_entry | Creates a swp_entry_t (arch) from a swap PMD |
+---------------------------+--------------------------------------------------+
-| __swp_to_pmd_entry | Creates a mapped PMD from a swapped entry (arch) |
+| __swp_entry_to_pmd | Creates a swap PMD from a swp_entry_t (arch) |
+---------------------------+--------------------------------------------------+
| is_migration_entry | Tests a migration (read or write) swapped entry |
+-------------------------------+----------------------------------------------+
diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst
index ddc50db3afa4..03f8137256f5 100644
--- a/Documentation/mm/damon/design.rst
+++ b/Documentation/mm/damon/design.rst
@@ -452,9 +452,9 @@ that supports each action are as below.
- ``lru_deprio``: Deprioritize the region on its LRU lists.
Supported by ``paddr`` operations set.
- ``migrate_hot``: Migrate the regions prioritizing warmer regions.
- Supported by ``paddr`` operations set.
+ Supported by ``vaddr``, ``fvaddr`` and ``paddr`` operations set.
- ``migrate_cold``: Migrate the regions prioritizing colder regions.
- Supported by ``paddr`` operations set.
+ Supported by ``vaddr``, ``fvaddr`` and ``paddr`` operations set.
- ``stat``: Do nothing but count the statistics.
Supported by all operations sets.
diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
index ce3e98458339..5cd07905a193 100644
--- a/Documentation/mm/damon/maintainer-profile.rst
+++ b/Documentation/mm/damon/maintainer-profile.rst
@@ -7,9 +7,9 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR'
section of 'MAINTAINERS' file.
The mailing lists for the subsystem are damon@lists.linux.dev and
-linux-mm@kvack.org. Patches should be made against the `mm-unstable tree
-<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted
-to the mailing lists.
+linux-mm@kvack.org. Patches should be made against the `mm-new tree
+<https://git.kernel.org/akpm/mm/h/mm-new>`_ whenever possible and posted to the
+mailing lists.
SCM Trees
---------
@@ -17,17 +17,19 @@ SCM Trees
There are multiple Linux trees for DAMON development. Patches under
development or testing are queued in `damon/next
<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer.
-Sufficiently reviewed patches will be queued in `mm-unstable
-<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management
-subsystem maintainer. After more sufficient tests, the patches will be queued
-in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally
-pull-requested to the mainline by the memory management subsystem maintainer.
-
-Note again the patches for `mm-unstable tree
-<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory
-management subsystem maintainer. If the patches requires some patches in
-`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged
-in mm-unstable, please make sure the requirement is clearly specified.
+Sufficiently reviewed patches will be queued in `mm-new
+<https://git.kernel.org/akpm/mm/h/mm-new>`_ by the memory management subsystem
+maintainer. As more sufficient tests are done, the patches will move to
+`mm-unstable <https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and then to
+`mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_. And finally those
+will be pull-requested to the mainline by the memory management subsystem
+maintainer.
+
+Note again the patches for `mm-new tree
+<https://git.kernel.org/akpm/mm/h/mm-new>`_ are queued by the memory management
+subsystem maintainer. If the patches requires some patches in `damon/next tree
+<https://git.kernel.org/sj/h/damon/next>`_ which not yet merged in mm-new,
+please make sure the requirement is clearly specified.
Submit checklist addendum
-------------------------
@@ -53,8 +55,9 @@ Further doing below and putting the results will be helpful.
Key cycle dates
---------------
-Patches can be sent anytime. Key cycle dates of the `mm-unstable
-<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable
+Patches can be sent anytime. Key cycle dates of the `mm-new
+<https://git.kernel.org/akpm/mm/h/mm-new>`_, `mm-unstable
+<https://git.kernel.org/akpm/mm/h/mm-unstable>`_and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory
management subsystem maintainer.
diff --git a/Documentation/mm/page_migration.rst b/Documentation/mm/page_migration.rst
index 519b35a4caf5..34602b254aa6 100644
--- a/Documentation/mm/page_migration.rst
+++ b/Documentation/mm/page_migration.rst
@@ -146,18 +146,33 @@ Steps:
18. The new page is moved to the LRU and can be scanned by the swapper,
etc. again.
-Non-LRU page migration
-======================
-
-Although migration originally aimed for reducing the latency of memory
-accesses for NUMA, compaction also uses migration to create high-order
-pages. For compaction purposes, it is also useful to be able to move
-non-LRU pages, such as zsmalloc and virtio-balloon pages.
-
-If a driver wants to make its pages movable, it should define a struct
-movable_operations. It then needs to call __SetPageMovable() on each
-page that it may be able to move. This uses the ``page->mapping`` field,
-so this field is not available for the driver to use for other purposes.
+movable_ops page migration
+==========================
+
+Selected typed, non-folio pages (e.g., pages inflated in a memory balloon,
+zsmalloc pages) can be migrated using the movable_ops migration framework.
+
+The "struct movable_operations" provide callbacks specific to a page type
+for isolating, migrating and un-isolating (putback) these pages.
+
+Once a page is indicated as having movable_ops, that condition must not
+change until the page was freed back to the buddy. This includes not
+changing/clearing the page type and not changing/clearing the
+PG_movable_ops page flag.
+
+Arbitrary drivers cannot currently make use of this framework, as it
+requires:
+
+(a) a page type
+(b) indicating them as possibly having movable_ops in page_has_movable_ops()
+ based on the page type
+(c) returning the movable_ops from page_movable_ops() based on the page
+ type
+(d) not reusing the PG_movable_ops and PG_movable_ops_isolated page flags
+ for other purposes
+
+For example, balloon drivers can make use of this framework through the
+balloon-compaction infrastructure residing in the core kernel.
Monitoring Migration
=====================
diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
index d3ac106e6b14..9af11b5bd145 100644
--- a/Documentation/mm/physical_memory.rst
+++ b/Documentation/mm/physical_memory.rst
@@ -584,7 +584,7 @@ Compaction control
``compact_blockskip_flush``
Set to true when compaction migration scanner and free scanner meet, which
- means the ``PB_migrate_skip`` bits should be cleared.
+ means the ``PB_compact_skip`` bits should be cleared.
``contiguous``
Set to true when the zone is contiguous (in other words, no hole).
diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index e6756e78b476..be49e2a269e4 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
1. **Traversing** page tables - Simply reading page tables in order to traverse
them. This only requires that the VMA is kept stable, so a lock which
establishes this suffices for traversal (there are also lockless variants
- which eliminate even this requirement, such as :c:func:`!gup_fast`).
+ which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
+ also a special case of page table traversal for non-VMA regions which we
+ consider separately below.
2. **Installing** page table mappings - Whether creating a new mapping or
modifying an existing one in such a way as to change its identity. This
requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
@@ -335,15 +337,13 @@ ahead and perform these operations on page tables (though internally, kernel
operations that perform writes also acquire internal page table locks to
serialise - see the page table implementation detail section for more details).
+.. note:: We free empty PTE tables on zap under the RCU lock - this does not
+ change the aforementioned locking requirements around zapping.
+
When **installing** page table entries, the mmap or VMA lock must be held to
keep the VMA stable. We explore why this is in the page table locking details
section below.
-.. warning:: Page tables are normally only traversed in regions covered by VMAs.
- If you want to traverse page tables in areas that might not be
- covered by VMAs, heavier locking is required.
- See :c:func:`!walk_page_range_novma` for details.
-
**Freeing** page tables is an entirely internal memory management operation and
has special requirements (see the page freeing section below for more details).
@@ -355,6 +355,44 @@ has special requirements (see the page freeing section below for more details).
from the reverse mappings, but no other VMAs can be permitted to be
accessible and span the specified range.
+Traversing non-VMA page tables
+------------------------------
+
+We've focused above on traversal of page tables belonging to VMAs. It is also
+possible to traverse page tables which are not represented by VMAs.
+
+Kernel page table mappings themselves are generally managed but whatever part of
+the kernel established them and the aforementioned locking rules do not apply -
+for instance vmalloc has its own set of locks which are utilised for
+establishing and tearing down page its page tables.
+
+However, for convenience we provide the :c:func:`!walk_kernel_page_table_range`
+function which is synchronised via the mmap lock on the :c:macro:`!init_mm`
+kernel instantiation of the :c:struct:`!struct mm_struct` metadata object.
+
+If an operation requires exclusive access, a write lock is used, but if not, a
+read lock suffices - we assert only that at least a read lock has been acquired.
+
+Since, aside from vmalloc and memory hot plug, kernel page tables are not torn
+down all that often - this usually suffices, however any caller of this
+functionality must ensure that any additionally required locks are acquired in
+advance.
+
+We also permit a truly unusual case is the traversal of non-VMA ranges in
+**userland** ranges, as provided for by :c:func:`!walk_page_range_debug`.
+
+This has only one user - the general page table dumping logic (implemented in
+:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
+even if they are highly unusual (possibly architecture-specific) and are not
+backed by a VMA.
+
+We must take great care in this case, as the :c:func:`!munmap` implementation
+detaches VMAs under an mmap write lock before tearing down page tables under a
+downgraded mmap read lock.
+
+This means such an operation could race with this, and thus an mmap **write**
+lock is required.
+
Lock ordering
-------------
@@ -461,6 +499,10 @@ Locking Implementation Details
Page table locking details
--------------------------
+.. note:: This section explores page table locking requirements for page tables
+ encompassed by a VMA. See the above section on non-VMA page table
+ traversal for details on how we handle that case.
+
In addition to the locks described in the terminology section above, we have
additional locks dedicated to page tables: