From dd34d9fe3b427dea47f81443d022dbaec4339f7e Mon Sep 17 00:00:00 2001 From: Anthony Yznaga Date: Tue, 29 Aug 2023 17:43:24 -0700 Subject: mm: fix unaccount of memory on vma_link() failure Fix insert_vm_struct() so that only accounted memory is unaccounted if vma_link() fails. Link: https://lkml.kernel.org/r/20230830004324.16101-1-anthony.yznaga@oracle.com Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree") Signed-off-by: Anthony Yznaga Reviewed-by: Liam R. Howlett Cc: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton --- mm/mmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index b56a7f0c9f85..1ec6c875a7f9 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3278,7 +3278,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma) } if (vma_link(mm, vma)) { - vm_unacct_memory(charged); + if (vma->vm_flags & VM_ACCOUNT) + vm_unacct_memory(charged); return -ENOMEM; } -- cgit From 7fa38d0ea00ffe2cd3c95c96c85221b8ae221875 Mon Sep 17 00:00:00 2001 From: Xiu Jianfeng Date: Wed, 6 Sep 2023 10:33:12 +0000 Subject: mm: remove duplicated vma->vm_flags check when expanding stack expand_upwards() and expand_downwards() will return -EFAULT if VM_GROWSUP or VM_GROWSDOWN is not correctly set in vma->vm_flags, however in !CONFIG_STACK_GROWSUP case, expand_stack_locked() returns -EINVAL first if !(vma->vm_flags & VM_GROWSDOWN) before calling expand_downwards(), to keep the consistency with CONFIG_STACK_GROWSUP case, remove this check. The usages of this function are as below: A:fs/exec.c ret = expand_stack_locked(vma, stack_base); if (ret) ret = -EFAULT; or B:mm/memory.c mm/mmap.c if (expand_stack_locked(vma, addr)) return NULL; which means the return value will not propagate to other places, so I believe there is no user-visible effects of this change, and it's unnecessary to backport to earlier versions. Link: https://lkml.kernel.org/r/20230906103312.645712-1-xiujianfeng@huaweicloud.com Fixes: f440fa1ac955 ("mm: make find_extend_vma() fail if write lock not held") Signed-off-by: Xiu Jianfeng Cc: Liam R. Howlett Cc: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton --- mm/mmap.c | 2 -- 1 file changed, 2 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 1ec6c875a7f9..42870c4d8704 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2159,8 +2159,6 @@ struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm, unsigned lon #else int expand_stack_locked(struct vm_area_struct *vma, unsigned long address) { - if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) - return -EINVAL; return expand_downwards(vma, address); } -- cgit From 9a12d103f7d2d524f5e3d4358b9a9c03e4a9f1d2 Mon Sep 17 00:00:00 2001 From: "Liam R. Howlett" Date: Fri, 29 Sep 2023 14:30:41 -0400 Subject: mmap: add clarifying comment to vma_merge() code When tracing through the code in vma_merge(), it was not completely clear why the error return to a dup_anon_vma() call would not overwrite a previous attempt to the same function. This commit adds a comment specifying why it is safe. Link: https://lkml.kernel.org/r/20230929183041.2835469-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett Suggested-by: Jann Horn Link: https://lore.kernel.org/linux-mm/CAG48ez3iDwFPR=Ed1BfrNuyUJPMK_=StjxhUsCkL6po1s7bONg@mail.gmail.com/ Reviewed-by: Lorenzo Stoakes Acked-by: Vlastimil Babka Cc: Matthew Wilcox (Oracle) Cc: Suren Baghdasaryan Signed-off-by: Andrew Morton --- mm/mmap.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 011ba48d9e72..673429ee8a9e 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -943,6 +943,11 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, vma_start_write(curr); remove = curr; remove2 = next; + /* + * Note that the dup_anon_vma below cannot overwrite err + * since the first caller would do nothing unless next + * has an anon_vma. + */ if (!next->anon_vma) err = dup_anon_vma(prev, curr, &anon_dup); } -- cgit From 94d7d923395129b9248777e575c877e40007f9dc Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 11 Oct 2023 18:04:28 +0100 Subject: mm: abstract the vma_merge()/split_vma() pattern for mprotect() et al. mprotect() and other functions which change VMA parameters over a range each employ a pattern of:- 1. Attempt to merge the range with adjacent VMAs. 2. If this fails, and the range spans a subset of the VMA, split it accordingly. This is open-coded and duplicated in each case. Also in each case most of the parameters passed to vma_merge() remain the same. Create a new function, vma_modify(), which abstracts this operation, accepting only those parameters which can be changed. To avoid the mess of invoking each function call with unnecessary parameters, create inline wrapper functions for each of the modify operations, parameterised only by what is required to perform the action. We can also significantly simplify the logic - by returning the VMA if we split (or merged VMA if we do not) we no longer need specific handling for merge/split cases in any of the call sites. Note that the userfaultfd_release() case works even though it does not split VMAs - since start is set to vma->vm_start and end is set to vma->vm_end, the split logic does not trigger. In addition, since we calculate pgoff to be equal to vma->vm_pgoff + (start - vma->vm_start) >> PAGE_SHIFT, and start - vma->vm_start will be 0 in this instance, this invocation will remain unchanged. We eliminate a VM_WARN_ON() in mprotect_fixup() as this simply asserts that vma_merge() correctly ensures that flags remain the same, something that is already checked in is_mergeable_vma() and elsewhere, and in any case is not specific to mprotect(). Link: https://lkml.kernel.org/r/0dfa9368f37199a423674bf0ee312e8ea0619044.1697043508.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Vlastimil Babka Cc: Alexander Viro Cc: Christian Brauner Cc: Liam R. Howlett Signed-off-by: Andrew Morton --- mm/mmap.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 673429ee8a9e..bca685820763 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2437,6 +2437,54 @@ int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, return __split_vma(vmi, vma, addr, new_below); } +/* + * We are about to modify one or multiple of a VMA's flags, policy, userfaultfd + * context and anonymous VMA name within the range [start, end). + * + * As a result, we might be able to merge the newly modified VMA range with an + * adjacent VMA with identical properties. + * + * If no merge is possible and the range does not span the entirety of the VMA, + * we then need to split the VMA to accommodate the change. + * + * The function returns either the merged VMA, the original VMA if a split was + * required instead, or an error if the split failed. + */ +struct vm_area_struct *vma_modify(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long vm_flags, + struct mempolicy *policy, + struct vm_userfaultfd_ctx uffd_ctx, + struct anon_vma_name *anon_name) +{ + pgoff_t pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); + struct vm_area_struct *merged; + + merged = vma_merge(vmi, vma->vm_mm, prev, start, end, vm_flags, + vma->anon_vma, vma->vm_file, pgoff, policy, + uffd_ctx, anon_name); + if (merged) + return merged; + + if (vma->vm_start < start) { + int err = split_vma(vmi, vma, start, 1); + + if (err) + return ERR_PTR(err); + } + + if (vma->vm_end > end) { + int err = split_vma(vmi, vma, end, 0); + + if (err) + return ERR_PTR(err); + } + + return vma; +} + /* * do_vmi_align_munmap() - munmap the aligned region from @start to @end. * @vmi: The vma iterator -- cgit From adb20b0c785e9e8d5e4d4a07b9c3340d7de0e5dc Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 11 Oct 2023 18:04:29 +0100 Subject: mm: make vma_merge() and split_vma() internal Now the common pattern of - attempting a merge via vma_merge() and should this fail splitting VMAs via split_vma() - has been abstracted, the former can be placed into mm/internal.h and the latter made static. In addition, the split_vma() nommu variant also need not be exported. Link: https://lkml.kernel.org/r/405f2be10e20c4e9fbcc9fe6b2dfea105f6642e0.1697043508.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Vlastimil Babka Cc: Alexander Viro Cc: Christian Brauner Cc: Liam R. Howlett Signed-off-by: Andrew Morton --- mm/mmap.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index bca685820763..a516f2412f79 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2346,8 +2346,8 @@ static void unmap_region(struct mm_struct *mm, struct ma_state *mas, * has already been checked or doesn't make sense to fail. * VMA Iterator will point to the end VMA. */ -int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long addr, int new_below) +static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long addr, int new_below) { struct vma_prepare vp; struct vm_area_struct *new; @@ -2428,8 +2428,8 @@ out_free_vma: * Split a vma into two pieces at address 'addr', a new vma is allocated * either for the first part or the tail. */ -int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long addr, int new_below) +static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long addr, int new_below) { if (vma->vm_mm->map_count >= sysctl_max_map_count) return -ENOMEM; -- cgit From 4b5f2d20169827add8875e78a352cf956ccdd55a Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 11 Oct 2023 18:04:30 +0100 Subject: mm: abstract merge for new VMAs into vma_merge_new_vma() Only in mmap_region() and copy_vma() do we attempt to merge VMAs which occupy entirely new regions of virtual memory. We can abstract this logic and make the intent of this invocations of it completely explicit, rather than invoking vma_merge() with an inscrutable wall of parameters. This also paves the way for a simplification of the core vma_merge() implementation, as we seek to make it entirely an implementation detail. The VMA merge call in mmap_region() occurs only for file-backed mappings, where each of the parameters previously specified as NULL are defaulted to NULL in vma_init() (called by vm_area_alloc()). This matches the previous behaviour of specifying NULL for a number of fields, however note that prior to this call we pass the VMA to the file system driver via call_mmap(), which may in theory adjust fields that we pass in to vma_merge_new_vma(). Therefore we actually resolve an oversight here by allowing for the fact that the driver may have done this. Link: https://lkml.kernel.org/r/3dc71d17e307756a54781d4a4ce7315cf8b18bea.1697043508.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Vlastimil Babka Cc: Alexander Viro Cc: Christian Brauner Cc: Liam R. Howlett Signed-off-by: Andrew Morton --- mm/mmap.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index a516f2412f79..e5e50e547ebf 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2485,6 +2485,20 @@ struct vm_area_struct *vma_modify(struct vma_iterator *vmi, return vma; } +/* + * Attempt to merge a newly mapped VMA with those adjacent to it. The caller + * must ensure that [start, end) does not overlap any existing VMA. + */ +static struct vm_area_struct +*vma_merge_new_vma(struct vma_iterator *vmi, struct vm_area_struct *prev, + struct vm_area_struct *vma, unsigned long start, + unsigned long end, pgoff_t pgoff) +{ + return vma_merge(vmi, vma->vm_mm, prev, start, end, vma->vm_flags, + vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), + vma->vm_userfaultfd_ctx, anon_vma_name(vma)); +} + /* * do_vmi_align_munmap() - munmap the aligned region from @start to @end. * @vmi: The vma iterator @@ -2840,10 +2854,9 @@ cannot_expand: * vma again as we may succeed this time. */ if (unlikely(vm_flags != vma->vm_flags && prev)) { - merge = vma_merge(&vmi, mm, prev, vma->vm_start, - vma->vm_end, vma->vm_flags, NULL, - vma->vm_file, vma->vm_pgoff, NULL, - NULL_VM_UFFD_CTX, NULL); + merge = vma_merge_new_vma(&vmi, prev, vma, + vma->vm_start, vma->vm_end, + vma->vm_pgoff); if (merge) { /* * ->mmap() can change vma->vm_file and fput @@ -3385,9 +3398,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, if (new_vma && new_vma->vm_start < addr + len) return NULL; /* should never get here */ - new_vma = vma_merge(&vmi, mm, prev, addr, addr + len, vma->vm_flags, - vma->anon_vma, vma->vm_file, pgoff, vma_policy(vma), - vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + new_vma = vma_merge_new_vma(&vmi, prev, vma, addr, addr + len, pgoff); if (new_vma) { /* * Source vma may have been merged into new_vma -- cgit From 93bf5d4aa27d4ba528f71483ae51fbd70edb3ce8 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 11 Oct 2023 18:04:31 +0100 Subject: mm: abstract VMA merge and extend into vma_merge_extend() helper mremap uses vma_merge() in the case where a VMA needs to be extended. This can be significantly simplified and abstracted. This makes it far easier to understand what the actual function is doing, avoids future mistakes in use of the confusing vma_merge() function and importantly allows us to make future changes to how vma_merge() is implemented by knowing explicitly which merge cases each invocation uses. Note that in the mremap() extend case, we perform this merge only when old_len == vma->vm_end - addr. The extension_start, i.e. the start of the extended portion of the VMA is equal to addr + old_len, i.e. vma->vm_end. With this refactoring, vma_merge() is no longer required anywhere except mm/mmap.c, so mark it static. Link: https://lkml.kernel.org/r/f16cbdc2e72d37a1a097c39dc7d1fee8919a1c93.1697043508.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Vlastimil Babka Cc: Alexander Viro Cc: Christian Brauner Cc: Liam R. Howlett Signed-off-by: Andrew Morton --- mm/mmap.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index e5e50e547ebf..3ea52451623b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -860,13 +860,13 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, * **** is not represented - it will be merged and the vma containing the * area is returned, or the function will return NULL */ -struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, - struct vm_area_struct *prev, unsigned long addr, - unsigned long end, unsigned long vm_flags, - struct anon_vma *anon_vma, struct file *file, - pgoff_t pgoff, struct mempolicy *policy, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx, - struct anon_vma_name *anon_name) +static struct vm_area_struct +*vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, + struct vm_area_struct *prev, unsigned long addr, unsigned long end, + unsigned long vm_flags, struct anon_vma *anon_vma, struct file *file, + pgoff_t pgoff, struct mempolicy *policy, + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name) { struct vm_area_struct *curr, *next, *res; struct vm_area_struct *vma, *adjust, *remove, *remove2; @@ -2499,6 +2499,23 @@ static struct vm_area_struct vma->vm_userfaultfd_ctx, anon_vma_name(vma)); } +/* + * Expand vma by delta bytes, potentially merging with an immediately adjacent + * VMA with identical properties. + */ +struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi, + struct vm_area_struct *vma, + unsigned long delta) +{ + pgoff_t pgoff = vma->vm_pgoff + vma_pages(vma); + + /* vma is specified as prev, so case 1 or 2 will apply. */ + return vma_merge(vmi, vma->vm_mm, vma, vma->vm_end, vma->vm_end + delta, + vma->vm_flags, vma->anon_vma, vma->vm_file, pgoff, + vma_policy(vma), vma->vm_userfaultfd_ctx, + anon_vma_name(vma)); +} + /* * do_vmi_align_munmap() - munmap the aligned region from @start to @end. * @vmi: The vma iterator -- cgit From e8e17ee90eaf650c855adb0a3e5e965fd6692ff1 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Thu, 12 Oct 2023 18:04:28 +0100 Subject: mm: drop the assumption that VM_SHARED always implies writable Patch series "permit write-sealed memfd read-only shared mappings", v4. The man page for fcntl() describing memfd file seals states the following about F_SEAL_WRITE:- Furthermore, trying to create new shared, writable memory-mappings via mmap(2) will also fail with EPERM. With emphasis on 'writable'. In turns out in fact that currently the kernel simply disallows all new shared memory mappings for a memfd with F_SEAL_WRITE applied, rendering this documentation inaccurate. This matters because users are therefore unable to obtain a shared mapping to a memfd after write sealing altogether, which limits their usefulness. This was reported in the discussion thread [1] originating from a bug report [2]. This is a product of both using the struct address_space->i_mmap_writable atomic counter to determine whether writing may be permitted, and the kernel adjusting this counter when any VM_SHARED mapping is performed and more generally implicitly assuming VM_SHARED implies writable. It seems sensible that we should only update this mapping if VM_MAYWRITE is specified, i.e. whether it is possible that this mapping could at any point be written to. If we do so then all we need to do to permit write seals to function as documented is to clear VM_MAYWRITE when mapping read-only. It turns out this functionality already exists for F_SEAL_FUTURE_WRITE - we can therefore simply adapt this logic to do the same for F_SEAL_WRITE. We then hit a chicken and egg situation in mmap_region() where the check for VM_MAYWRITE occurs before we are able to clear this flag. To work around this, perform this check after we invoke call_mmap(), with careful consideration of error paths. Thanks to Andy Lutomirski for the suggestion! [1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/ [2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238 This patch (of 3): There is a general assumption that VMAs with the VM_SHARED flag set are writable. If the VM_MAYWRITE flag is not set, then this is simply not the case. Update those checks which affect the struct address_space->i_mmap_writable field to explicitly test for this by introducing [vma_]is_shared_maywrite() helper functions. This remains entirely conservative, as the lack of VM_MAYWRITE guarantees that the VMA cannot be written to. Link: https://lkml.kernel.org/r/cover.1697116581.git.lstoakes@gmail.com Link: https://lkml.kernel.org/r/d978aefefa83ec42d18dfa964ad180dbcde34795.1697116581.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Suggested-by: Andy Lutomirski Reviewed-by: Jan Kara Cc: Alexander Viro Cc: Christian Brauner Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Signed-off-by: Andrew Morton --- mm/mmap.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 3ea52451623b..0041e3631f6c 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -107,7 +107,7 @@ void vma_set_page_prot(struct vm_area_struct *vma) static void __remove_shared_vm_struct(struct vm_area_struct *vma, struct file *file, struct address_space *mapping) { - if (vma->vm_flags & VM_SHARED) + if (vma_is_shared_maywrite(vma)) mapping_unmap_writable(mapping); flush_dcache_mmap_lock(mapping); @@ -384,7 +384,7 @@ static unsigned long count_vma_pages_range(struct mm_struct *mm, static void __vma_link_file(struct vm_area_struct *vma, struct address_space *mapping) { - if (vma->vm_flags & VM_SHARED) + if (vma_is_shared_maywrite(vma)) mapping_allow_writable(mapping); flush_dcache_mmap_lock(mapping); @@ -2846,7 +2846,7 @@ cannot_expand: vma->vm_pgoff = pgoff; if (file) { - if (vm_flags & VM_SHARED) { + if (is_shared_maywrite(vm_flags)) { error = mapping_map_writable(file->f_mapping); if (error) goto free_vma; @@ -2920,7 +2920,7 @@ cannot_expand: mm->map_count++; if (vma->vm_file) { i_mmap_lock_write(vma->vm_file->f_mapping); - if (vma->vm_flags & VM_SHARED) + if (vma_is_shared_maywrite(vma)) mapping_allow_writable(vma->vm_file->f_mapping); flush_dcache_mmap_lock(vma->vm_file->f_mapping); @@ -2937,7 +2937,7 @@ cannot_expand: /* Once vma denies write, undo our temporary denial count */ unmap_writable: - if (file && vm_flags & VM_SHARED) + if (file && is_shared_maywrite(vm_flags)) mapping_unmap_writable(file->f_mapping); file = vma->vm_file; ksm_add_vma(vma); @@ -2985,7 +2985,7 @@ unmap_and_free_vma: unmap_region(mm, &vmi.mas, vma, prev, next, vma->vm_start, vma->vm_end, vma->vm_end, true); } - if (file && (vm_flags & VM_SHARED)) + if (file && is_shared_maywrite(vm_flags)) mapping_unmap_writable(file->f_mapping); free_vma: vm_area_free(vma); -- cgit From 158978945f3173b8c1a88f8c5684a629736a57ac Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Thu, 12 Oct 2023 18:04:30 +0100 Subject: mm: perform the mapping_map_writable() check after call_mmap() In order for a F_SEAL_WRITE sealed memfd mapping to have an opportunity to clear VM_MAYWRITE, we must be able to invoke the appropriate vm_ops->mmap() handler to do so. We would otherwise fail the mapping_map_writable() check before we had the opportunity to avoid it. This patch moves this check after the call_mmap() invocation. Only memfd actively denies write access causing a potential failure here (in memfd_add_seals()), so there should be no impact on non-memfd cases. This patch makes the userland-visible change that MAP_SHARED, PROT_READ mappings of an F_SEAL_WRITE sealed memfd mapping will now succeed. There is a delicate situation with cleanup paths assuming that a writable mapping must have occurred in circumstances where it may now not have. In order to ensure we do not accidentally mark a writable file unwritable by mistake, we explicitly track whether we have a writable mapping and unmap only if we do. [lstoakes@gmail.com: do not set writable_file_mapping in inappropriate case] Link: https://lkml.kernel.org/r/c9eb4cc6-7db4-4c2b-838d-43a0b319a4f0@lucifer.local Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238 Link: https://lkml.kernel.org/r/55e413d20678a1bb4c7cce889062bbb07b0df892.1697116581.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Jan Kara Cc: Alexander Viro Cc: Andy Lutomirski Cc: Christian Brauner Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Mike Kravetz Cc: Muchun Song Signed-off-by: Andrew Morton --- mm/mmap.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 0041e3631f6c..8b57e42fd980 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2752,6 +2752,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, unsigned long charged = 0; unsigned long end = addr + len; unsigned long merge_start = addr, merge_end = end; + bool writable_file_mapping = false; pgoff_t vm_pgoff; int error; VMA_ITERATOR(vmi, mm, addr); @@ -2846,17 +2847,19 @@ cannot_expand: vma->vm_pgoff = pgoff; if (file) { - if (is_shared_maywrite(vm_flags)) { - error = mapping_map_writable(file->f_mapping); - if (error) - goto free_vma; - } - vma->vm_file = get_file(file); error = call_mmap(file, vma); if (error) goto unmap_and_free_vma; + if (vma_is_shared_maywrite(vma)) { + error = mapping_map_writable(file->f_mapping); + if (error) + goto close_and_free_vma; + + writable_file_mapping = true; + } + /* * Expansion is handled above, merging is handled below. * Drivers should not alter the address of the VMA. @@ -2937,7 +2940,7 @@ cannot_expand: /* Once vma denies write, undo our temporary denial count */ unmap_writable: - if (file && is_shared_maywrite(vm_flags)) + if (writable_file_mapping) mapping_unmap_writable(file->f_mapping); file = vma->vm_file; ksm_add_vma(vma); @@ -2985,7 +2988,7 @@ unmap_and_free_vma: unmap_region(mm, &vmi.mas, vma, prev, next, vma->vm_start, vma->vm_end, vma->vm_end, true); } - if (file && is_shared_maywrite(vm_flags)) + if (writable_file_mapping) mapping_unmap_writable(file->f_mapping); free_vma: vm_area_free(vma); -- cgit From be16dd764a69752a31096d1a6b2ad775b728b1bd Mon Sep 17 00:00:00 2001 From: Muhammad Muzammil Date: Mon, 23 Oct 2023 17:44:05 +0500 Subject: mm: fix multiple typos in multiple files Link: https://lkml.kernel.org/r/20231023124405.36981-1-m.muzzammilashraf@gmail.com Signed-off-by: Muhammad Muzammil Reviewed-by: Randy Dunlap Cc: "James E.J. Bottomley" Cc: Matthew Wilcox (Oracle) Cc: Muhammad Muzammil Signed-off-by: Andrew Morton --- mm/mmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'mm/mmap.c') diff --git a/mm/mmap.c b/mm/mmap.c index 8b57e42fd980..984804d77ae1 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1223,7 +1223,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, * Does the application expect PROT_READ to imply PROT_EXEC? * * (the exception is when the underlying filesystem is noexec - * mounted, in which case we dont add PROT_EXEC.) + * mounted, in which case we don't add PROT_EXEC.) */ if ((prot & PROT_READ) && (current->personality & READ_IMPLIES_EXEC)) if (!(file && path_noexec(&file->f_path))) -- cgit