From 426e5c429d16e4cd5ded46e21ff8e939bf8abd0f Mon Sep 17 00:00:00 2001 From: Muchun Song Date: Wed, 30 Jun 2021 18:47:00 -0700 Subject: mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c Patch series "Free some vmemmap pages of HugeTLB page", v23. This patch series will free some vmemmap pages(struct page structures) associated with each HugeTLB page when preallocated to save memory. In order to reduce the difficulty of the first version of code review. In this version, we disable PMD/huge page mapping of vmemmap if this feature was enabled. This acutely eliminates a bunch of the complex code doing page table manipulation. When this patch series is solid, we cam add the code of vmemmap page table manipulation in the future. The struct page structures (page structs) are used to describe a physical page frame. By default, there is an one-to-one mapping from a page frame to it's corresponding page struct. The HugeTLB pages consist of multiple base page size pages and is supported by many architectures. See hugetlbpage.rst in the Documentation directory for more details. On the x86 architecture, HugeTLB pages of size 2MB and 1GB are currently supported. Since the base page size on x86 is 4KB, a 2MB HugeTLB page consists of 512 base pages and a 1GB HugeTLB page consists of 4096 base pages. For each base page, there is a corresponding page struct. Within the HugeTLB subsystem, only the first 4 page structs are used to contain unique information about a HugeTLB page. HUGETLB_CGROUP_MIN_ORDER provides this upper limit. The only 'useful' information in the remaining page structs is the compound_head field, and this field is the same for all tail pages. By removing redundant page structs for HugeTLB pages, memory can returned to the buddy allocator for other uses. When the system boot up, every 2M HugeTLB has 512 struct page structs which size is 8 pages(sizeof(struct page) * 512 / PAGE_SIZE). HugeTLB struct pages(8 pages) page frame(8 pages) +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+ | | | 0 | -------------> | 0 | | | +-----------+ +-----------+ | | | 1 | -------------> | 1 | | | +-----------+ +-----------+ | | | 2 | -------------> | 2 | | | +-----------+ +-----------+ | | | 3 | -------------> | 3 | | | +-----------+ +-----------+ | | | 4 | -------------> | 4 | | 2MB | +-----------+ +-----------+ | | | 5 | -------------> | 5 | | | +-----------+ +-----------+ | | | 6 | -------------> | 6 | | | +-----------+ +-----------+ | | | 7 | -------------> | 7 | | | +-----------+ +-----------+ | | | | | | +-----------+ The value of page->compound_head is the same for all tail pages. The first page of page structs (page 0) associated with the HugeTLB page contains the 4 page structs necessary to describe the HugeTLB. The only use of the remaining pages of page structs (page 1 to page 7) is to point to page->compound_head. Therefore, we can remap pages 2 to 7 to page 1. Only 2 pages of page structs will be used for each HugeTLB page. This will allow us to free the remaining 6 pages to the buddy allocator. Here is how things look after remapping. HugeTLB struct pages(8 pages) page frame(8 pages) +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+ | | | 0 | -------------> | 0 | | | +-----------+ +-----------+ | | | 1 | -------------> | 1 | | | +-----------+ +-----------+ | | | 2 | ----------------^ ^ ^ ^ ^ ^ | | +-----------+ | | | | | | | | 3 | ------------------+ | | | | | | +-----------+ | | | | | | | 4 | --------------------+ | | | | 2MB | +-----------+ | | | | | | 5 | ----------------------+ | | | | +-----------+ | | | | | 6 | ------------------------+ | | | +-----------+ | | | | 7 | --------------------------+ | | +-----------+ | | | | | | +-----------+ When a HugeTLB is freed to the buddy system, we should allocate 6 pages for vmemmap pages and restore the previous mapping relationship. Apart from 2MB HugeTLB page, we also have 1GB HugeTLB page. It is similar to the 2MB HugeTLB page. We also can use this approach to free the vmemmap pages. In this case, for the 1GB HugeTLB page, we can save 4094 pages. This is a very substantial gain. On our server, run some SPDK/QEMU applications which will use 1024GB HugeTLB page. With this feature enabled, we can save ~16GB (1G hugepage)/~12GB (2MB hugepage) memory. Because there are vmemmap page tables reconstruction on the freeing/allocating path, it increases some overhead. Here are some overhead analysis. 1) Allocating 10240 2MB HugeTLB pages. a) With this patch series applied: # time echo 10240 > /proc/sys/vm/nr_hugepages real 0m0.166s user 0m0.000s sys 0m0.166s # bpftrace -e 'kprobe:alloc_fresh_huge_page { @start[tid] = nsecs; } kretprobe:alloc_fresh_huge_page /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }' Attaching 2 probes... @latency: [8K, 16K) 5476 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [16K, 32K) 4760 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [32K, 64K) 4 | | b) Without this patch series: # time echo 10240 > /proc/sys/vm/nr_hugepages real 0m0.067s user 0m0.000s sys 0m0.067s # bpftrace -e 'kprobe:alloc_fresh_huge_page { @start[tid] = nsecs; } kretprobe:alloc_fresh_huge_page /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }' Attaching 2 probes... @latency: [4K, 8K) 10147 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8K, 16K) 93 | | Summarize: this feature is about ~2x slower than before. 2) Freeing 10240 2MB HugeTLB pages. a) With this patch series applied: # time echo 0 > /proc/sys/vm/nr_hugepages real 0m0.213s user 0m0.000s sys 0m0.213s # bpftrace -e 'kprobe:free_pool_huge_page { @start[tid] = nsecs; } kretprobe:free_pool_huge_page /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }' Attaching 2 probes... @latency: [8K, 16K) 6 | | [16K, 32K) 10227 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [32K, 64K) 7 | | b) Without this patch series: # time echo 0 > /proc/sys/vm/nr_hugepages real 0m0.081s user 0m0.000s sys 0m0.081s # bpftrace -e 'kprobe:free_pool_huge_page { @start[tid] = nsecs; } kretprobe:free_pool_huge_page /@start[tid]/ { @latency = hist(nsecs - @start[tid]); delete(@start[tid]); }' Attaching 2 probes... @latency: [4K, 8K) 6805 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8K, 16K) 3427 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [16K, 32K) 8 | | Summary: The overhead of __free_hugepage is about ~2-3x slower than before. Although the overhead has increased, the overhead is not significant. Like Mike said, "However, remember that the majority of use cases create HugeTLB pages at or shortly after boot time and add them to the pool. So, additional overhead is at pool creation time. There is no change to 'normal run time' operations of getting a page from or returning a page to the pool (think page fault/unmap)". Despite the overhead and in addition to the memory gains from this series. The following data is obtained by Joao Martins. Very thanks to his effort. There's an additional benefit which is page (un)pinners will see an improvement and Joao presumes because there are fewer memmap pages and thus the tail/head pages are staying in cache more often. Out of the box Joao saw (when comparing linux-next against linux-next + this series) with gup_test and pinning a 16G HugeTLB file (with 1G pages): get_user_pages(): ~32k -> ~9k unpin_user_pages(): ~75k -> ~70k Usually any tight loop fetching compound_head(), or reading tail pages data (e.g. compound_head) benefit a lot. There's some unpinning inefficiencies Joao was fixing[2], but with that in added it shows even more: unpin_user_pages(): ~27k -> ~3.8k [1] https://lore.kernel.org/linux-mm/20210409205254.242291-1-mike.kravetz@oracle.com/ [2] https://lore.kernel.org/linux-mm/20210204202500.26474-1-joao.m.martins@oracle.com/ This patch (of 9): Move bootmem info registration common API to individual bootmem_info.c. And we will use {get,put}_page_bootmem() to initialize the page for the vmemmap pages or free the vmemmap pages to buddy in the later patch. So move them out of CONFIG_MEMORY_HOTPLUG_SPARSE. This is just code movement without any functional change. Link: https://lkml.kernel.org/r/20210510030027.56044-1-songmuchun@bytedance.com Link: https://lkml.kernel.org/r/20210510030027.56044-2-songmuchun@bytedance.com Signed-off-by: Muchun Song Acked-by: Mike Kravetz Reviewed-by: Oscar Salvador Reviewed-by: David Hildenbrand Reviewed-by: Miaohe Lin Tested-by: Chen Huang Tested-by: Bodeddula Balasubramaniam Cc: Jonathan Corbet Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: x86@kernel.org Cc: "H. Peter Anvin" Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Alexander Viro Cc: Paul E. McKenney Cc: Pawan Gupta Cc: Randy Dunlap Cc: Oliver Neukum Cc: Anshuman Khandual Cc: Joerg Roedel Cc: Mina Almasry Cc: David Rientjes Cc: Matthew Wilcox Cc: Michal Hocko Cc: Barry Song Cc: HORIGUCHI NAOYA Cc: Joao Martins Cc: Xiongchun Duan Cc: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/sparc/mm/init_64.c | 1 + arch/x86/mm/init_64.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 06e938d03f3b..1b23639e2fcd 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index e527d829e1ed..3aaf1d30c777 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include @@ -1623,7 +1624,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, return err; } -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HAVE_BOOTMEM_INFO_NODE) +#ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE void register_page_bootmem_memmap(unsigned long section_nr, struct page *start_page, unsigned long nr_pages) { -- cgit From 6be24bed9da367c29b04e6fba8c9f27db39aa665 Mon Sep 17 00:00:00 2001 From: Muchun Song Date: Wed, 30 Jun 2021 18:47:04 -0700 Subject: mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP The option HUGETLB_PAGE_FREE_VMEMMAP allows for the freeing of some vmemmap pages associated with pre-allocated HugeTLB pages. For example, on X86_64 6 vmemmap pages of size 4KB each can be saved for each 2MB HugeTLB page. 4094 vmemmap pages of size 4KB each can be saved for each 1GB HugeTLB page. When a HugeTLB page is allocated or freed, the vmemmap array representing the range associated with the page will need to be remapped. When a page is allocated, vmemmap pages are freed after remapping. When a page is freed, previously discarded vmemmap pages must be allocated before remapping. The config option is introduced early so that supporting code can be written to depend on the option. The initial version of the code only provides support for x86-64. If config HAVE_BOOTMEM_INFO_NODE is enabled, the freeing vmemmap page code denpend on it to free vmemmap pages. Otherwise, just use free_reserved_page() to free vmemmmap pages. The routine register_page_bootmem_info() is used to register bootmem info. Therefore, make sure register_page_bootmem_info is enabled if HUGETLB_PAGE_FREE_VMEMMAP is defined. Link: https://lkml.kernel.org/r/20210510030027.56044-3-songmuchun@bytedance.com Signed-off-by: Muchun Song Reviewed-by: Oscar Salvador Acked-by: Mike Kravetz Reviewed-by: Miaohe Lin Tested-by: Chen Huang Tested-by: Bodeddula Balasubramaniam Reviewed-by: Balbir Singh Cc: Alexander Viro Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Barry Song Cc: Borislav Petkov Cc: Dave Hansen Cc: David Hildenbrand Cc: David Rientjes Cc: HORIGUCHI NAOYA Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joao Martins Cc: Joerg Roedel Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: Mina Almasry Cc: Oliver Neukum Cc: Paul E. McKenney Cc: Pawan Gupta Cc: Peter Zijlstra Cc: Randy Dunlap Cc: Thomas Gleixner Cc: Xiongchun Duan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/mm/init_64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 3aaf1d30c777..65ea58527176 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1270,7 +1270,7 @@ static struct kcore_list kcore_vsyscall; static void __init register_page_bootmem_info(void) { -#ifdef CONFIG_NUMA +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP) int i; for_each_online_node(i) -- cgit From e9fdff87e893ec5b7c32836675db80cf691b2a8b Mon Sep 17 00:00:00 2001 From: Muchun Song Date: Wed, 30 Jun 2021 18:47:25 -0700 Subject: mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap Add a kernel parameter hugetlb_free_vmemmap to enable the feature of freeing unused vmemmap pages associated with each hugetlb page on boot. We disable PMD mapping of vmemmap pages for x86-64 arch when this feature is enabled. Because vmemmap_remap_free() depends on vmemmap being base page mapped. Link: https://lkml.kernel.org/r/20210510030027.56044-8-songmuchun@bytedance.com Signed-off-by: Muchun Song Reviewed-by: Oscar Salvador Reviewed-by: Barry Song Reviewed-by: Miaohe Lin Tested-by: Chen Huang Tested-by: Bodeddula Balasubramaniam Reviewed-by: Mike Kravetz Cc: Alexander Viro Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Balbir Singh Cc: Borislav Petkov Cc: Dave Hansen Cc: David Hildenbrand Cc: David Rientjes Cc: HORIGUCHI NAOYA Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joao Martins Cc: Joerg Roedel Cc: Jonathan Corbet Cc: Matthew Wilcox Cc: Michal Hocko Cc: Mina Almasry Cc: Oliver Neukum Cc: Paul E. McKenney Cc: Pawan Gupta Cc: Peter Zijlstra Cc: Randy Dunlap Cc: Thomas Gleixner Cc: Xiongchun Duan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/mm/init_64.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 65ea58527176..9d9d18d0c2a1 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include @@ -1609,7 +1610,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE)); VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE)); - if (end - start < PAGES_PER_SECTION * sizeof(struct page)) + if ((is_hugetlb_free_vmemmap_enabled() && !altmap) || + end - start < PAGES_PER_SECTION * sizeof(struct page)) err = vmemmap_populate_basepages(start, end, node, NULL); else if (boot_cpu_has(X86_FEATURE_PSE)) err = vmemmap_populate_hugepages(start, end, node, altmap); @@ -1637,6 +1639,8 @@ void register_page_bootmem_memmap(unsigned long section_nr, pmd_t *pmd; unsigned int nr_pmd_pages; struct page *page; + bool base_mapping = !boot_cpu_has(X86_FEATURE_PSE) || + is_hugetlb_free_vmemmap_enabled(); for (; addr < end; addr = next) { pte_t *pte = NULL; @@ -1662,7 +1666,7 @@ void register_page_bootmem_memmap(unsigned long section_nr, } get_page_bootmem(section_nr, pud_page(*pud), MIX_SECTION_INFO); - if (!boot_cpu_has(X86_FEATURE_PSE)) { + if (base_mapping) { next = (addr + PAGE_SIZE) & PAGE_MASK; pmd = pmd_offset(pud, addr); if (pmd_none(*pmd)) -- cgit From 79c1c594f49a88fba9744cb5c85978c6b1b365ec Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Wed, 30 Jun 2021 18:48:00 -0700 Subject: mm/hugetlb: change parameters of arch_make_huge_pte() Patch series "Subject: [PATCH v2 0/5] Implement huge VMAP and VMALLOC on powerpc 8xx", v2. This series implements huge VMAP and VMALLOC on powerpc 8xx. Powerpc 8xx has 4 page sizes: - 4k - 16k - 512k - 8M At the time being, vmalloc and vmap only support huge pages which are leaf at PMD level. Here the PMD level is 4M, it doesn't correspond to any supported page size. For now, implement use of 16k and 512k pages which is done at PTE level. Support of 8M pages will be implemented later, it requires use of hugepd tables. To allow this, the architecture provides two functions: - arch_vmap_pte_range_map_size() which tells vmap_pte_range() what page size to use. A stub returning PAGE_SIZE is provided when the architecture doesn't provide this function. - arch_vmap_pte_supported_shift() which tells __vmalloc_node_range() what page shift to use for a given area size. A stub returning PAGE_SHIFT is provided when the architecture doesn't provide this function. This patch (of 5): At the time being, arch_make_huge_pte() has the following prototype: pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, struct page *page, int writable); vma is used to get the pages shift or size. vma is also used on Sparc to get vm_flags. page is not used. writable is not used. In order to use this function without a vma, replace vma by shift and flags. Also remove the used parameters. Link: https://lkml.kernel.org/r/cover.1620795204.git.christophe.leroy@csgroup.eu Link: https://lkml.kernel.org/r/f4633ac6a7da2f22f31a04a89e0a7026bb78b15b.1620795204.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Acked-by: Mike Kravetz Cc: Nicholas Piggin Cc: Mike Kravetz Cc: Mike Rapoport Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Uladzislau Rezki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/include/asm/hugetlb.h | 3 +-- arch/arm64/mm/hugetlbpage.c | 5 ++--- arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h | 5 ++--- arch/sparc/include/asm/pgtable_64.h | 3 +-- arch/sparc/mm/hugetlbpage.c | 6 ++---- 5 files changed, 8 insertions(+), 14 deletions(-) (limited to 'arch') diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h index 5abf91e3494c..1242f71937f8 100644 --- a/arch/arm64/include/asm/hugetlb.h +++ b/arch/arm64/include/asm/hugetlb.h @@ -23,8 +23,7 @@ static inline void arch_clear_hugepage_flags(struct page *page) } #define arch_clear_hugepage_flags arch_clear_hugepage_flags -extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, - struct page *page, int writable); +pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags); #define arch_make_huge_pte arch_make_huge_pte #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 58987a98e179..23505fc35324 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -339,10 +339,9 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return NULL; } -pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, - struct page *page, int writable) +pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags) { - size_t pagesize = huge_page_size(hstate_vma(vma)); + size_t pagesize = 1UL << shift; if (pagesize == CONT_PTE_SIZE) { entry = pte_mkcont(entry); diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h index 39be9aea86db..64b6c608eca4 100644 --- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h @@ -66,10 +66,9 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm, } #ifdef CONFIG_PPC_4K_PAGES -static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, - struct page *page, int writable) +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags) { - size_t size = huge_page_size(hstate_vma(vma)); + size_t size = 1UL << shift; if (size == SZ_16K) return __pte(pte_val(entry) & ~_PAGE_HUGE); diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 550d3904de65..2cd80a0a9795 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -377,8 +377,7 @@ static inline pgprot_t pgprot_noncached(pgprot_t prot) #define pgprot_noncached pgprot_noncached #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE) -extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, - struct page *page, int writable); +pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags); #define arch_make_huge_pte arch_make_huge_pte static inline unsigned long __pte_default_huge_mask(void) { diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index 04d8790f6c32..0f49fada2093 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -177,10 +177,8 @@ static pte_t hugepage_shift_to_tte(pte_t entry, unsigned int shift) return sun4u_hugepage_shift_to_tte(entry, shift); } -pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, - struct page *page, int writeable) +pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags) { - unsigned int shift = huge_page_shift(hstate_vma(vma)); pte_t pte; pte = hugepage_shift_to_tte(entry, shift); @@ -188,7 +186,7 @@ pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, #ifdef CONFIG_SPARC64 /* If this vma has ADI enabled on it, turn on TTE.mcd */ - if (vma->vm_flags & VM_SPARC_ADI) + if (flags & VM_SPARC_ADI) return pte_mkmcd(pte); else return pte_mknotmcd(pte); -- cgit From c742199a014de23ee92055c2473d91fe5561ffdf Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Wed, 30 Jun 2021 18:48:03 -0700 Subject: mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge For architectures with no PMD and/or no PUD, add stubs similar to what we have for architectures without P4D. [christophe.leroy@csgroup.eu: arm64: define only {pud/pmd}_{set/clear}_huge when useful] Link: https://lkml.kernel.org/r/73ec95f40cafbbb69bdfb43a7f53876fd845b0ce.1620990479.git.christophe.leroy@csgroup.eu [christophe.leroy@csgroup.eu: x86: define only {pud/pmd}_{set/clear}_huge when useful] Link: https://lkml.kernel.org/r/7fbf1b6bc3e15c07c24fa45278d57064f14c896b.1620930415.git.christophe.leroy@csgroup.eu Link: https://lkml.kernel.org/r/5ac5976419350e8e048d463a64cae449eb3ba4b0.1620795204.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Cc: Benjamin Herrenschmidt Cc: Michael Ellerman Cc: Mike Kravetz Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Uladzislau Rezki Cc: Naresh Kamboju Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/mm/mmu.c | 20 ++++++++++++-------- arch/x86/mm/pgtable.c | 34 +++++++++++++++++++--------------- 2 files changed, 31 insertions(+), 23 deletions(-) (limited to 'arch') diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 89b66ef43a0f..add67deb4910 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1338,6 +1338,7 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot) return dt_virt; } +#if CONFIG_PGTABLE_LEVELS > 3 int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot) { pud_t new_pud = pfn_pud(__phys_to_pfn(phys), mk_pud_sect_prot(prot)); @@ -1352,6 +1353,16 @@ int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot) return 1; } +int pud_clear_huge(pud_t *pudp) +{ + if (!pud_sect(READ_ONCE(*pudp))) + return 0; + pud_clear(pudp); + return 1; +} +#endif + +#if CONFIG_PGTABLE_LEVELS > 2 int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot) { pmd_t new_pmd = pfn_pmd(__phys_to_pfn(phys), mk_pmd_sect_prot(prot)); @@ -1366,14 +1377,6 @@ int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot) return 1; } -int pud_clear_huge(pud_t *pudp) -{ - if (!pud_sect(READ_ONCE(*pudp))) - return 0; - pud_clear(pudp); - return 1; -} - int pmd_clear_huge(pmd_t *pmdp) { if (!pmd_sect(READ_ONCE(*pmdp))) @@ -1381,6 +1384,7 @@ int pmd_clear_huge(pmd_t *pmdp) pmd_clear(pmdp); return 1; } +#endif int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr) { diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index d27cf69e811d..1303ff6ef7be 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -682,6 +682,7 @@ int p4d_clear_huge(p4d_t *p4d) } #endif +#if CONFIG_PGTABLE_LEVELS > 3 /** * pud_set_huge - setup kernel PUD mapping * @@ -720,6 +721,23 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pud_clear_huge - clear kernel PUD mapping when it is set + * + * Returns 1 on success and 0 on failure (no PUD map is found). + */ +int pud_clear_huge(pud_t *pud) +{ + if (pud_large(*pud)) { + pud_clear(pud); + return 1; + } + + return 0; +} +#endif + +#if CONFIG_PGTABLE_LEVELS > 2 /** * pmd_set_huge - setup kernel PMD mapping * @@ -750,21 +768,6 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) return 1; } -/** - * pud_clear_huge - clear kernel PUD mapping when it is set - * - * Returns 1 on success and 0 on failure (no PUD map is found). - */ -int pud_clear_huge(pud_t *pud) -{ - if (pud_large(*pud)) { - pud_clear(pud); - return 1; - } - - return 0; -} - /** * pmd_clear_huge - clear kernel PMD mapping when it is set * @@ -779,6 +782,7 @@ int pmd_clear_huge(pmd_t *pmd) return 0; } +#endif #ifdef CONFIG_X86_64 /** -- cgit From a6a8f7c4aa7eb50304b5c4e68eccd24313f3a785 Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Wed, 30 Jun 2021 18:48:12 -0700 Subject: powerpc/8xx: add support for huge pages on VMAP and VMALLOC powerpc 8xx has 4 page sizes: - 4k - 16k - 512k - 8M At the time being, vmalloc and vmap only support huge pages which are leaf at PMD level. Here the PMD level is 4M, it doesn't correspond to any supported page size. For now, implement use of 16k and 512k pages which is done at PTE level. Support of 8M pages will be implemented later, it requires vmalloc to support hugepd tables. Link: https://lkml.kernel.org/r/8b972f1c03fb6bd59953035f0a3e4d26659de4f8.1620795204.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy Cc: Benjamin Herrenschmidt Cc: Michael Ellerman Cc: Mike Kravetz Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Uladzislau Rezki Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 ++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 14b132cf95e2..98206f3ebae0 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -187,7 +187,7 @@ config PPC select GENERIC_VDSO_TIME_NS select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP - select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU + select HAVE_ARCH_HUGE_VMAP if PPC_RADIX_MMU || PPC_8xx select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14 diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h index 6e4faa0a9b35..997cec973406 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h @@ -178,6 +178,7 @@ #ifndef __ASSEMBLY__ #include +#include void mmu_pin_tlb(unsigned long top, bool readonly); @@ -225,6 +226,48 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize) BUG(); } +static inline bool arch_vmap_try_size(unsigned long addr, unsigned long end, u64 pfn, + unsigned int max_page_shift, unsigned long size) +{ + if (end - addr < size) + return false; + + if ((1UL << max_page_shift) < size) + return false; + + if (!IS_ALIGNED(addr, size)) + return false; + + if (!IS_ALIGNED(PFN_PHYS(pfn), size)) + return false; + + return true; +} + +static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end, + u64 pfn, unsigned int max_page_shift) +{ + if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_512K)) + return SZ_512K; + if (PAGE_SIZE == SZ_16K) + return SZ_16K; + if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_16K)) + return SZ_16K; + return PAGE_SIZE; +} +#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size + +static inline int arch_vmap_pte_supported_shift(unsigned long size) +{ + if (size >= SZ_512K) + return 19; + else if (size >= SZ_16K) + return 14; + else + return PAGE_SHIFT; +} +#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift + /* patch sites */ extern s32 patch__itlbmiss_exit_1, patch__dtlbmiss_exit_1; extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf; -- cgit From 2d7a21715f25122779e2bed17db8c57aa01e922f Mon Sep 17 00:00:00 2001 From: Muchun Song Date: Wed, 30 Jun 2021 18:48:25 -0700 Subject: mm: sparsemem: use huge PMD mapping for vmemmap pages The preparation of splitting huge PMD mapping of vmemmap pages is ready, so switch the mapping from PTE to PMD. Link: https://lkml.kernel.org/r/20210616094915.34432-3-songmuchun@bytedance.com Signed-off-by: Muchun Song Reviewed-by: Mike Kravetz Cc: Chen Huang Cc: David Hildenbrand Cc: Jonathan Corbet Cc: Michal Hocko Cc: Oscar Salvador Cc: Xiongchun Duan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/mm/init_64.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) (limited to 'arch') diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 9d9d18d0c2a1..65ea58527176 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -34,7 +34,6 @@ #include #include #include -#include #include #include @@ -1610,8 +1609,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE)); VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE)); - if ((is_hugetlb_free_vmemmap_enabled() && !altmap) || - end - start < PAGES_PER_SECTION * sizeof(struct page)) + if (end - start < PAGES_PER_SECTION * sizeof(struct page)) err = vmemmap_populate_basepages(start, end, node, NULL); else if (boot_cpu_has(X86_FEATURE_PSE)) err = vmemmap_populate_hugepages(start, end, node, altmap); @@ -1639,8 +1637,6 @@ void register_page_bootmem_memmap(unsigned long section_nr, pmd_t *pmd; unsigned int nr_pmd_pages; struct page *page; - bool base_mapping = !boot_cpu_has(X86_FEATURE_PSE) || - is_hugetlb_free_vmemmap_enabled(); for (; addr < end; addr = next) { pte_t *pte = NULL; @@ -1666,7 +1662,7 @@ void register_page_bootmem_memmap(unsigned long section_nr, } get_page_bootmem(section_nr, pud_page(*pud), MIX_SECTION_INFO); - if (base_mapping) { + if (!boot_cpu_has(X86_FEATURE_PSE)) { next = (addr + PAGE_SIZE) & PAGE_MASK; pmd = pmd_offset(pud, addr); if (pmd_none(*pmd)) -- cgit From 781eb2cdd26f3748be57da9bed98bbe5b0dd99fb Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Wed, 30 Jun 2021 18:49:57 -0700 Subject: mm/kconfig: move HOLES_IN_ZONE into mm commit a55749639dc1 ("ia64: drop marked broken DISCONTIGMEM and VIRTUAL_MEM_MAP") drop VIRTUAL_MEM_MAP, so there is no need HOLES_IN_ZONE on ia64. Also move HOLES_IN_ZONE into mm/Kconfig, select it if architecture needs this feature. Link: https://lkml.kernel.org/r/20210417075946.181402-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Acked-by: Catalin Marinas [arm64] Cc: Will Deacon Cc: Thomas Bogendoerfer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/Kconfig | 4 +--- arch/ia64/Kconfig | 3 --- arch/mips/Kconfig | 3 --- 3 files changed, 1 insertion(+), 9 deletions(-) (limited to 'arch') diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index d01a1545ab8f..937bdf0c3859 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -201,6 +201,7 @@ config ARM64 select HAVE_KPROBES select HAVE_KRETPROBES select HAVE_GENERIC_VDSO + select HOLES_IN_ZONE select IOMMU_DMA if IOMMU_SUPPORT select IRQ_DOMAIN select IRQ_FORCED_THREADING @@ -1052,9 +1053,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK def_bool y depends on NUMA -config HOLES_IN_ZONE - def_bool y - source "kernel/Kconfig.hz" config ARCH_SPARSEMEM_ENABLE diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index da22a35e6f03..f4ff6bd5fc13 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -308,9 +308,6 @@ config NODES_SHIFT MAX_NUMNODES will be 2^(This value). If in doubt, use the default. -config HOLES_IN_ZONE - bool - config HAVE_ARCH_NODEDATA_EXTENSION def_bool y depends on NUMA diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 4704a16c2e44..df02e36a4a8d 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -1233,9 +1233,6 @@ config HAVE_PLAT_MEMCPY config ISA_DMA_API bool -config HOLES_IN_ZONE - bool - config SYS_SUPPORTS_RELOCATABLE bool help -- cgit From 873ba463914cf484371cba06959d320f9d3121ca Mon Sep 17 00:00:00 2001 From: Mike Rapoport Date: Wed, 30 Jun 2021 18:51:19 -0700 Subject: arm64: decouple check whether pfn is in linear map from pfn_valid() The intended semantics of pfn_valid() is to verify whether there is a struct page for the pfn in question and nothing else. Yet, on arm64 it is used to distinguish memory areas that are mapped in the linear map vs those that require ioremap() to access them. Introduce a dedicated pfn_is_map_memory() wrapper for memblock_is_map_memory() to perform such check and use it where appropriate. Using a wrapper allows to avoid cyclic include dependencies. While here also update style of pfn_valid() so that both pfn_valid() and pfn_is_map_memory() declarations will be consistent. Link: https://lkml.kernel.org/r/20210511100550.28178-4-rppt@kernel.org Signed-off-by: Mike Rapoport Acked-by: David Hildenbrand Acked-by: Ard Biesheuvel Reviewed-by: Kefeng Wang Cc: Anshuman Khandual Cc: Catalin Marinas Cc: Marc Zyngier Cc: Mark Rutland Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/include/asm/memory.h | 2 +- arch/arm64/include/asm/page.h | 3 ++- arch/arm64/kvm/mmu.c | 2 +- arch/arm64/mm/init.c | 12 ++++++++++++ arch/arm64/mm/ioremap.c | 4 ++-- arch/arm64/mm/mmu.c | 2 +- 6 files changed, 19 insertions(+), 6 deletions(-) (limited to 'arch') diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 87b90dc27a43..9027b7e16c4c 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -369,7 +369,7 @@ static inline void *phys_to_virt(phys_addr_t x) #define virt_addr_valid(addr) ({ \ __typeof__(addr) __addr = __tag_reset(addr); \ - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ + __is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr)); \ }) void dump_mem_limit(void); diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 012cffc574e8..75ddfe671393 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -37,7 +37,8 @@ void copy_highpage(struct page *to, struct page *from); typedef struct page *pgtable_t; -extern int pfn_valid(unsigned long); +int pfn_valid(unsigned long pfn); +int pfn_is_map_memory(unsigned long pfn); #include diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 74b3c1a3ff5a..5e0d40f9fb86 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) static bool kvm_is_device_pfn(unsigned long pfn) { - return !pfn_valid(pfn); + return !pfn_is_map_memory(pfn); } static void *stage2_memcache_zalloc_page(void *arg) diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index e55409caaee3..148e752a70f7 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -256,6 +256,18 @@ int pfn_valid(unsigned long pfn) } EXPORT_SYMBOL(pfn_valid); +int pfn_is_map_memory(unsigned long pfn) +{ + phys_addr_t addr = PFN_PHYS(pfn); + + /* avoid false positives for bogus PFNs, see comment in pfn_valid() */ + if (PHYS_PFN(addr) != pfn) + return 0; + + return memblock_is_map_memory(addr); +} +EXPORT_SYMBOL(pfn_is_map_memory); + static phys_addr_t memory_limit = PHYS_ADDR_MAX; /* diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c index b5e83c46b23e..b7c81dacabf0 100644 --- a/arch/arm64/mm/ioremap.c +++ b/arch/arm64/mm/ioremap.c @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size, /* * Don't allow RAM to be mapped. */ - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr)))) + if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr)))) return NULL; area = get_vm_area_caller(size, VM_IOREMAP, caller); @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap); void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size) { /* For normal memory we already have a cacheable mapping. */ - if (pfn_valid(__phys_to_pfn(phys_addr))) + if (pfn_is_map_memory(__phys_to_pfn(phys_addr))) return (void __iomem *)__phys_to_virt(phys_addr); return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL), diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index add67deb4910..7553a7eab3db 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -82,7 +82,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, unsigned long size, pgprot_t vma_prot) { - if (!pfn_valid(pfn)) + if (!pfn_is_map_memory(pfn)) return pgprot_noncached(vma_prot); else if (file->f_flags & O_SYNC) return pgprot_writecombine(vma_prot); -- cgit From a7d9f306ba7052056edf9ccae596aeb400226af8 Mon Sep 17 00:00:00 2001 From: Mike Rapoport Date: Wed, 30 Jun 2021 18:51:22 -0700 Subject: arm64: drop pfn_valid_within() and simplify pfn_valid() The arm64's version of pfn_valid() differs from the generic because of two reasons: * Parts of the memory map are freed during boot. This makes it necessary to verify that there is actual physical memory that corresponds to a pfn which is done by querying memblock. * There are NOMAP memory regions. These regions are not mapped in the linear map and until the previous commit the struct pages representing these areas had default values. As the consequence of absence of the special treatment of NOMAP regions in the memory map it was necessary to use memblock_is_map_memory() in pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that generic mm functionality would not treat a NOMAP page as a normal page. Since the NOMAP regions are now marked as PageReserved(), pfn walkers and the rest of core mm will treat them as unusable memory and thus pfn_valid_within() is no longer required at all and can be disabled on arm64. pfn_valid() can be slightly simplified by replacing memblock_is_map_memory() with memblock_is_memory(). [rppt@kernel.org: fix merge fix] Link: https://lkml.kernel.org/r/YJtoQhidtIJOhYsV@kernel.org Link: https://lkml.kernel.org/r/20210511100550.28178-5-rppt@kernel.org Signed-off-by: Mike Rapoport Acked-by: David Hildenbrand Acked-by: Ard Biesheuvel Reviewed-by: Kefeng Wang Cc: Anshuman Khandual Cc: Catalin Marinas Cc: Marc Zyngier Cc: Mark Rutland Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/Kconfig | 1 - arch/arm64/mm/init.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 937bdf0c3859..59c9255ab9d0 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -201,7 +201,6 @@ config ARM64 select HAVE_KPROBES select HAVE_KRETPROBES select HAVE_GENERIC_VDSO - select HOLES_IN_ZONE select IOMMU_DMA if IOMMU_SUPPORT select IRQ_DOMAIN select IRQ_FORCED_THREADING diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 148e752a70f7..725aa84f2faa 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -252,7 +252,7 @@ int pfn_valid(unsigned long pfn) if (!early_section(ms)) return pfn_section_valid(ms, pfn); - return memblock_is_map_memory(addr); + return memblock_is_memory(addr); } EXPORT_SYMBOL(pfn_valid); -- cgit From 16c9afc776608324ca71c0bc354987bab532f51d Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Wed, 30 Jun 2021 18:51:26 -0700 Subject: arm64/mm: drop HAVE_ARCH_PFN_VALID CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64 platforms and free_unused_memmap() would just return without creating any holes in the memmap mapping. There is no need for any special handling in pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves the pfn upper bits sanity check into generic pfn_valid(). Link: https://lkml.kernel.org/r/1621947349-25421-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Acked-by: David Hildenbrand Acked-by: Mike Rapoport Cc: Catalin Marinas Cc: Will Deacon Cc: David Hildenbrand Cc: Mike Rapoport Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/arm64/Kconfig | 1 - arch/arm64/include/asm/page.h | 1 - arch/arm64/mm/init.c | 37 ------------------------------------- 3 files changed, 39 deletions(-) (limited to 'arch') diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 59c9255ab9d0..84f0216d3c5c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -154,7 +154,6 @@ config ARM64 select HAVE_ARCH_KGDB select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT - select HAVE_ARCH_PFN_VALID select HAVE_ARCH_PREL32_RELOCATIONS select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET select HAVE_ARCH_SECCOMP_FILTER diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 75ddfe671393..fcbef3eec4b2 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from); typedef struct page *pgtable_t; -int pfn_valid(unsigned long pfn); int pfn_is_map_memory(unsigned long pfn); #include diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 725aa84f2faa..49019ea0c8a8 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max) free_area_init(max_zone_pfns); } -int pfn_valid(unsigned long pfn) -{ - phys_addr_t addr = PFN_PHYS(pfn); - struct mem_section *ms; - - /* - * Ensure the upper PAGE_SHIFT bits are clear in the - * pfn. Else it might lead to false positives when - * some of the upper bits are set, but the lower bits - * match a valid pfn. - */ - if (PHYS_PFN(addr) != pfn) - return 0; - - if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) - return 0; - - ms = __pfn_to_section(pfn); - if (!valid_section(ms)) - return 0; - - /* - * ZONE_DEVICE memory does not have the memblock entries. - * memblock_is_map_memory() check for ZONE_DEVICE based - * addresses will always fail. Even the normal hotplugged - * memory will never have MEMBLOCK_NOMAP flag set in their - * memblock entries. Skip memblock search for all non early - * memory sections covering all of hotplug memory including - * both normal and ZONE_DEVICE based. - */ - if (!early_section(ms)) - return pfn_section_valid(ms, pfn); - - return memblock_is_memory(addr); -} -EXPORT_SYMBOL(pfn_valid); - int pfn_is_map_memory(unsigned long pfn) { phys_addr_t addr = PFN_PHYS(pfn); -- cgit From cebc774fdc9cb39b959968fbfd7aabe7a8a5154c Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Wed, 30 Jun 2021 18:51:58 -0700 Subject: mm/thp: make ARCH_ENABLE_SPLIT_PMD_PTLOCK dependent on PGTABLE_LEVELS > 2 ARCH_ENABLE_SPLIT_PMD_PTLOCK is irrelevant unless there are more than two page table levels including PMD (also per Documentation/vm/split_page_table_lock.rst). Make this dependency explicit on remaining platforms i.e x86 and s390 where ARCH_ENABLE_SPLIT_PMD_PTLOCK is subscribed. Link: https://lkml.kernel.org/r/1622013501-20409-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Acked-by: Gerald Schaefer # s390 Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Thomas Gleixner Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/s390/Kconfig | 2 +- arch/x86/Kconfig | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 707afbcd81c2..4098637a52fc 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -62,7 +62,7 @@ config S390 select ARCH_BINFMT_ELF_STATE select ARCH_ENABLE_MEMORY_HOTPLUG if SPARSEMEM select ARCH_ENABLE_MEMORY_HOTREMOVE - select ARCH_ENABLE_SPLIT_PMD_PTLOCK + select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2 select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEBUG_WX select ARCH_HAS_DEVMEM_IS_ALLOWED diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5d523ff70fe7..8c5d86655180 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -63,7 +63,7 @@ config X86 select ARCH_ENABLE_HUGEPAGE_MIGRATION if X86_64 && HUGETLB_PAGE && MIGRATION select ARCH_ENABLE_MEMORY_HOTPLUG if X86_64 || (X86_32 && HIGHMEM) select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG - select ARCH_ENABLE_SPLIT_PMD_PTLOCK if X86_64 || X86_PAE + select ARCH_ENABLE_SPLIT_PMD_PTLOCK if (PGTABLE_LEVELS > 2) && (X86_64 || X86_PAE) select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI select ARCH_HAS_CACHE_LINE_SIZE -- cgit From 63703f37aa09e2c12c0ff25afbf5c460b21bfe4c Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Wed, 30 Jun 2021 18:52:20 -0700 Subject: mm: generalize ZONE_[DMA|DMA32] ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that subscribe to them. Instead, just make them generic options which can be selected on applicable platforms. Also only x86/arm64 architectures could enable both ZONE_DMA and ZONE_DMA32 if EXPERT, add ARCH_HAS_ZONE_DMA_SET to make dma zone configurable and visible on the two architectures. Link: https://lkml.kernel.org/r/20210528074557.17768-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Acked-by: Catalin Marinas [arm64] Acked-by: Geert Uytterhoeven [m68k] Acked-by: Mike Rapoport Acked-by: Palmer Dabbelt [RISC-V] Acked-by: Michal Simek [microblaze] Acked-by: Michael Ellerman [powerpc] Cc: Catalin Marinas Cc: Will Deacon Cc: Geert Uytterhoeven Cc: Thomas Bogendoerfer Cc: "David S. Miller" Cc: Ingo Molnar Cc: Borislav Petkov Cc: Richard Henderson Cc: Russell King Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/alpha/Kconfig | 5 +---- arch/arm/Kconfig | 3 --- arch/arm64/Kconfig | 9 +-------- arch/ia64/Kconfig | 4 +--- arch/m68k/Kconfig | 5 +---- arch/microblaze/Kconfig | 4 +--- arch/mips/Kconfig | 7 ------- arch/powerpc/Kconfig | 4 ---- arch/powerpc/platforms/Kconfig.cputype | 1 + arch/riscv/Kconfig | 5 +---- arch/s390/Kconfig | 4 +--- arch/sparc/Kconfig | 5 +---- arch/x86/Kconfig | 15 ++------------- 13 files changed, 11 insertions(+), 60 deletions(-) (limited to 'arch') diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig index 8954216b9956..77d3280dc678 100644 --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -40,6 +40,7 @@ config ALPHA select MMU_GATHER_NO_RANGE select SET_FS select SPARSEMEM_EXTREME if SPARSEMEM + select ZONE_DMA help The Alpha is a 64-bit general-purpose processor designed and marketed by the Digital Equipment Corporation of blessed memory, @@ -65,10 +66,6 @@ config GENERIC_CALIBRATE_DELAY bool default y -config ZONE_DMA - bool - default y - config GENERIC_ISA_DMA bool default y diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 24804f11302d..000c3f80b58e 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -218,9 +218,6 @@ config GENERIC_CALIBRATE_DELAY config ARCH_MAY_HAVE_PC_FDC bool -config ZONE_DMA - bool - config ARCH_SUPPORTS_UPROBES def_bool y diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 84f0216d3c5c..39b378ee56ce 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -42,6 +42,7 @@ config ARM64 select ARCH_HAS_SYSCALL_WRAPPER select ARCH_HAS_TEARDOWN_DMA_OPS if IOMMU_SUPPORT select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST + select ARCH_HAS_ZONE_DMA_SET if EXPERT select ARCH_HAVE_ELF_PROT select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_INLINE_READ_LOCK if !PREEMPTION @@ -306,14 +307,6 @@ config GENERIC_CSUM config GENERIC_CALIBRATE_DELAY def_bool y -config ZONE_DMA - bool "Support DMA zone" if EXPERT - default y - -config ZONE_DMA32 - bool "Support DMA32 zone" if EXPERT - default y - config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE def_bool y diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index f4ff6bd5fc13..cf425c2c63af 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -60,6 +60,7 @@ config IA64 select NUMA if !FLATMEM select PCI_MSI_ARCH_FALLBACKS if PCI_MSI select SET_FS + select ZONE_DMA32 default y help The Itanium Processor Family is Intel's 64-bit successor to @@ -72,9 +73,6 @@ config 64BIT select ATA_NONSTANDARD if ATA default y -config ZONE_DMA32 - def_bool y - config MMU bool default y diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig index 372e4e69c43a..05a729c6ad7f 100644 --- a/arch/m68k/Kconfig +++ b/arch/m68k/Kconfig @@ -34,6 +34,7 @@ config M68K select SET_FS select UACCESS_MEMCPY if !MMU select VIRT_TO_BUS + select ZONE_DMA config CPU_BIG_ENDIAN def_bool y @@ -62,10 +63,6 @@ config TIME_LOW_RES config NO_IOPORT_MAP def_bool y -config ZONE_DMA - bool - default y - config HZ int default 1000 if CLEOPATRA diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig index 0660f47012bc..14a67a42fcae 100644 --- a/arch/microblaze/Kconfig +++ b/arch/microblaze/Kconfig @@ -43,6 +43,7 @@ config MICROBLAZE select MMU_GATHER_NO_RANGE select SPARSE_IRQ select SET_FS + select ZONE_DMA # Endianness selection choice @@ -60,9 +61,6 @@ config CPU_LITTLE_ENDIAN endchoice -config ZONE_DMA - def_bool y - config ARCH_HAS_ILOG2_U32 def_bool n diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index df02e36a4a8d..d69f4b17d0a6 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -3274,13 +3274,6 @@ config I8253 select CLKSRC_I8253 select CLKEVT_I8253 select MIPS_EXTERNAL_TIMER - -config ZONE_DMA - bool - -config ZONE_DMA32 - bool - endmenu config TRAD_SIGNALS diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 98206f3ebae0..df46324d5090 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -403,10 +403,6 @@ config PPC_ADV_DEBUG_DAC_RANGE config PPC_DAWR bool -config ZONE_DMA - bool - default y if PPC_BOOK3E_64 - config PGTABLE_LEVELS int default 2 if !PPC64 diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index f998e655b570..7d271de8fcbd 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -111,6 +111,7 @@ config PPC_BOOK3E_64 select PPC_FPU # Make it a choice ? select PPC_SMP_MUXED_IPI select PPC_DOORBELL + select ZONE_DMA endchoice diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 15f9490a7aad..469a70bd8da6 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -104,6 +104,7 @@ config RISCV select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select UACCESS_MEMCPY if !MMU + select ZONE_DMA32 if 64BIT config ARCH_MMAP_RND_BITS_MIN default 18 if 64BIT @@ -133,10 +134,6 @@ config MMU Select if you want MMU-based virtualised addressing space support by paged memory management. If unsure, say 'Y'. -config ZONE_DMA32 - bool - default y if 64BIT - config VA_BITS int default 32 if 32BIT diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 4098637a52fc..9e14acacb884 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -2,9 +2,6 @@ config MMU def_bool y -config ZONE_DMA - def_bool y - config CPU_BIG_ENDIAN def_bool y @@ -210,6 +207,7 @@ config S390 select THREAD_INFO_IN_TASK select TTY select VIRT_CPU_ACCOUNTING + select ZONE_DMA # Note: keep the above list sorted alphabetically config SCHED_OMIT_FRAME_POINTER diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index c72f52c704cd..c5fa7932b550 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -59,6 +59,7 @@ config SPARC32 select CLZ_TAB select HAVE_UID16 select OLD_SIGACTION + select ZONE_DMA config SPARC64 def_bool 64BIT @@ -141,10 +142,6 @@ config HIGHMEM default y if SPARC32 select KMAP_LOCAL -config ZONE_DMA - bool - default y if SPARC32 - config GENERIC_ISA_DMA bool default y if SPARC32 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8c5d86655180..5f875112f086 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -33,6 +33,7 @@ config X86_64 select NEED_DMA_MAP_STATE select SWIOTLB select ARCH_HAS_ELFCORE_COMPAT + select ZONE_DMA32 config FORCE_DYNAMIC_FTRACE def_bool y @@ -93,6 +94,7 @@ config X86 select ARCH_HAS_SYSCALL_WRAPPER select ARCH_HAS_UBSAN_SANITIZE_ALL select ARCH_HAS_DEBUG_WX + select ARCH_HAS_ZONE_DMA_SET if EXPERT select ARCH_HAVE_NMI_SAFE_CMPXCHG select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI select ARCH_MIGHT_HAVE_PC_PARPORT @@ -343,9 +345,6 @@ config ARCH_SUSPEND_POSSIBLE config ARCH_WANT_GENERAL_HUGETLB def_bool y -config ZONE_DMA32 - def_bool y if X86_64 - config AUDIT_ARCH def_bool y if X86_64 @@ -393,16 +392,6 @@ config CC_HAS_SANE_STACKPROTECTOR menu "Processor type and features" -config ZONE_DMA - bool "DMA memory allocation support" if EXPERT - default y - help - DMA memory allocation support allows devices with less than 32-bit - addressing to allocate within the first 16MB of address space. - Disable if no such devices will be used. - - If unsure, say Y. - config SMP bool "Symmetric multi-processing support" help -- cgit From 4ca9b3859dac14bbef0c27d00667bb5b10917adb Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 30 Jun 2021 18:52:28 -0700 Subject: mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables I. Background: Sparse Memory Mappings When we manage sparse memory mappings dynamically in user space - also sometimes involving MAP_NORESERVE - we want to dynamically populate/ discard memory inside such a sparse memory region. Example users are hypervisors (especially implementing memory ballooning or similar technologies like virtio-mem) and memory allocators. In addition, we want to fail in a nice way (instead of generating SIGBUS) if populating does not succeed because we are out of backend memory (which can happen easily with file-based mappings, especially tmpfs and hugetlbfs). While MADV_DONTNEED, MADV_REMOVE and FALLOC_FL_PUNCH_HOLE allow for reliably discarding memory for most mapping types, there is no generic approach to populate page tables and preallocate memory. Although mmap() supports MAP_POPULATE, it is not applicable to the concept of sparse memory mappings, where we want to populate/discard dynamically and avoid expensive/problematic remappings. In addition, we never actually report errors during the final populate phase - it is best-effort only. fallocate() can be used to preallocate file-based memory and fail in a safe way. However, it cannot really be used for any private mappings on anonymous files via memfd due to COW semantics. In addition, fallocate() does not actually populate page tables, so we still always get pagefaults on first access - which is sometimes undesired (i.e., real-time workloads) and requires real prefaulting of page tables, not just a preallocation of backend storage. There might be interesting use cases for sparse memory regions along with mlockall(MCL_ONFAULT) which fallocate() cannot satisfy as it does not prefault page tables. II. On preallcoation/prefaulting from user space Because we don't have a proper interface, what applications (like QEMU and databases) end up doing is touching (i.e., reading+writing one byte to not overwrite existing data) all individual pages. However, that approach 1) Can result in wear on storage backing, because we end up reading/writing each page; this is especially a problem for dax/pmem. 2) Can result in mmap_sem contention when prefaulting via multiple threads. 3) Requires expensive signal handling, especially to catch SIGBUS in case of hugetlbfs/shmem/file-backed memory. For example, this is problematic in hypervisors like QEMU where SIGBUS handlers might already be used by other subsystems concurrently to e.g, handle hardware errors. "Simply" doing preallocation concurrently from other thread is not that easy. III. On MADV_WILLNEED Extending MADV_WILLNEED is not an option because 1. It would change the semantics: "Expect access in the near future." and "might be a good idea to read some pages" vs. "Definitely populate/ preallocate all memory and definitely fail on errors.". 2. Existing users (like virtio-balloon in QEMU when deflating the balloon) don't want populate/prealloc semantics. They treat this rather as a hint to give a little performance boost without too much overhead - and don't expect that a lot of memory might get consumed or a lot of time might be spent. IV. MADV_POPULATE_READ and MADV_POPULATE_WRITE Let's introduce MADV_POPULATE_READ and MADV_POPULATE_WRITE, inspired by MAP_POPULATE, with the following semantics: 1. MADV_POPULATE_READ can be used to prefault page tables just like manually reading each individual page. This will not break any COW mappings. The shared zero page might get mapped and no backend storage might get preallocated -- allocation might be deferred to write-fault time. Especially shared file mappings require an explicit fallocate() upfront to actually preallocate backend memory (blocks in the file system) in case the file might have holes. 2. If MADV_POPULATE_READ succeeds, all page tables have been populated (prefaulted) readable once. 3. MADV_POPULATE_WRITE can be used to preallocate backend memory and prefault page tables just like manually writing (or reading+writing) each individual page. This will break any COW mappings -- e.g., the shared zeropage is never populated. 4. If MADV_POPULATE_WRITE succeeds, all page tables have been populated (prefaulted) writable once. 5. MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot be applied to special mappings marked with VM_PFNMAP and VM_IO. Also, proper access permissions (e.g., PROT_READ, PROT_WRITE) are required. If any such mapping is encountered, madvise() fails with -EINVAL. 6. If MADV_POPULATE_READ or MADV_POPULATE_WRITE fails, some page tables might have been populated. 7. MADV_POPULATE_READ and MADV_POPULATE_WRITE will return -EHWPOISON when encountering a HW poisoned page in the range. 8. Similar to MAP_POPULATE, MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot protect from the OOM (Out Of Memory) handler killing the process. While the use case for MADV_POPULATE_WRITE is fairly obvious (i.e., preallocate memory and prefault page tables for VMs), one issue is that whenever we prefault pages writable, the pages have to be marked dirty, because the CPU could dirty them any time. while not a real problem for hugetlbfs or dax/pmem, it can be a problem for shared file mappings: each page will be marked dirty and has to be written back later when evicting. MADV_POPULATE_READ allows for optimizing this scenario: Pre-read a whole mapping from backend storage without marking it dirty, such that eviction won't have to write it back. As discussed above, shared file mappings might require an explciit fallocate() upfront to achieve preallcoation+prepopulation. Although sparse memory mappings are the primary use case, this will also be useful for other preallocate/prefault use cases where MAP_POPULATE is not desired or the semantics of MAP_POPULATE are not sufficient: as one example, QEMU users can trigger preallocation/prefaulting of guest RAM after the mapping was created -- and don't want errors to be silently suppressed. Looking at the history, MADV_POPULATE was already proposed in 2013 [1], however, the main motivation back than was performance improvements -- which should also still be the case. V. Single-threaded performance comparison I did a short experiment, prefaulting page tables on completely *empty mappings/files* and repeated the experiment 10 times. The results correspond to the shortest execution time. In general, the performance benefit for huge pages is negligible with small mappings. V.1: Private mappings POPULATE_READ and POPULATE_WRITE is fastest. Note that Reading/POPULATE_READ will populate the shared zeropage where applicable -- which result in short population times. The fastest way to allocate backend storage (here: swap or huge pages) and prefault page tables is POPULATE_WRITE. V.2: Shared mappings fallocate() is fastest, however, doesn't prefault page tables. POPULATE_WRITE is faster than simple writes and read/writes. POPULATE_READ is faster than simple reads. Without a fd, the fastest way to allocate backend storage and prefault page tables is POPULATE_WRITE. With an fd, the fastest way is usually FALLOCATE+POPULATE_READ or FALLOCATE+POPULATE_WRITE respectively; one exception are actual files: FALLOCATE+Read is slightly faster than FALLOCATE+POPULATE_READ. The fastest way to allocate backend storage prefault page tables is FALLOCATE+POPULATE_WRITE -- except when dealing with actual files; then, FALLOCATE+POPULATE_READ is fastest and won't directly mark all pages as dirty. v.3: Detailed results ================================================== 2 MiB MAP_PRIVATE: ************************************************** Anon 4 KiB : Read : 0.119 ms Anon 4 KiB : Write : 0.222 ms Anon 4 KiB : Read/Write : 0.380 ms Anon 4 KiB : POPULATE_READ : 0.060 ms Anon 4 KiB : POPULATE_WRITE : 0.158 ms Memfd 4 KiB : Read : 0.034 ms Memfd 4 KiB : Write : 0.310 ms Memfd 4 KiB : Read/Write : 0.362 ms Memfd 4 KiB : POPULATE_READ : 0.039 ms Memfd 4 KiB : POPULATE_WRITE : 0.229 ms Memfd 2 MiB : Read : 0.030 ms Memfd 2 MiB : Write : 0.030 ms Memfd 2 MiB : Read/Write : 0.030 ms Memfd 2 MiB : POPULATE_READ : 0.030 ms Memfd 2 MiB : POPULATE_WRITE : 0.030 ms tmpfs : Read : 0.033 ms tmpfs : Write : 0.313 ms tmpfs : Read/Write : 0.406 ms tmpfs : POPULATE_READ : 0.039 ms tmpfs : POPULATE_WRITE : 0.285 ms file : Read : 0.033 ms file : Write : 0.351 ms file : Read/Write : 0.408 ms file : POPULATE_READ : 0.039 ms file : POPULATE_WRITE : 0.290 ms hugetlbfs : Read : 0.030 ms hugetlbfs : Write : 0.030 ms hugetlbfs : Read/Write : 0.030 ms hugetlbfs : POPULATE_READ : 0.030 ms hugetlbfs : POPULATE_WRITE : 0.030 ms ************************************************** 4096 MiB MAP_PRIVATE: ************************************************** Anon 4 KiB : Read : 237.940 ms Anon 4 KiB : Write : 708.409 ms Anon 4 KiB : Read/Write : 1054.041 ms Anon 4 KiB : POPULATE_READ : 124.310 ms Anon 4 KiB : POPULATE_WRITE : 572.582 ms Memfd 4 KiB : Read : 136.928 ms Memfd 4 KiB : Write : 963.898 ms Memfd 4 KiB : Read/Write : 1106.561 ms Memfd 4 KiB : POPULATE_READ : 78.450 ms Memfd 4 KiB : POPULATE_WRITE : 805.881 ms Memfd 2 MiB : Read : 357.116 ms Memfd 2 MiB : Write : 357.210 ms Memfd 2 MiB : Read/Write : 357.606 ms Memfd 2 MiB : POPULATE_READ : 356.094 ms Memfd 2 MiB : POPULATE_WRITE : 356.937 ms tmpfs : Read : 137.536 ms tmpfs : Write : 954.362 ms tmpfs : Read/Write : 1105.954 ms tmpfs : POPULATE_READ : 80.289 ms tmpfs : POPULATE_WRITE : 822.826 ms file : Read : 137.874 ms file : Write : 987.025 ms file : Read/Write : 1107.439 ms file : POPULATE_READ : 80.413 ms file : POPULATE_WRITE : 857.622 ms hugetlbfs : Read : 355.607 ms hugetlbfs : Write : 355.729 ms hugetlbfs : Read/Write : 356.127 ms hugetlbfs : POPULATE_READ : 354.585 ms hugetlbfs : POPULATE_WRITE : 355.138 ms ************************************************** 2 MiB MAP_SHARED: ************************************************** Anon 4 KiB : Read : 0.394 ms Anon 4 KiB : Write : 0.348 ms Anon 4 KiB : Read/Write : 0.400 ms Anon 4 KiB : POPULATE_READ : 0.326 ms Anon 4 KiB : POPULATE_WRITE : 0.273 ms Anon 2 MiB : Read : 0.030 ms Anon 2 MiB : Write : 0.030 ms Anon 2 MiB : Read/Write : 0.030 ms Anon 2 MiB : POPULATE_READ : 0.030 ms Anon 2 MiB : POPULATE_WRITE : 0.030 ms Memfd 4 KiB : Read : 0.412 ms Memfd 4 KiB : Write : 0.372 ms Memfd 4 KiB : Read/Write : 0.419 ms Memfd 4 KiB : POPULATE_READ : 0.343 ms Memfd 4 KiB : POPULATE_WRITE : 0.288 ms Memfd 4 KiB : FALLOCATE : 0.137 ms Memfd 4 KiB : FALLOCATE+Read : 0.446 ms Memfd 4 KiB : FALLOCATE+Write : 0.330 ms Memfd 4 KiB : FALLOCATE+Read/Write : 0.454 ms Memfd 4 KiB : FALLOCATE+POPULATE_READ : 0.379 ms Memfd 4 KiB : FALLOCATE+POPULATE_WRITE : 0.268 ms Memfd 2 MiB : Read : 0.030 ms Memfd 2 MiB : Write : 0.030 ms Memfd 2 MiB : Read/Write : 0.030 ms Memfd 2 MiB : POPULATE_READ : 0.030 ms Memfd 2 MiB : POPULATE_WRITE : 0.030 ms Memfd 2 MiB : FALLOCATE : 0.030 ms Memfd 2 MiB : FALLOCATE+Read : 0.031 ms Memfd 2 MiB : FALLOCATE+Write : 0.031 ms Memfd 2 MiB : FALLOCATE+Read/Write : 0.031 ms Memfd 2 MiB : FALLOCATE+POPULATE_READ : 0.030 ms Memfd 2 MiB : FALLOCATE+POPULATE_WRITE : 0.030 ms tmpfs : Read : 0.416 ms tmpfs : Write : 0.369 ms tmpfs : Read/Write : 0.425 ms tmpfs : POPULATE_READ : 0.346 ms tmpfs : POPULATE_WRITE : 0.295 ms tmpfs : FALLOCATE : 0.139 ms tmpfs : FALLOCATE+Read : 0.447 ms tmpfs : FALLOCATE+Write : 0.333 ms tmpfs : FALLOCATE+Read/Write : 0.454 ms tmpfs : FALLOCATE+POPULATE_READ : 0.380 ms tmpfs : FALLOCATE+POPULATE_WRITE : 0.272 ms file : Read : 0.191 ms file : Write : 0.511 ms file : Read/Write : 0.524 ms file : POPULATE_READ : 0.196 ms file : POPULATE_WRITE : 0.434 ms file : FALLOCATE : 0.004 ms file : FALLOCATE+Read : 0.197 ms file : FALLOCATE+Write : 0.554 ms file : FALLOCATE+Read/Write : 0.480 ms file : FALLOCATE+POPULATE_READ : 0.201 ms file : FALLOCATE+POPULATE_WRITE : 0.381 ms hugetlbfs : Read : 0.030 ms hugetlbfs : Write : 0.030 ms hugetlbfs : Read/Write : 0.030 ms hugetlbfs : POPULATE_READ : 0.030 ms hugetlbfs : POPULATE_WRITE : 0.030 ms hugetlbfs : FALLOCATE : 0.030 ms hugetlbfs : FALLOCATE+Read : 0.031 ms hugetlbfs : FALLOCATE+Write : 0.031 ms hugetlbfs : FALLOCATE+Read/Write : 0.030 ms hugetlbfs : FALLOCATE+POPULATE_READ : 0.030 ms hugetlbfs : FALLOCATE+POPULATE_WRITE : 0.030 ms ************************************************** 4096 MiB MAP_SHARED: ************************************************** Anon 4 KiB : Read : 1053.090 ms Anon 4 KiB : Write : 913.642 ms Anon 4 KiB : Read/Write : 1060.350 ms Anon 4 KiB : POPULATE_READ : 893.691 ms Anon 4 KiB : POPULATE_WRITE : 782.885 ms Anon 2 MiB : Read : 358.553 ms Anon 2 MiB : Write : 358.419 ms Anon 2 MiB : Read/Write : 357.992 ms Anon 2 MiB : POPULATE_READ : 357.533 ms Anon 2 MiB : POPULATE_WRITE : 357.808 ms Memfd 4 KiB : Read : 1078.144 ms Memfd 4 KiB : Write : 942.036 ms Memfd 4 KiB : Read/Write : 1100.391 ms Memfd 4 KiB : POPULATE_READ : 925.829 ms Memfd 4 KiB : POPULATE_WRITE : 804.394 ms Memfd 4 KiB : FALLOCATE : 304.632 ms Memfd 4 KiB : FALLOCATE+Read : 1163.359 ms Memfd 4 KiB : FALLOCATE+Write : 933.186 ms Memfd 4 KiB : FALLOCATE+Read/Write : 1187.304 ms Memfd 4 KiB : FALLOCATE+POPULATE_READ : 1013.660 ms Memfd 4 KiB : FALLOCATE+POPULATE_WRITE : 794.560 ms Memfd 2 MiB : Read : 358.131 ms Memfd 2 MiB : Write : 358.099 ms Memfd 2 MiB : Read/Write : 358.250 ms Memfd 2 MiB : POPULATE_READ : 357.563 ms Memfd 2 MiB : POPULATE_WRITE : 357.334 ms Memfd 2 MiB : FALLOCATE : 356.735 ms Memfd 2 MiB : FALLOCATE+Read : 358.152 ms Memfd 2 MiB : FALLOCATE+Write : 358.331 ms Memfd 2 MiB : FALLOCATE+Read/Write : 358.018 ms Memfd 2 MiB : FALLOCATE+POPULATE_READ : 357.286 ms Memfd 2 MiB : FALLOCATE+POPULATE_WRITE : 357.523 ms tmpfs : Read : 1087.265 ms tmpfs : Write : 950.840 ms tmpfs : Read/Write : 1107.567 ms tmpfs : POPULATE_READ : 922.605 ms tmpfs : POPULATE_WRITE : 810.094 ms tmpfs : FALLOCATE : 306.320 ms tmpfs : FALLOCATE+Read : 1169.796 ms tmpfs : FALLOCATE+Write : 933.730 ms tmpfs : FALLOCATE+Read/Write : 1191.610 ms tmpfs : FALLOCATE+POPULATE_READ : 1020.474 ms tmpfs : FALLOCATE+POPULATE_WRITE : 798.945 ms file : Read : 654.101 ms file : Write : 1259.142 ms file : Read/Write : 1289.509 ms file : POPULATE_READ : 661.642 ms file : POPULATE_WRITE : 1106.816 ms file : FALLOCATE : 1.864 ms file : FALLOCATE+Read : 656.328 ms file : FALLOCATE+Write : 1153.300 ms file : FALLOCATE+Read/Write : 1180.613 ms file : FALLOCATE+POPULATE_READ : 668.347 ms file : FALLOCATE+POPULATE_WRITE : 996.143 ms hugetlbfs : Read : 357.245 ms hugetlbfs : Write : 357.413 ms hugetlbfs : Read/Write : 357.120 ms hugetlbfs : POPULATE_READ : 356.321 ms hugetlbfs : POPULATE_WRITE : 356.693 ms hugetlbfs : FALLOCATE : 355.927 ms hugetlbfs : FALLOCATE+Read : 357.074 ms hugetlbfs : FALLOCATE+Write : 357.120 ms hugetlbfs : FALLOCATE+Read/Write : 356.983 ms hugetlbfs : FALLOCATE+POPULATE_READ : 356.413 ms hugetlbfs : FALLOCATE+POPULATE_WRITE : 356.266 ms ************************************************** [1] https://lkml.org/lkml/2013/6/27/698 [akpm@linux-foundation.org: coding style fixes] Link: https://lkml.kernel.org/r/20210419135443.12822-3-david@redhat.com Signed-off-by: David Hildenbrand Cc: Arnd Bergmann Cc: Michal Hocko Cc: Oscar Salvador Cc: Matthew Wilcox (Oracle) Cc: Andrea Arcangeli Cc: Minchan Kim Cc: Jann Horn Cc: Jason Gunthorpe Cc: Dave Hansen Cc: Hugh Dickins Cc: Rik van Riel Cc: Michael S. Tsirkin Cc: Kirill A. Shutemov Cc: Vlastimil Babka Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Thomas Bogendoerfer Cc: "James E.J. Bottomley" Cc: Helge Deller Cc: Chris Zankel Cc: Max Filippov Cc: Mike Kravetz Cc: Peter Xu Cc: Rolf Eike Beer Cc: Ram Pai Cc: Shuah Khan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/alpha/include/uapi/asm/mman.h | 3 +++ arch/mips/include/uapi/asm/mman.h | 3 +++ arch/parisc/include/uapi/asm/mman.h | 3 +++ arch/xtensa/include/uapi/asm/mman.h | 3 +++ 4 files changed, 12 insertions(+) (limited to 'arch') diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h index a18ec7f63888..56b4ee5a6c9e 100644 --- a/arch/alpha/include/uapi/asm/mman.h +++ b/arch/alpha/include/uapi/asm/mman.h @@ -71,6 +71,9 @@ #define MADV_COLD 20 /* deactivate these pages */ #define MADV_PAGEOUT 21 /* reclaim these pages */ +#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */ +#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h index 57dc2ac4f8bd..40b210c65a5a 100644 --- a/arch/mips/include/uapi/asm/mman.h +++ b/arch/mips/include/uapi/asm/mman.h @@ -98,6 +98,9 @@ #define MADV_COLD 20 /* deactivate these pages */ #define MADV_PAGEOUT 21 /* reclaim these pages */ +#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */ +#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h index ab78cba446ed..9e3c010c0f61 100644 --- a/arch/parisc/include/uapi/asm/mman.h +++ b/arch/parisc/include/uapi/asm/mman.h @@ -52,6 +52,9 @@ #define MADV_COLD 20 /* deactivate these pages */ #define MADV_PAGEOUT 21 /* reclaim these pages */ +#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */ +#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */ + #define MADV_MERGEABLE 65 /* KSM may merge identical pages */ #define MADV_UNMERGEABLE 66 /* KSM may not merge identical pages */ diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h index e5e643752947..b3a22095371b 100644 --- a/arch/xtensa/include/uapi/asm/mman.h +++ b/arch/xtensa/include/uapi/asm/mman.h @@ -106,6 +106,9 @@ #define MADV_COLD 20 /* deactivate these pages */ #define MADV_PAGEOUT 21 /* reclaim these pages */ +#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */ +#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */ + /* compatibility flags */ #define MAP_FILE 0 -- cgit From fac7757e1fb05b75c8e22d4f8fe2f6c9c4d7edca Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Wed, 30 Jun 2021 18:53:13 -0700 Subject: mm: define default value for FIRST_USER_ADDRESS Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the same code all over. Instead just define a generic default value (i.e 0UL) for FIRST_USER_ADDRESS and let the platforms override when required. This makes it much cleaner with reduced code. The default FIRST_USER_ADDRESS here would be skipped in when the given platform overrides its value via . Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Acked-by: Geert Uytterhoeven [m68k] Acked-by: Guo Ren [csky] Acked-by: Stafford Horne [openrisc] Acked-by: Catalin Marinas [arm64] Acked-by: Mike Rapoport Acked-by: Palmer Dabbelt [RISC-V] Cc: Richard Henderson Cc: Vineet Gupta Cc: Catalin Marinas Cc: Will Deacon Cc: Guo Ren Cc: Brian Cain Cc: Geert Uytterhoeven Cc: Michal Simek Cc: Thomas Bogendoerfer Cc: Ley Foon Tan Cc: Jonas Bonn Cc: Stefan Kristiansson Cc: Stafford Horne Cc: "James E.J. Bottomley" Cc: Michael Ellerman Cc: Christophe Leroy Cc: Paul Walmsley Cc: Heiko Carstens Cc: Yoshinori Sato Cc: "David S. Miller" Cc: Jeff Dike Cc: Thomas Gleixner Cc: Chris Zankel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/alpha/include/asm/pgtable.h | 1 - arch/arc/include/asm/pgtable.h | 6 ------ arch/arm64/include/asm/pgtable.h | 2 -- arch/csky/include/asm/pgtable.h | 1 - arch/hexagon/include/asm/pgtable.h | 3 --- arch/ia64/include/asm/pgtable.h | 1 - arch/m68k/include/asm/pgtable_mm.h | 1 - arch/microblaze/include/asm/pgtable.h | 2 -- arch/mips/include/asm/pgtable-32.h | 1 - arch/mips/include/asm/pgtable-64.h | 1 - arch/nios2/include/asm/pgtable.h | 2 -- arch/openrisc/include/asm/pgtable.h | 1 - arch/parisc/include/asm/pgtable.h | 2 -- arch/powerpc/include/asm/book3s/pgtable.h | 1 - arch/powerpc/include/asm/nohash/32/pgtable.h | 1 - arch/powerpc/include/asm/nohash/64/pgtable.h | 2 -- arch/riscv/include/asm/pgtable.h | 2 -- arch/s390/include/asm/pgtable.h | 2 -- arch/sh/include/asm/pgtable.h | 2 -- arch/sparc/include/asm/pgtable_32.h | 1 - arch/sparc/include/asm/pgtable_64.h | 3 --- arch/um/include/asm/pgtable-2level.h | 1 - arch/um/include/asm/pgtable-3level.h | 1 - arch/x86/include/asm/pgtable_types.h | 2 -- arch/xtensa/include/asm/pgtable.h | 1 - 25 files changed, 43 deletions(-) (limited to 'arch') diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h index e1757b7cfe3d..ff690846465e 100644 --- a/arch/alpha/include/asm/pgtable.h +++ b/arch/alpha/include/asm/pgtable.h @@ -46,7 +46,6 @@ struct vm_area_struct; #define PTRS_PER_PMD (1UL << (PAGE_SHIFT-3)) #define PTRS_PER_PGD (1UL << (PAGE_SHIFT-3)) #define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL /* Number of pointers that fit on a page: this will go away. */ #define PTRS_PER_PAGE (1UL << (PAGE_SHIFT-3)) diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h index 5878846f00cf..577682e8cc7f 100644 --- a/arch/arc/include/asm/pgtable.h +++ b/arch/arc/include/asm/pgtable.h @@ -222,12 +222,6 @@ */ #define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE) -/* - * No special requirements for lowest virtual address we permit any user space - * mapping to be mapped at. - */ -#define FIRST_USER_ADDRESS 0UL - /**************************************************************** * Bucket load of VM Helpers diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 0b10204e72fc..25f5c04b43ce 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -26,8 +26,6 @@ #define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT)) -#define FIRST_USER_ADDRESS 0UL - #ifndef __ASSEMBLY__ #include diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 0d60367b6bfa..151607ed5158 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -14,7 +14,6 @@ #define PGDIR_MASK (~(PGDIR_SIZE-1)) #define USER_PTRS_PER_PGD (PAGE_OFFSET/PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL /* * C-SKY is two-level paging structure: diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index dbb22b80b8c4..e4979508cddf 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -155,9 +155,6 @@ extern unsigned long _dflt_cache_att; extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; /* located in head.S */ -/* Seems to be zero even in architectures where the zero page is firewalled? */ -#define FIRST_USER_ADDRESS 0UL - /* HUGETLB not working currently */ #ifdef CONFIG_HUGETLB_PAGE #define pte_mkhuge(pte) __pte((pte_val(pte) & ~0x3) | HVM_HUGEPAGE_SIZE) diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h index d765fd948fae..3f5dbbd8b9d8 100644 --- a/arch/ia64/include/asm/pgtable.h +++ b/arch/ia64/include/asm/pgtable.h @@ -128,7 +128,6 @@ #define PTRS_PER_PGD_SHIFT PTRS_PER_PTD_SHIFT #define PTRS_PER_PGD (1UL << PTRS_PER_PGD_SHIFT) #define USER_PTRS_PER_PGD (5*PTRS_PER_PGD/8) /* regions 0-4 are user regions */ -#define FIRST_USER_ADDRESS 0UL /* * All the normal masks have the "page accessed" bits on, as any time diff --git a/arch/m68k/include/asm/pgtable_mm.h b/arch/m68k/include/asm/pgtable_mm.h index aca22c2c1ee2..143ba7de9bda 100644 --- a/arch/m68k/include/asm/pgtable_mm.h +++ b/arch/m68k/include/asm/pgtable_mm.h @@ -72,7 +72,6 @@ #define PTRS_PER_PGD 128 #endif #define USER_PTRS_PER_PGD (TASK_SIZE/PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL /* Virtual address region for use by kernel_map() */ #ifdef CONFIG_SUN3 diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h index 9ae8d2c17dd5..71cd547655d9 100644 --- a/arch/microblaze/include/asm/pgtable.h +++ b/arch/microblaze/include/asm/pgtable.h @@ -25,8 +25,6 @@ extern int mem_init_done; #include #include -#define FIRST_USER_ADDRESS 0UL - extern unsigned long va_to_phys(unsigned long address); extern pte_t *va_to_pte(unsigned long address); diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h index 6c0532d7b211..95df9c293d8d 100644 --- a/arch/mips/include/asm/pgtable-32.h +++ b/arch/mips/include/asm/pgtable-32.h @@ -93,7 +93,6 @@ extern int add_temporary_entry(unsigned long entrylo0, unsigned long entrylo1, #endif #define USER_PTRS_PER_PGD (0x80000000UL/PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL #define VMALLOC_START MAP_BASE diff --git a/arch/mips/include/asm/pgtable-64.h b/arch/mips/include/asm/pgtable-64.h index 1e7d6ce9d8d6..046465906c82 100644 --- a/arch/mips/include/asm/pgtable-64.h +++ b/arch/mips/include/asm/pgtable-64.h @@ -137,7 +137,6 @@ #define PTRS_PER_PTE ((PAGE_SIZE << PTE_ORDER) / sizeof(pte_t)) #define USER_PTRS_PER_PGD ((TASK_SIZE64 / PGDIR_SIZE)?(TASK_SIZE64 / PGDIR_SIZE):1) -#define FIRST_USER_ADDRESS 0UL /* * TLB refill handlers also map the vmalloc area into xuseg. Avoid diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h index 2600d76c310c..4a995fa628ee 100644 --- a/arch/nios2/include/asm/pgtable.h +++ b/arch/nios2/include/asm/pgtable.h @@ -24,8 +24,6 @@ #include #include -#define FIRST_USER_ADDRESS 0UL - #define VMALLOC_START CONFIG_NIOS2_KERNEL_MMU_REGION_BASE #define VMALLOC_END (CONFIG_NIOS2_KERNEL_REGION_BASE - 1) diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h index 9425bedab4fc..4ac591c9ca33 100644 --- a/arch/openrisc/include/asm/pgtable.h +++ b/arch/openrisc/include/asm/pgtable.h @@ -73,7 +73,6 @@ extern void paging_init(void); */ #define USER_PTRS_PER_PGD (TASK_SIZE/PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL /* * Kernels own virtual memory area. diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h index 39017210dbf0..7f33c29764cc 100644 --- a/arch/parisc/include/asm/pgtable.h +++ b/arch/parisc/include/asm/pgtable.h @@ -171,8 +171,6 @@ static inline void purge_tlb_entries(struct mm_struct *mm, unsigned long addr) * pgd entries used up by user/kernel: */ -#define FIRST_USER_ADDRESS 0UL - /* NB: The tlb miss handlers make certain assumptions about the order */ /* of the following bits, so be careful (One example, bits 25-31 */ /* are moved together in one instruction). */ diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h index 0e1263455d73..ad130e15a126 100644 --- a/arch/powerpc/include/asm/book3s/pgtable.h +++ b/arch/powerpc/include/asm/book3s/pgtable.h @@ -8,7 +8,6 @@ #include #endif -#define FIRST_USER_ADDRESS 0UL #ifndef __ASSEMBLY__ /* Insert a PTE, top-level function is out of line. It uses an inline * low level function in the respective pgtable-* files diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 96522f7f0618..f06ae00f2a65 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -54,7 +54,6 @@ extern int icache_44x_need_flush; #define PGD_MASKED_BITS 0 #define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL #define pte_ERROR(e) \ pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \ diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h index 57cd3892bfe0..53fbfdfac93d 100644 --- a/arch/powerpc/include/asm/nohash/64/pgtable.h +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h @@ -12,8 +12,6 @@ #include #include -#define FIRST_USER_ADDRESS 0UL - /* * Size of EA range mapped by our pagetables. */ diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 380cd3a7e548..62f3fe7368f3 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -536,8 +536,6 @@ void setup_bootmem(void); void paging_init(void); void misc_mem_init(void); -#define FIRST_USER_ADDRESS 0 - /* * ZERO_PAGE is a global shared page that is always zero, * used for zero-mapped memory areas, etc. diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index b38f7b781564..7c86bf03fc8f 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -65,8 +65,6 @@ extern unsigned long zero_page_mask; /* TODO: s390 cannot support io_remap_pfn_range... */ -#define FIRST_USER_ADDRESS 0UL - #define pte_ERROR(e) \ printk("%s:%d: bad pte %p.\n", __FILE__, __LINE__, (void *) pte_val(e)) #define pmd_ERROR(e) \ diff --git a/arch/sh/include/asm/pgtable.h b/arch/sh/include/asm/pgtable.h index 27751e9470df..d7ddb1ec86a0 100644 --- a/arch/sh/include/asm/pgtable.h +++ b/arch/sh/include/asm/pgtable.h @@ -59,8 +59,6 @@ static inline unsigned long long neff_sign_extend(unsigned long val) /* Entries per level */ #define PTRS_PER_PTE (PAGE_SIZE / (1 << PTE_MAGNITUDE)) -#define FIRST_USER_ADDRESS 0UL - #define PHYS_ADDR_MASK29 0x1fffffff #define PHYS_ADDR_MASK32 0xffffffff diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h index a5cf79c149fe..0888bda245f5 100644 --- a/arch/sparc/include/asm/pgtable_32.h +++ b/arch/sparc/include/asm/pgtable_32.h @@ -48,7 +48,6 @@ unsigned long __init bootmem_init(unsigned long *pages_avail); #define PTRS_PER_PMD 64 #define PTRS_PER_PGD 256 #define USER_PTRS_PER_PGD PAGE_OFFSET / PGDIR_SIZE -#define FIRST_USER_ADDRESS 0UL #define PTE_SIZE (PTRS_PER_PTE*4) #define PAGE_NONE SRMMU_PAGE_NONE diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 2cd80a0a9795..a400e0f23046 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -95,9 +95,6 @@ bool kern_addr_valid(unsigned long addr); #define PTRS_PER_PUD (1UL << PUD_BITS) #define PTRS_PER_PGD (1UL << PGDIR_BITS) -/* Kernel has a separate 44bit address space. */ -#define FIRST_USER_ADDRESS 0UL - #define pmd_ERROR(e) \ pr_err("%s:%d: bad pmd %p(%016lx) seen at (%pS)\n", \ __FILE__, __LINE__, &(e), pmd_val(e), __builtin_return_address(0)) diff --git a/arch/um/include/asm/pgtable-2level.h b/arch/um/include/asm/pgtable-2level.h index 32106d31e4ab..8256ecc5b919 100644 --- a/arch/um/include/asm/pgtable-2level.h +++ b/arch/um/include/asm/pgtable-2level.h @@ -23,7 +23,6 @@ #define PTRS_PER_PTE 1024 #define USER_PTRS_PER_PGD ((TASK_SIZE + (PGDIR_SIZE - 1)) / PGDIR_SIZE) #define PTRS_PER_PGD 1024 -#define FIRST_USER_ADDRESS 0UL #define pte_ERROR(e) \ printk("%s:%d: bad pte %p(%08lx).\n", __FILE__, __LINE__, &(e), \ diff --git a/arch/um/include/asm/pgtable-3level.h b/arch/um/include/asm/pgtable-3level.h index 7e6a4180db9d..9289a86643a9 100644 --- a/arch/um/include/asm/pgtable-3level.h +++ b/arch/um/include/asm/pgtable-3level.h @@ -41,7 +41,6 @@ #endif #define USER_PTRS_PER_PGD ((TASK_SIZE + (PGDIR_SIZE - 1)) / PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL #define pte_ERROR(e) \ printk("%s:%d: bad pte %p(%016lx).\n", __FILE__, __LINE__, &(e), \ diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index f24d7ef8fffa..40497a9020c6 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -7,8 +7,6 @@ #include -#define FIRST_USER_ADDRESS 0UL - #define _PAGE_BIT_PRESENT 0 /* is present */ #define _PAGE_BIT_RW 1 /* writeable */ #define _PAGE_BIT_USER 2 /* userspace addressable */ diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h index d7fc45c920c2..bd5aeb795567 100644 --- a/arch/xtensa/include/asm/pgtable.h +++ b/arch/xtensa/include/asm/pgtable.h @@ -59,7 +59,6 @@ #define PTRS_PER_PGD 1024 #define PGD_ORDER 0 #define USER_PTRS_PER_PGD (TASK_SIZE/PGDIR_SIZE) -#define FIRST_USER_ADDRESS 0UL #define FIRST_USER_PGD_NR (FIRST_USER_ADDRESS >> PGDIR_SHIFT) #ifdef CONFIG_MMU -- cgit From 1c2f7d14d84f767a797558609eb034511e02f41e Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Wed, 30 Jun 2021 18:53:59 -0700 Subject: mm/thp: define default pmd_pgtable() Currently most platforms define pmd_pgtable() as pmd_page() duplicating the same code all over. Instead just define a default value i.e pmd_page() for pmd_pgtable() and let platforms override when required via . All the existing platform that override pmd_pgtable() have been moved into their respective header in order to precede before the new generic definition. This makes it much cleaner with reduced code. Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual Acked-by: Geert Uytterhoeven Acked-by: Mike Rapoport Cc: Nick Hu Cc: Richard Henderson Cc: Vineet Gupta Cc: Catalin Marinas Cc: Will Deacon Cc: Guo Ren Cc: Brian Cain Cc: Geert Uytterhoeven Cc: Michal Simek Cc: Thomas Bogendoerfer Cc: Ley Foon Tan Cc: Jonas Bonn Cc: Stefan Kristiansson Cc: Stafford Horne Cc: "James E.J. Bottomley" Cc: Michael Ellerman Cc: Christophe Leroy Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Heiko Carstens Cc: Yoshinori Sato Cc: "David S. Miller" Cc: Jeff Dike Cc: Thomas Gleixner Cc: Chris Zankel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/alpha/include/asm/pgalloc.h | 1 - arch/arc/include/asm/pgalloc.h | 2 -- arch/arc/include/asm/pgtable.h | 2 ++ arch/arm/include/asm/pgalloc.h | 1 - arch/arm64/include/asm/pgalloc.h | 1 - arch/csky/include/asm/pgalloc.h | 2 -- arch/hexagon/include/asm/pgtable.h | 1 - arch/ia64/include/asm/pgalloc.h | 1 - arch/m68k/include/asm/mcf_pgalloc.h | 2 -- arch/m68k/include/asm/mcf_pgtable.h | 2 ++ arch/m68k/include/asm/motorola_pgalloc.h | 1 - arch/m68k/include/asm/motorola_pgtable.h | 2 ++ arch/m68k/include/asm/sun3_pgalloc.h | 1 - arch/microblaze/include/asm/pgalloc.h | 2 -- arch/mips/include/asm/pgalloc.h | 1 - arch/nds32/include/asm/pgalloc.h | 5 ----- arch/nios2/include/asm/pgalloc.h | 1 - arch/openrisc/include/asm/pgalloc.h | 2 -- arch/parisc/include/asm/pgalloc.h | 1 - arch/powerpc/include/asm/pgalloc.h | 5 ----- arch/powerpc/include/asm/pgtable.h | 6 ++++++ arch/riscv/include/asm/pgalloc.h | 2 -- arch/s390/include/asm/pgalloc.h | 3 --- arch/s390/include/asm/pgtable.h | 3 +++ arch/sh/include/asm/pgalloc.h | 1 - arch/sparc/include/asm/pgalloc_32.h | 1 - arch/sparc/include/asm/pgalloc_64.h | 1 - arch/sparc/include/asm/pgtable_32.h | 2 ++ arch/sparc/include/asm/pgtable_64.h | 2 ++ arch/um/include/asm/pgalloc.h | 1 - arch/x86/include/asm/pgalloc.h | 2 -- arch/xtensa/include/asm/pgalloc.h | 2 -- 32 files changed, 19 insertions(+), 43 deletions(-) (limited to 'arch') diff --git a/arch/alpha/include/asm/pgalloc.h b/arch/alpha/include/asm/pgalloc.h index 9c6a24fe493d..68be7adbfe58 100644 --- a/arch/alpha/include/asm/pgalloc.h +++ b/arch/alpha/include/asm/pgalloc.h @@ -18,7 +18,6 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t pte) { pmd_set(pmd, (pte_t *)(page_to_pa(pte) + PAGE_OFFSET)); } -#define pmd_pgtable(pmd) pmd_page(pmd) static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t *pte) diff --git a/arch/arc/include/asm/pgalloc.h b/arch/arc/include/asm/pgalloc.h index 6147db925248..a32ca3104ced 100644 --- a/arch/arc/include/asm/pgalloc.h +++ b/arch/arc/include/asm/pgalloc.h @@ -129,6 +129,4 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t ptep) #define __pte_free_tlb(tlb, pte, addr) pte_free((tlb)->mm, pte) -#define pmd_pgtable(pmd) ((pgtable_t) pmd_page_vaddr(pmd)) - #endif /* _ASM_ARC_PGALLOC_H */ diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h index 577682e8cc7f..320cc0ae8a08 100644 --- a/arch/arc/include/asm/pgtable.h +++ b/arch/arc/include/asm/pgtable.h @@ -350,6 +350,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, #define kern_addr_valid(addr) (1) +#define pmd_pgtable(pmd) ((pgtable_t) pmd_page_vaddr(pmd)) + /* * remap a physical page `pfn' of size `size' with page protection `prot' * into virtual address `from' diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h index fdee1f04f4f3..a17f01235c29 100644 --- a/arch/arm/include/asm/pgalloc.h +++ b/arch/arm/include/asm/pgalloc.h @@ -143,7 +143,6 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep) __pmd_populate(pmdp, page_to_phys(ptep), prot); } -#define pmd_pgtable(pmd) pmd_page(pmd) #endif /* CONFIG_MMU */ diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h index 31fbab3d6f99..8433a2058eb1 100644 --- a/arch/arm64/include/asm/pgalloc.h +++ b/arch/arm64/include/asm/pgalloc.h @@ -86,6 +86,5 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep) VM_BUG_ON(mm == &init_mm); __pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN); } -#define pmd_pgtable(pmd) pmd_page(pmd) #endif diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h index cd211aabbefd..bbbd0698b397 100644 --- a/arch/csky/include/asm/pgalloc.h +++ b/arch/csky/include/asm/pgalloc.h @@ -22,8 +22,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, set_pmd(pmd, __pmd(__pa(page_address(pte)))); } -#define pmd_pgtable(pmd) pmd_page(pmd) - extern void pgd_init(unsigned long *p); static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm) diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h index e4979508cddf..18cd6ea9ab23 100644 --- a/arch/hexagon/include/asm/pgtable.h +++ b/arch/hexagon/include/asm/pgtable.h @@ -239,7 +239,6 @@ static inline int pmd_bad(pmd_t pmd) * pmd_page - converts a PMD entry to a page pointer */ #define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)) -#define pmd_pgtable(pmd) pmd_page(pmd) /** * pte_none - check if pte is mapped diff --git a/arch/ia64/include/asm/pgalloc.h b/arch/ia64/include/asm/pgalloc.h index 9601cfe83c94..0fb2b6291d58 100644 --- a/arch/ia64/include/asm/pgalloc.h +++ b/arch/ia64/include/asm/pgalloc.h @@ -52,7 +52,6 @@ pmd_populate(struct mm_struct *mm, pmd_t * pmd_entry, pgtable_t pte) { pmd_val(*pmd_entry) = page_to_phys(pte); } -#define pmd_pgtable(pmd) pmd_page(pmd) static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t * pmd_entry, pte_t * pte) diff --git a/arch/m68k/include/asm/mcf_pgalloc.h b/arch/m68k/include/asm/mcf_pgalloc.h index bc1228e00518..5c2c0a864524 100644 --- a/arch/m68k/include/asm/mcf_pgalloc.h +++ b/arch/m68k/include/asm/mcf_pgalloc.h @@ -32,8 +32,6 @@ extern inline pmd_t *pmd_alloc_kernel(pgd_t *pgd, unsigned long address) #define pmd_populate_kernel pmd_populate -#define pmd_pgtable(pmd) pfn_to_virt(pmd_val(pmd) >> PAGE_SHIFT) - static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pgtable, unsigned long address) { diff --git a/arch/m68k/include/asm/mcf_pgtable.h b/arch/m68k/include/asm/mcf_pgtable.h index 8d4ec05996c5..6f2b87d7a50d 100644 --- a/arch/m68k/include/asm/mcf_pgtable.h +++ b/arch/m68k/include/asm/mcf_pgtable.h @@ -150,6 +150,8 @@ #ifndef __ASSEMBLY__ +#define pmd_pgtable(pmd) pfn_to_virt(pmd_val(pmd) >> PAGE_SHIFT) + /* * Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. diff --git a/arch/m68k/include/asm/motorola_pgalloc.h b/arch/m68k/include/asm/motorola_pgalloc.h index b4fc3b4f6bb3..74a817d9387f 100644 --- a/arch/m68k/include/asm/motorola_pgalloc.h +++ b/arch/m68k/include/asm/motorola_pgalloc.h @@ -88,7 +88,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t page { pmd_set(pmd, page); } -#define pmd_pgtable(pmd) ((pgtable_t)pmd_page_vaddr(pmd)) static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) { diff --git a/arch/m68k/include/asm/motorola_pgtable.h b/arch/m68k/include/asm/motorola_pgtable.h index 8076467eff4b..a2908164ee6f 100644 --- a/arch/m68k/include/asm/motorola_pgtable.h +++ b/arch/m68k/include/asm/motorola_pgtable.h @@ -105,6 +105,8 @@ extern unsigned long mm_cachebits; #define __S110 PAGE_SHARED_C #define __S111 PAGE_SHARED_C +#define pmd_pgtable(pmd) ((pgtable_t)pmd_page_vaddr(pmd)) + /* * Conversion functions: convert a page and protection to a page entry, * and a page entry and page directory to the page they refer to. diff --git a/arch/m68k/include/asm/sun3_pgalloc.h b/arch/m68k/include/asm/sun3_pgalloc.h index 000f64869b91..198036aff519 100644 --- a/arch/m68k/include/asm/sun3_pgalloc.h +++ b/arch/m68k/include/asm/sun3_pgalloc.h @@ -32,7 +32,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t page { pmd_val(*pmd) = __pa((unsigned long)page_address(page)); } -#define pmd_pgtable(pmd) pmd_page(pmd) /* * allocating and freeing a pmd is trivial: the 1-entry pmd is diff --git a/arch/microblaze/include/asm/pgalloc.h b/arch/microblaze/include/asm/pgalloc.h index d56b9f670ad1..6c33b05f730f 100644 --- a/arch/microblaze/include/asm/pgalloc.h +++ b/arch/microblaze/include/asm/pgalloc.h @@ -28,8 +28,6 @@ static inline pgd_t *get_pgd(void) #define pgd_alloc(mm) get_pgd() -#define pmd_pgtable(pmd) pmd_page(pmd) - extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm); #define __pte_free_tlb(tlb, pte, addr) pte_free((tlb)->mm, (pte)) diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h index 8b18424b3120..dd53d0f79cb3 100644 --- a/arch/mips/include/asm/pgalloc.h +++ b/arch/mips/include/asm/pgalloc.h @@ -28,7 +28,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, { set_pmd(pmd, __pmd((unsigned long)page_address(pte))); } -#define pmd_pgtable(pmd) pmd_page(pmd) /* * Initialize a new pmd table with invalid pointers. diff --git a/arch/nds32/include/asm/pgalloc.h b/arch/nds32/include/asm/pgalloc.h index 85c117347c86..a08e1ebca70e 100644 --- a/arch/nds32/include/asm/pgalloc.h +++ b/arch/nds32/include/asm/pgalloc.h @@ -12,11 +12,6 @@ #define __HAVE_ARCH_PTE_ALLOC_ONE #include /* for pte_{alloc,free}_one */ -/* - * Since we have only two-level page tables, these are trivial - */ -#define pmd_pgtable(pmd) pmd_page(pmd) - extern pgd_t *pgd_alloc(struct mm_struct *mm); extern void pgd_free(struct mm_struct *mm, pgd_t * pgd); diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h index e6600d2a5ae0..3c4ae74d5798 100644 --- a/arch/nios2/include/asm/pgalloc.h +++ b/arch/nios2/include/asm/pgalloc.h @@ -25,7 +25,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, { set_pmd(pmd, __pmd((unsigned long)page_address(pte))); } -#define pmd_pgtable(pmd) pmd_page(pmd) /* * Initialize a new pmd table with invalid pointers. diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h index 88820299ecc4..b7b2b8d16fad 100644 --- a/arch/openrisc/include/asm/pgalloc.h +++ b/arch/openrisc/include/asm/pgalloc.h @@ -72,6 +72,4 @@ do { \ tlb_remove_page((tlb), (pte)); \ } while (0) -#define pmd_pgtable(pmd) pmd_page(pmd) - #endif diff --git a/arch/parisc/include/asm/pgalloc.h b/arch/parisc/include/asm/pgalloc.h index dda557085311..6a7e98e71f1d 100644 --- a/arch/parisc/include/asm/pgalloc.h +++ b/arch/parisc/include/asm/pgalloc.h @@ -69,6 +69,5 @@ pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t *pte) #define pmd_populate(mm, pmd, pte_page) \ pmd_populate_kernel(mm, pmd, page_address(pte_page)) -#define pmd_pgtable(pmd) pmd_page(pmd) #endif diff --git a/arch/powerpc/include/asm/pgalloc.h b/arch/powerpc/include/asm/pgalloc.h index 6dd78a2dc03a..3360cad78ace 100644 --- a/arch/powerpc/include/asm/pgalloc.h +++ b/arch/powerpc/include/asm/pgalloc.h @@ -70,9 +70,4 @@ extern struct kmem_cache *pgtable_cache[]; #include #endif -static inline pgtable_t pmd_pgtable(pmd_t pmd) -{ - return (pgtable_t)pmd_page_vaddr(pmd); -} - #endif /* _ASM_POWERPC_PGALLOC_H */ diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index c6a676714f04..5969743719bc 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -152,6 +152,12 @@ static inline bool p4d_is_leaf(p4d_t p4d) } #endif +#define pmd_pgtable pmd_pgtable +static inline pgtable_t pmd_pgtable(pmd_t pmd) +{ + return (pgtable_t)pmd_page_vaddr(pmd); +} + #ifdef CONFIG_PPC64 #define is_ioremap_addr is_ioremap_addr static inline bool is_ioremap_addr(const void *x) diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h index 23b1544e0ca5..0af6933a7100 100644 --- a/arch/riscv/include/asm/pgalloc.h +++ b/arch/riscv/include/asm/pgalloc.h @@ -38,8 +38,6 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) } #endif /* __PAGETABLE_PMD_FOLDED */ -#define pmd_pgtable(pmd) pmd_page(pmd) - static inline pgd_t *pgd_alloc(struct mm_struct *mm) { pgd_t *pgd; diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h index 6b187cd72251..f14a555eff74 100644 --- a/arch/s390/include/asm/pgalloc.h +++ b/arch/s390/include/asm/pgalloc.h @@ -134,9 +134,6 @@ static inline void pmd_populate(struct mm_struct *mm, #define pmd_populate_kernel(mm, pmd, pte) pmd_populate(mm, pmd, pte) -#define pmd_pgtable(pmd) \ - ((pgtable_t)__va(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE)) - /* * page table entry allocation/free routines. */ diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index 7c86bf03fc8f..1f8f5da53262 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1709,4 +1709,7 @@ extern void s390_reset_cmma(struct mm_struct *mm); #define HAVE_ARCH_UNMAPPED_AREA #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN +#define pmd_pgtable(pmd) \ + ((pgtable_t)__va(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE)) + #endif /* _S390_PAGE_H */ diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h index 0e6b0be25e33..a9e98233c4d4 100644 --- a/arch/sh/include/asm/pgalloc.h +++ b/arch/sh/include/asm/pgalloc.h @@ -30,7 +30,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, { set_pmd(pmd, __pmd((unsigned long)page_address(pte))); } -#define pmd_pgtable(pmd) pmd_page(pmd) #define __pte_free_tlb(tlb,pte,addr) \ do { \ diff --git a/arch/sparc/include/asm/pgalloc_32.h b/arch/sparc/include/asm/pgalloc_32.h index 9d353e6dc5a9..4f73e87b22a3 100644 --- a/arch/sparc/include/asm/pgalloc_32.h +++ b/arch/sparc/include/asm/pgalloc_32.h @@ -51,7 +51,6 @@ static inline void free_pmd_fast(pmd_t * pmd) #define __pmd_free_tlb(tlb, pmd, addr) pmd_free((tlb)->mm, pmd) #define pmd_populate(mm, pmd, pte) pmd_set(pmd, pte) -#define pmd_pgtable(pmd) (pgtable_t)__pmd_page(pmd) void pmd_set(pmd_t *pmdp, pte_t *ptep); #define pmd_populate_kernel pmd_populate diff --git a/arch/sparc/include/asm/pgalloc_64.h b/arch/sparc/include/asm/pgalloc_64.h index a8dafc550985..7b5561d17ab1 100644 --- a/arch/sparc/include/asm/pgalloc_64.h +++ b/arch/sparc/include/asm/pgalloc_64.h @@ -67,7 +67,6 @@ void pte_free(struct mm_struct *mm, pgtable_t ptepage); #define pmd_populate_kernel(MM, PMD, PTE) pmd_set(MM, PMD, PTE) #define pmd_populate(MM, PMD, PTE) pmd_set(MM, PMD, PTE) -#define pmd_pgtable(PMD) ((pte_t *)pmd_page_vaddr(PMD)) void pgtable_free(void *table, bool is_page); diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h index 0888bda245f5..ebaf374b55ab 100644 --- a/arch/sparc/include/asm/pgtable_32.h +++ b/arch/sparc/include/asm/pgtable_32.h @@ -432,4 +432,6 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma, /* We provide our own get_unmapped_area to cope with VA holes for userland */ #define HAVE_ARCH_UNMAPPED_AREA +#define pmd_pgtable(pmd) ((pgtable_t)__pmd_page(pmd)) + #endif /* !(_SPARC_PGTABLE_H) */ diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index a400e0f23046..e0ee48ec3903 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1117,6 +1117,8 @@ extern unsigned long cmdline_memory_size; asmlinkage void do_sparc64_fault(struct pt_regs *regs); +#define pmd_pgtable(PMD) ((pte_t *)pmd_page_vaddr(PMD)) + #ifdef CONFIG_HUGETLB_PAGE #define pud_leaf_size pud_leaf_size diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h index 2bbf28cf3aa9..8ec7cd46dd96 100644 --- a/arch/um/include/asm/pgalloc.h +++ b/arch/um/include/asm/pgalloc.h @@ -19,7 +19,6 @@ set_pmd(pmd, __pmd(_PAGE_TABLE + \ ((unsigned long long)page_to_pfn(pte) << \ (unsigned long long) PAGE_SHIFT))) -#define pmd_pgtable(pmd) pmd_page(pmd) /* * Allocate and free page tables. diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h index 62ad61d6fefc..c7ec5bb88334 100644 --- a/arch/x86/include/asm/pgalloc.h +++ b/arch/x86/include/asm/pgalloc.h @@ -84,8 +84,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, set_pmd(pmd, __pmd(((pteval_t)pfn << PAGE_SHIFT) | _PAGE_TABLE)); } -#define pmd_pgtable(pmd) pmd_page(pmd) - #if CONFIG_PGTABLE_LEVELS > 2 extern void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd); diff --git a/arch/xtensa/include/asm/pgalloc.h b/arch/xtensa/include/asm/pgalloc.h index d3a22da4d2c9..eeb2de3a89e5 100644 --- a/arch/xtensa/include/asm/pgalloc.h +++ b/arch/xtensa/include/asm/pgalloc.h @@ -25,7 +25,6 @@ (pmd_val(*(pmdp)) = ((unsigned long)ptep)) #define pmd_populate(mm, pmdp, page) \ (pmd_val(*(pmdp)) = ((unsigned long)page_to_virt(page))) -#define pmd_pgtable(pmd) pmd_page(pmd) static inline pgd_t* pgd_alloc(struct mm_struct *mm) @@ -63,7 +62,6 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm) return page; } -#define pmd_pgtable(pmd) pmd_page(pmd) #endif /* CONFIG_MMU */ #endif /* _XTENSA_PGALLOC_H */ -- cgit From af5cdaf82238fb3637a0d0fff4670e5be71c611c Mon Sep 17 00:00:00 2001 From: Alistair Popple Date: Wed, 30 Jun 2021 18:54:06 -0700 Subject: mm: remove special swap entry functions Patch series "Add support for SVM atomics in Nouveau", v11. Introduction ============ Some devices have features such as atomic PTE bits that can be used to implement atomic access to system memory. To support atomic operations to a shared virtual memory page such a device needs access to that page which is exclusive of the CPU. This series introduces a mechanism to temporarily unmap pages granting exclusive access to a device. These changes are required to support OpenCL atomic operations in Nouveau to shared virtual memory (SVM) regions allocated with the CL_MEM_SVM_ATOMICS clSVMAlloc flag. A more complete description of the OpenCL SVM feature is available at https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/ OpenCL_API.html#_shared_virtual_memory . Implementation ============== Exclusive device access is implemented by adding a new swap entry type (SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main difference is that on fault the original entry is immediately restored by the fault handler instead of waiting. Restoring the entry triggers calls to MMU notifers which allows a device driver to revoke the atomic access permission from the GPU prior to the CPU finalising the entry. Patches ======= Patches 1 & 2 refactor existing migration and device private entry functions. Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated functionality into separate functions - try_to_migrate_one() and try_to_munlock_one(). Patch 5 renames some existing code but does not introduce functionality. Patch 6 is a small clean-up to swap entry handling in copy_pte_range(). Patch 7 contains the bulk of the implementation for device exclusive memory. Patch 8 contains some additions to the HMM selftests to ensure everything works as expected. Patch 9 is a cleanup for the Nouveau SVM implementation. Patch 10 contains the implementation of atomic access for the Nouveau driver. Testing ======= This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program which checks that GPU atomic accesses to system memory are atomic. Without this series the test fails as there is no way of write-protecting the page mapping which results in the device clobbering CPU writes. For reference the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/ Further testing has been performed by adding support for testing exclusive access to the hmm-tests kselftests. This patch (of 10): Remove multiple similar inline functions for dealing with different types of special swap entries. Both migration and device private swap entries use the swap offset to store a pfn. Instead of multiple inline functions to obtain a struct page for each swap entry type use a common function pfn_swap_entry_to_page(). Also open-code the various entry_to_pfn() functions as this results is shorter code that is easier to understand. Link: https://lkml.kernel.org/r/20210616105937.23201-1-apopple@nvidia.com Link: https://lkml.kernel.org/r/20210616105937.23201-2-apopple@nvidia.com Signed-off-by: Alistair Popple Reviewed-by: Ralph Campbell Reviewed-by: Christoph Hellwig Cc: "Matthew Wilcox (Oracle)" Cc: Hugh Dickins Cc: Peter Xu Cc: Shakeel Butt Cc: Ben Skeggs Cc: Jason Gunthorpe Cc: John Hubbard Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/s390/mm/pgtable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 18205f851c24..eec3a9d7176e 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -691,7 +691,7 @@ static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry) if (!non_swap_entry(entry)) dec_mm_counter(mm, MM_SWAPENTS); else if (is_migration_entry(entry)) { - struct page *page = migration_entry_to_page(entry); + struct page *page = pfn_swap_entry_to_page(entry); dec_mm_counter(mm, mm_counter(page)); } -- cgit From f39650de687e35766572ac89dbcd16a5911e2f0a Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Wed, 30 Jun 2021 18:54:59 -0700 Subject: kernel.h: split out panic and oops helpers kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko Reviewed-by: Bjorn Andersson Co-developed-by: Andrew Morton Acked-by: Mike Rapoport Acked-by: Corey Minyard Acked-by: Christian Brauner Acked-by: Arnd Bergmann Acked-by: Kees Cook Acked-by: Wei Liu Acked-by: Rasmus Villemoes Signed-off-by: Andrew Morton Acked-by: Sebastian Reichel Acked-by: Luis Chamberlain Acked-by: Stephen Boyd Acked-by: Thomas Bogendoerfer Acked-by: Helge Deller # parisc Signed-off-by: Linus Torvalds --- arch/alpha/kernel/setup.c | 2 +- arch/arm64/kernel/setup.c | 1 + arch/ia64/include/asm/pal.h | 1 + arch/mips/kernel/relocate.c | 1 + arch/mips/sgi-ip22/ip22-reset.c | 1 + arch/mips/sgi-ip32/ip32-reset.c | 1 + arch/parisc/kernel/pdc_chassis.c | 1 + arch/powerpc/kernel/setup-common.c | 1 + arch/s390/kernel/ipl.c | 1 + arch/sparc/kernel/sstate.c | 1 + arch/um/drivers/mconsole_kern.c | 1 + arch/um/kernel/um_arch.c | 1 + arch/x86/include/asm/desc.h | 1 + arch/x86/kernel/cpu/mshyperv.c | 1 + arch/x86/kernel/setup.c | 1 + arch/x86/purgatory/purgatory.c | 2 ++ arch/x86/xen/enlighten.c | 1 + arch/xtensa/platforms/iss/setup.c | 1 + 18 files changed, 19 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c index 5f6858e9dc28..7d56c217b235 100644 --- a/arch/alpha/kernel/setup.c +++ b/arch/alpha/kernel/setup.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -46,7 +47,6 @@ #include #include -extern struct atomic_notifier_head panic_notifier_list; static int alpha_panic_event(struct notifier_block *, unsigned long, void *); static struct notifier_block alpha_panic_block = { alpha_panic_event, diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 61845c0821d9..787bc0f601b3 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/ia64/include/asm/pal.h b/arch/ia64/include/asm/pal.h index 5c51fceedaf9..e6b652f9e45e 100644 --- a/arch/ia64/include/asm/pal.h +++ b/arch/ia64/include/asm/pal.h @@ -99,6 +99,7 @@ #include #include +#include /* * Data types needed to pass information into PAL procedures and diff --git a/arch/mips/kernel/relocate.c b/arch/mips/kernel/relocate.c index 499a5357c09f..56b51de2dc51 100644 --- a/arch/mips/kernel/relocate.c +++ b/arch/mips/kernel/relocate.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/mips/sgi-ip22/ip22-reset.c b/arch/mips/sgi-ip22/ip22-reset.c index c374f3ceec38..9028dbbb45dd 100644 --- a/arch/mips/sgi-ip22/ip22-reset.c +++ b/arch/mips/sgi-ip22/ip22-reset.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include diff --git a/arch/mips/sgi-ip32/ip32-reset.c b/arch/mips/sgi-ip32/ip32-reset.c index 20d8637340be..18d1c115cd53 100644 --- a/arch/mips/sgi-ip32/ip32-reset.c +++ b/arch/mips/sgi-ip32/ip32-reset.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/parisc/kernel/pdc_chassis.c b/arch/parisc/kernel/pdc_chassis.c index 75ae88d13909..da154406d368 100644 --- a/arch/parisc/kernel/pdc_chassis.c +++ b/arch/parisc/kernel/pdc_chassis.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c index 74a98fff2c2f..046fe21b5c3b 100644 --- a/arch/powerpc/kernel/setup-common.c +++ b/arch/powerpc/kernel/setup-common.c @@ -9,6 +9,7 @@ #undef DEBUG #include +#include #include #include #include diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index dba04fbc37a2..36f870dc944f 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/sparc/kernel/sstate.c b/arch/sparc/kernel/sstate.c index ac8677c3841e..3bcc4ddc6911 100644 --- a/arch/sparc/kernel/sstate.c +++ b/arch/sparc/kernel/sstate.c @@ -6,6 +6,7 @@ #include #include +#include #include #include diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c index 6d00af25ec6b..328b16f99b30 100644 --- a/arch/um/drivers/mconsole_kern.c +++ b/arch/um/drivers/mconsole_kern.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 74e07e748a9b..9512253947d5 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index 476082a83d1c..ceb12683b6d1 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -9,6 +9,7 @@ #include #include +#include #include #include diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 22f13343b5da..9e5c6f2b044d 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1e720626069a..d21ba6fa39d9 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c index f03b64d9cb51..7558139920f8 100644 --- a/arch/x86/purgatory/purgatory.c +++ b/arch/x86/purgatory/purgatory.c @@ -9,6 +9,8 @@ */ #include +#include +#include #include #include diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index aa9f50fccc5d..c79bd0af2e8c 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include diff --git a/arch/xtensa/platforms/iss/setup.c b/arch/xtensa/platforms/iss/setup.c index ed519aee0ec8..d3433e1bb94e 100644 --- a/arch/xtensa/platforms/iss/setup.c +++ b/arch/xtensa/platforms/iss/setup.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include -- cgit From 66ce75144d4b33e376f187df3dec495fe47d2ad0 Mon Sep 17 00:00:00 2001 From: Barry Song Date: Wed, 30 Jun 2021 18:56:31 -0700 Subject: kprobes: remove duplicated strong free_insn_page in x86 and s390 free_insn_page() in x86 and s390 is same with the common weak function in kernel/kprobes.c. Plus, the comment "Recover page to RW mode before releasing it" in x86 seems insensible to be there since resetting mapping is done by common code in vfree() of module_memfree(). So drop these two duplicated strong functions and related comment, then mark the common one in kernel/kprobes.c strong. Link: https://lkml.kernel.org/r/20210608065736.32656-1-song.bao.hua@hisilicon.com Signed-off-by: Barry Song Acked-by: Masami Hiramatsu Acked-by: Heiko Carstens Reviewed-by: Christoph Hellwig Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: "Naveen N. Rao" Cc: Anil S Keshavamurthy Cc: David S. Miller Cc: Qi Liu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/s390/kernel/kprobes.c | 5 ----- arch/x86/kernel/kprobes/core.c | 6 ------ 2 files changed, 11 deletions(-) (limited to 'arch') diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c index aae24dc75df6..60cfbd24229b 100644 --- a/arch/s390/kernel/kprobes.c +++ b/arch/s390/kernel/kprobes.c @@ -44,11 +44,6 @@ void *alloc_insn_page(void) return page; } -void free_insn_page(void *page) -{ - module_memfree(page); -} - static void *alloc_s390_insn_page(void) { if (xchg(&insn_page_in_use, 1) == 1) diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index d3d65545cb8b..3bce67d3a03c 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -422,12 +422,6 @@ void *alloc_insn_page(void) return page; } -/* Recover page to RW mode before releasing it */ -void free_insn_page(void *page) -{ - module_memfree(page); -} - /* Kprobe x86 instruction emulation - only regs->ip or IF flag modifiers */ static void kprobe_emulate_ifmodifiers(struct kprobe *p, struct pt_regs *regs) -- cgit