From cf264e1329fb0307e044f7675849f9f38b44c11a Mon Sep 17 00:00:00 2001 From: Nhat Pham Date: Tue, 2 May 2023 18:36:07 -0700 Subject: cachestat: implement cachestat syscall There is currently no good way to query the page cache state of large file sets and directory trees. There is mincore(), but it scales poorly: the kernel writes out a lot of bitmap data that userspace has to aggregate, when the user really doesn not care about per-page information in that case. The user also needs to mmap and unmap each file as it goes along, which can be quite slow as well. Some use cases where this information could come in handy: * Allowing database to decide whether to perform an index scan or direct table queries based on the in-memory cache state of the index. * Visibility into the writeback algorithm, for performance issues diagnostic. * Workload-aware writeback pacing: estimating IO fulfilled by page cache (and IO to be done) within a range of a file, allowing for more frequent syncing when and where there is IO capacity, and batching when there is not. * Computing memory usage of large files/directory trees, analogous to the du tool for disk usage. More information about these use cases could be found in the following thread: https://lore.kernel.org/lkml/20230315170934.GA97793@cmpxchg.org/ This patch implements a new syscall that queries cache state of a file and summarizes the number of cached pages, number of dirty pages, number of pages marked for writeback, number of (recently) evicted pages, etc. in a given range. Currently, the syscall is only wired in for x86 architecture. NAME cachestat - query the page cache statistics of a file. SYNOPSIS #include struct cachestat_range { __u64 off; __u64 len; }; struct cachestat { __u64 nr_cache; __u64 nr_dirty; __u64 nr_writeback; __u64 nr_evicted; __u64 nr_recently_evicted; }; int cachestat(unsigned int fd, struct cachestat_range *cstat_range, struct cachestat *cstat, unsigned int flags); DESCRIPTION cachestat() queries the number of cached pages, number of dirty pages, number of pages marked for writeback, number of evicted pages, number of recently evicted pages, in the bytes range given by `off` and `len`. An evicted page is a page that is previously in the page cache but has been evicted since. A page is recently evicted if its last eviction was recent enough that its reentry to the cache would indicate that it is actively being used by the system, and that there is memory pressure on the system. These values are returned in a cachestat struct, whose address is given by the `cstat` argument. The `off` and `len` arguments must be non-negative integers. If `len` > 0, the queried range is [`off`, `off` + `len`]. If `len` == 0, we will query in the range from `off` to the end of the file. The `flags` argument is unused for now, but is included for future extensibility. User should pass 0 (i.e no flag specified). Currently, hugetlbfs is not supported. Because the status of a page can change after cachestat() checks it but before it returns to the application, the returned values may contain stale information. RETURN VALUE On success, cachestat returns 0. On error, -1 is returned, and errno is set to indicate the error. ERRORS EFAULT cstat or cstat_args points to an invalid address. EINVAL invalid flags. EBADF invalid file descriptor. EOPNOTSUPP file descriptor is of a hugetlbfs file [nphamcs@gmail.com: replace rounddown logic with the existing helper] Link: https://lkml.kernel.org/r/20230504022044.3675469-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230503013608.2431726-3-nphamcs@gmail.com Signed-off-by: Nhat Pham Acked-by: Johannes Weiner Cc: Brian Foster Cc: Matthew Wilcox (Oracle) Cc: Michael Kerrisk Signed-off-by: Andrew Morton --- arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + 2 files changed, 2 insertions(+) (limited to 'arch') diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 320480a8db4f..bc0a3c941b35 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -455,3 +455,4 @@ 448 i386 process_mrelease sys_process_mrelease 449 i386 futex_waitv sys_futex_waitv 450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node +451 i386 cachestat sys_cachestat diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index c84d12608cd2..227538b0ce80 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -372,6 +372,7 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat # # Due to a historical design error, certain syscalls are numbered differently -- cgit From 946e697c69ffeeefdd84dad90eac307284df46be Mon Sep 17 00:00:00 2001 From: Nhat Pham Date: Wed, 10 May 2023 12:58:06 -0700 Subject: cachestat: wire up cachestat for other architectures cachestat is previously only wired in for x86 (and architectures using the generic unistd.h table): https://lore.kernel.org/lkml/20230503013608.2431726-1-nphamcs@gmail.com/ This patch wires cachestat in for all the other architectures. [nphamcs@gmail.com: wire up cachestat for arm64] Link: https://lkml.kernel.org/r/20230511092843.3896327-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230510195806.2902878-1-nphamcs@gmail.com Signed-off-by: Nhat Pham Tested-by: Michael Ellerman [powerpc] Acked-by: Geert Uytterhoeven [m68k] Reviewed-by: Arnd Bergmann Acked-by: Heiko Carstens [s390] Cc: Alexander Gordeev Cc: Christian Borntraeger Cc: Christophe Leroy Cc: Chris Zankel Cc: David S. Miller Cc: Helge Deller Cc: Ivan Kokshaysky Cc: "James E.J. Bottomley" Cc: Johannes Weiner Cc: John Paul Adrian Glaubitz Cc: Matt Turner Cc: Max Filippov Cc: Michal Simek Cc: Nicholas Piggin Cc: Richard Henderson Cc: Rich Felker Cc: Russell King Cc: Sven Schnelle Cc: Thomas Bogendoerfer Cc: Vasily Gorbik Cc: Yoshinori Sato Signed-off-by: Andrew Morton --- arch/alpha/kernel/syscalls/syscall.tbl | 1 + arch/arm/tools/syscall.tbl | 1 + arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 ++ arch/ia64/kernel/syscalls/syscall.tbl | 1 + arch/m68k/kernel/syscalls/syscall.tbl | 1 + arch/microblaze/kernel/syscalls/syscall.tbl | 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + arch/parisc/kernel/syscalls/syscall.tbl | 1 + arch/powerpc/kernel/syscalls/syscall.tbl | 1 + arch/s390/kernel/syscalls/syscall.tbl | 1 + arch/sh/kernel/syscalls/syscall.tbl | 1 + arch/sparc/kernel/syscalls/syscall.tbl | 1 + arch/xtensa/kernel/syscalls/syscall.tbl | 1 + 16 files changed, 17 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index 8ebacf37a8cf..1f13995d00d7 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -490,3 +490,4 @@ 558 common process_mrelease sys_process_mrelease 559 common futex_waitv sys_futex_waitv 560 common set_mempolicy_home_node sys_ni_syscall +561 common cachestat sys_cachestat diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index ac964612d8b0..8ebed8a13874 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -464,3 +464,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 037feba03a51..64a514f90131 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -39,7 +39,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 451 +#define __NR_compat_syscalls 452 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index 604a2053d006..d952a28463e0 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -907,6 +907,8 @@ __SYSCALL(__NR_process_mrelease, sys_process_mrelease) __SYSCALL(__NR_futex_waitv, sys_futex_waitv) #define __NR_set_mempolicy_home_node 450 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node) +#define __NR_cachestat 451 +__SYSCALL(__NR_cachestat, sys_cachestat) /* * Please add new compat syscalls above this comment and update diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl index 72c929d9902b..f8c74ffeeefb 100644 --- a/arch/ia64/kernel/syscalls/syscall.tbl +++ b/arch/ia64/kernel/syscalls/syscall.tbl @@ -371,3 +371,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index b1f3940bc298..4f504783371f 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -450,3 +450,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl index 820145e47350..858d22bf275c 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -456,3 +456,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl index 253ff994ed2e..1976317d4e8b 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -389,3 +389,4 @@ 448 n32 process_mrelease sys_process_mrelease 449 n32 futex_waitv sys_futex_waitv 450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node +451 n32 cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl index 3f1886ad9d80..cfda2511badf 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -365,3 +365,4 @@ 448 n64 process_mrelease sys_process_mrelease 449 n64 futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 n64 cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index 8f243e35a7b2..7692234c3768 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -438,3 +438,4 @@ 448 o32 process_mrelease sys_process_mrelease 449 o32 futex_waitv sys_futex_waitv 450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node +451 o32 cachestat sys_cachestat diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl index 0e42fceb2d5e..3c71fad78318 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -448,3 +448,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index a0be127475b1..8c0b08b7a80e 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -537,3 +537,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl index b68f47541169..a6935af2235c 100644 --- a/arch/s390/kernel/syscalls/syscall.tbl +++ b/arch/s390/kernel/syscalls/syscall.tbl @@ -453,3 +453,4 @@ 448 common process_mrelease sys_process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat sys_cachestat diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl index 2de85c977f54..97377e8c5025 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -453,3 +453,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl index 4398cc6fb68d..faa835f3c54a 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -496,3 +496,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl index 52c94ab5c205..2b69c3c035b6 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -421,3 +421,4 @@ 448 common process_mrelease sys_process_mrelease 449 common futex_waitv sys_futex_waitv 450 common set_mempolicy_home_node sys_set_mempolicy_home_node +451 common cachestat sys_cachestat -- cgit From bb6e04a173f06e51819a4bb512e127dfbc50dcfa Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 9 May 2023 16:57:21 +0200 Subject: kasan: use internal prototypes matching gcc-13 builtins gcc-13 warns about function definitions for builtin interfaces that have a different prototype, e.g.: In file included from kasan_test.c:31: kasan.h:574:6: error: conflicting types for built-in function '__asan_register_globals'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 574 | void __asan_register_globals(struct kasan_global *globals, size_t size); kasan.h:577:6: error: conflicting types for built-in function '__asan_alloca_poison'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 577 | void __asan_alloca_poison(unsigned long addr, size_t size); kasan.h:580:6: error: conflicting types for built-in function '__asan_load1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 580 | void __asan_load1(unsigned long addr); kasan.h:581:6: error: conflicting types for built-in function '__asan_store1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 581 | void __asan_store1(unsigned long addr); kasan.h:643:6: error: conflicting types for built-in function '__hwasan_tag_memory'; expected 'void(void *, unsigned char, long int)' [-Werror=builtin-declaration-mismatch] 643 | void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size); The two problems are: - Addresses are passes as 'unsigned long' in the kernel, but gcc-13 expects a 'void *'. - sizes meant to use a signed ssize_t rather than size_t. Change all the prototypes to match these. Using 'void *' consistently for addresses gets rid of a couple of type casts, so push that down to the leaf functions where possible. This now passes all randconfig builds on arm, arm64 and x86, but I have not tested it on the other architectures that support kasan, since they tend to fail randconfig builds in other ways. This might fail if any of the 32-bit architectures expect a 'long' instead of 'int' for the size argument. The __asan_allocas_unpoison() function prototype is somewhat weird, since it uses a pointer for 'stack_top' and an size_t for 'stack_bottom'. This looks like it is meant to be 'addr' and 'size' like the others, but the implementation clearly treats them as 'top' and 'bottom'. Link: https://lkml.kernel.org/r/20230509145735.9263-2-arnd@kernel.org Signed-off-by: Arnd Bergmann Cc: Alexander Potapenko Cc: Andrey Konovalov Cc: Andrey Ryabinin Cc: Dmitry Vyukov Cc: Marco Elver Cc: Vincenzo Frascino Cc: Signed-off-by: Andrew Morton --- arch/arm64/kernel/traps.c | 2 +- arch/arm64/mm/fault.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index 4bb1b8f47298..7b889445e5c6 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -1044,7 +1044,7 @@ static int kasan_handler(struct pt_regs *regs, unsigned long esr) bool recover = esr & KASAN_ESR_RECOVER; bool write = esr & KASAN_ESR_WRITE; size_t size = KASAN_ESR_SIZE(esr); - u64 addr = regs->regs[0]; + void *addr = (void *)regs->regs[0]; u64 pc = regs->pc; kasan_report(addr, size, write, pc); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index cb21ccd7940d..d5047eef4295 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -317,7 +317,7 @@ static void report_tag_fault(unsigned long addr, unsigned long esr, * find out access size. */ bool is_write = !!(esr & ESR_ELx_WNR); - kasan_report(addr, 0, is_write, regs->pc); + kasan_report((void *)addr, 0, is_write, regs->pc); } #else /* Tag faults aren't enabled without CONFIG_KASAN_HW_TAGS. */ -- cgit From 54d020692b342f7bd02d7f5795fb5c401caecfcc Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 17 May 2023 20:25:33 +0100 Subject: mm/gup: remove unused vmas parameter from get_user_pages() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Patch series "remove the vmas parameter from GUP APIs", v6. (pin_/get)_user_pages[_remote]() each provide an optional output parameter for an array of VMA objects associated with each page in the input range. These provide the means for VMAs to be returned, as long as mm->mmap_lock is never released during the GUP operation (i.e. the internal flag FOLL_UNLOCKABLE is not specified). In addition, these VMAs can only be accessed with the mmap_lock held and become invalidated the moment it is released. The vast majority of invocations do not use this functionality and of those that do, all but one case retrieve a single VMA to perform checks upon. It is not egregious in the single VMA cases to simply replace the operation with a vma_lookup(). In these cases we duplicate the (fast) lookup on a slow path already under the mmap_lock, abstracted to a new get_user_page_vma_remote() inline helper function which also performs error checking and reference count maintenance. The special case is io_uring, where io_pin_pages() specifically needs to assert that the VMAs underlying the range do not result in broken long-term GUP file-backed mappings. As GUP now internally asserts that FOLL_LONGTERM mappings are not file-backed in a broken fashion (i.e. requiring dirty tracking) - as implemented in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings" - this logic is no longer required and so we can simply remove it altogether from io_uring. Eliminating the vmas parameter eliminates an entire class of danging pointer errors that might have occured should the lock have been incorrectly released. In addition, the API is simplified and now clearly expresses what it is intended for - applying the specified GUP flags and (if pinning) returning pinned pages. This change additionally opens the door to further potential improvements in GUP and the possible marrying of disparate code paths. I have run this series against gup_test with no issues. Thanks to Matthew Wilcox for suggesting this refactoring! This patch (of 6): No invocation of get_user_pages() use the vmas parameter, so remove it. The GUP API is confusing and caveated. Recent changes have done much to improve that, however there is more we can do. Exporting vmas is a prime target as the caller has to be extremely careful to preclude their use after the mmap_lock has expired or otherwise be left with dangling pointers. Removing the vmas parameter focuses the GUP functions upon their primary purpose - pinning (and outputting) pages as well as performing the actions implied by the input flags. This is part of a patch series aiming to remove the vmas parameter altogether. Link: https://lkml.kernel.org/r/cover.1684350871.git.lstoakes@gmail.com Link: https://lkml.kernel.org/r/589e0c64794668ffc799651e8d85e703262b1e9d.1684350871.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Suggested-by: Matthew Wilcox (Oracle) Acked-by: Greg Kroah-Hartman Acked-by: David Hildenbrand Reviewed-by: Jason Gunthorpe Acked-by: Christian König (for radeon parts) Acked-by: Jarkko Sakkinen Reviewed-by: Christoph Hellwig Acked-by: Sean Christopherson (KVM) Cc: Catalin Marinas Cc: Dennis Dalessandro Cc: Janosch Frank Cc: Jens Axboe Cc: Sakari Ailus Signed-off-by: Andrew Morton --- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 21ca0a831b70..5d390df21440 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -214,7 +214,7 @@ static int __sgx_encl_add_page(struct sgx_encl *encl, if (!(vma->vm_flags & VM_MAYEXEC)) return -EACCES; - ret = get_user_pages(src, 1, 0, &src_page, NULL); + ret = get_user_pages(src, 1, 0, &src_page); if (ret < 1) return -EFAULT; -- cgit From ca5e863233e8f6acd1792fd85d6bc2729a1b2c10 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 17 May 2023 20:25:39 +0100 Subject: mm/gup: remove vmas parameter from get_user_pages_remote() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The only instances of get_user_pages_remote() invocations which used the vmas parameter were for a single page which can instead simply look up the VMA directly. In particular:- - __update_ref_ctr() looked up the VMA but did nothing with it so we simply remove it. - __access_remote_vm() was already using vma_lookup() when the original lookup failed so by doing the lookup directly this also de-duplicates the code. We are able to perform these VMA operations as we already hold the mmap_lock in order to be able to call get_user_pages_remote(). As part of this work we add get_user_page_vma_remote() which abstracts the VMA lookup, error handling and decrementing the page reference count should the VMA lookup fail. This forms part of a broader set of patches intended to eliminate the vmas parameter altogether. [akpm@linux-foundation.org: avoid passing NULL to PTR_ERR] Link: https://lkml.kernel.org/r/d20128c849ecdbf4dd01cc828fcec32127ed939a.1684350871.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Reviewed-by: Catalin Marinas (for arm64) Acked-by: David Hildenbrand Reviewed-by: Janosch Frank (for s390) Reviewed-by: Christoph Hellwig Cc: Christian König Cc: Dennis Dalessandro Cc: Greg Kroah-Hartman Cc: Jarkko Sakkinen Cc: Jason Gunthorpe Cc: Jens Axboe Cc: Matthew Wilcox (Oracle) Cc: Sakari Ailus Cc: Sean Christopherson Signed-off-by: Andrew Morton --- arch/arm64/kernel/mte.c | 17 +++++++++-------- arch/s390/kvm/interrupt.c | 2 +- 2 files changed, 10 insertions(+), 9 deletions(-) (limited to 'arch') diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 7e89968bd282..4c5ef9b20065 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -416,10 +416,9 @@ long get_mte_ctrl(struct task_struct *task) static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, struct iovec *kiov, unsigned int gup_flags) { - struct vm_area_struct *vma; void __user *buf = kiov->iov_base; size_t len = kiov->iov_len; - int ret; + int err = 0; int write = gup_flags & FOLL_WRITE; if (!access_ok(buf, len)) @@ -429,14 +428,16 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, return -EIO; while (len) { + struct vm_area_struct *vma; unsigned long tags, offset; void *maddr; - struct page *page = NULL; + struct page *page = get_user_page_vma_remote(mm, addr, + gup_flags, &vma); - ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page, - &vma, NULL); - if (ret <= 0) + if (IS_ERR_OR_NULL(page)) { + err = page == NULL ? -EIO : PTR_ERR(page); break; + } /* * Only copy tags if the page has been mapped as PROT_MTE @@ -446,7 +447,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, * was never mapped with PROT_MTE. */ if (!(vma->vm_flags & VM_MTE)) { - ret = -EOPNOTSUPP; + err = -EOPNOTSUPP; put_page(page); break; } @@ -479,7 +480,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, kiov->iov_len = buf - kiov->iov_base; if (!kiov->iov_len) { /* check for error accessing the tracee's address space */ - if (ret <= 0) + if (err) return -EIO; else return -EFAULT; diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index da6dac36e959..9bd0a873f3b1 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -2777,7 +2777,7 @@ static struct page *get_map_page(struct kvm *kvm, u64 uaddr) mmap_read_lock(kvm->mm); get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE, - &page, NULL, NULL); + &page, NULL); mmap_read_unlock(kvm->mm); return page; } -- cgit From 4c630f307455c06f99bdeca7f7a1ab5318604fe0 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Wed, 17 May 2023 20:25:45 +0100 Subject: mm/gup: remove vmas parameter from pin_user_pages() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We are now in a position where no caller of pin_user_pages() requires the vmas parameter at all, so eliminate this parameter from the function and all callers. This clears the way to removing the vmas parameter from GUP altogether. Link: https://lkml.kernel.org/r/195a99ae949c9f5cb589d2222b736ced96ec199a.1684350871.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes Acked-by: David Hildenbrand Acked-by: Dennis Dalessandro [qib] Reviewed-by: Christoph Hellwig Acked-by: Sakari Ailus [drivers/media] Cc: Catalin Marinas Cc: Christian König Cc: Greg Kroah-Hartman Cc: Janosch Frank Cc: Jarkko Sakkinen Cc: Jason Gunthorpe Cc: Jens Axboe Cc: Matthew Wilcox (Oracle) Cc: Sean Christopherson Signed-off-by: Andrew Morton --- arch/powerpc/mm/book3s64/iommu_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c index 81d7185e2ae8..d19fb1f3007d 100644 --- a/arch/powerpc/mm/book3s64/iommu_api.c +++ b/arch/powerpc/mm/book3s64/iommu_api.c @@ -105,7 +105,7 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua, ret = pin_user_pages(ua + (entry << PAGE_SHIFT), n, FOLL_WRITE | FOLL_LONGTERM, - mem->hpages + entry, NULL); + mem->hpages + entry); if (ret == n) { pinned += n; continue; -- cgit From 766b59e876505852e1571142bd57dfe20fca383e Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:10:57 -0700 Subject: arm: allow pte_offset_map[_lock]() to fail Patch series "arch: allow pte_offset_map[_lock]() to fail", v2. What is it all about? Some mmap_lock avoidance i.e. latency reduction. Initially just for the case of collapsing shmem or file pages to THPs; but likely to be relied upon later in other contexts e.g. freeing of empty page tables (but that's not work I'm doing). mmap_write_lock avoidance when collapsing to anon THPs? Perhaps, but again that's not work I've done: a quick attempt was not as easy as the shmem/file case. I would much prefer not to have to make these small but wide-ranging changes for such a niche case; but failed to find another way, and have heard that shmem MADV_COLLAPSE's usefulness is being limited by that mmap_write_lock it currently requires. These changes (though of course not these exact patches, and not all of these architectures!) have been in Google's data centre kernel for three years now: we do rely upon them. What are the per-arch changes about? Generally, two things. One: the current mmap locking may not be enough to guard against that tricky transition between pmd entry pointing to page table, and empty pmd entry, and pmd entry pointing to huge page: pte_offset_map() will have to validate the pmd entry for itself, returning NULL if no page table is there. What to do about that varies: often the nearby error handling indicates just to skip it; but in some cases a "goto again" looks appropriate (and if that risks an infinite loop, then there must have been an oops, or pfn 0 mistaken for page table, before). Deeper study of each site might show that 90% of them here in arch code could only fail if there's corruption e.g. a transition to THP would be surprising on an arch without HAVE_ARCH_TRANSPARENT_HUGEPAGE. But given the likely extension to freeing empty page tables, I have not limited this set of changes to THP; and it has been easier, and sets a better example, if each site is given appropriate handling. Two: pte_offset_map() will need to do an rcu_read_lock(), with the corresponding rcu_read_unlock() in pte_unmap(). But most architectures never supported CONFIG_HIGHPTE, so some don't always call pte_unmap() after pte_offset_map(), or have used userspace pte_offset_map() where pte_offset_kernel() is more correct. No problem in the current tree, but a problem once an rcu_read_unlock() will be needed to keep balance. A common special case of that comes in arch/*/mm/hugetlbpage.c, if the architecture supports hugetlb pages down at the lowest PTE level. huge_pte_alloc() uses pte_alloc_map(), but generic hugetlb code does no corresponding pte_unmap(); similarly for huge_pte_offset(). In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/a4963be9-7aa6-350-66d0-2ba843e1af44@google.com Link: https://lkml.kernel.org/r/813429a1-204a-1844-eeae-7fd72826c28@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Will Deacon Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Thomas Gleixner Signed-off-by: Andrew Morton --- arch/arm/lib/uaccess_with_memcpy.c | 3 +++ arch/arm/mm/fault-armv.c | 5 ++++- arch/arm/mm/fault.c | 3 +++ 3 files changed, 10 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c index e4c2677cc1e9..2f6163f05e93 100644 --- a/arch/arm/lib/uaccess_with_memcpy.c +++ b/arch/arm/lib/uaccess_with_memcpy.c @@ -74,6 +74,9 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp) return 0; pte = pte_offset_map_lock(current->mm, pmd, addr, &ptl); + if (unlikely(!pte)) + return 0; + if (unlikely(!pte_present(*pte) || !pte_young(*pte) || !pte_write(*pte) || !pte_dirty(*pte))) { pte_unmap_unlock(pte, ptl); diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index 0e49154454a6..ca5302b0b7ee 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -117,8 +117,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address, * must use the nested version. This also means we need to * open-code the spin-locking. */ - ptl = pte_lockptr(vma->vm_mm, pmd); pte = pte_offset_map(pmd, address); + if (!pte) + return 0; + + ptl = pte_lockptr(vma->vm_mm, pmd); do_pte_lock(ptl); ret = do_adjust_pte(vma, address, pfn, pte); diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 2418f1efabd8..83598649a094 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -85,6 +85,9 @@ void show_pte(const char *lvl, struct mm_struct *mm, unsigned long addr) break; pte = pte_offset_map(pmd, addr); + if (!pte) + break; + pr_cont(", *pte=%08llx", (long long)pte_val(*pte)); #ifndef CONFIG_ARM_LPAE pr_cont(", *ppte=%08llx", -- cgit From 52924726f4c06b3f77de75fb665e6efd8e777d07 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:11:59 -0700 Subject: arm64: allow pte_offset_map() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/35e46485-8499-4337-c51f-b8fa495a1a93@google.com Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/arm64/mm/fault.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'arch') diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index d5047eef4295..d81430119004 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -177,6 +177,9 @@ static void show_pte(unsigned long addr) break; ptep = pte_offset_map(pmdp, addr); + if (!ptep) + break; + pte = READ_ONCE(*ptep); pr_cont(", pte=%016llx", pte_val(pte)); pte_unmap(ptep); -- cgit From cafcb9ca5a56f393019200cd072cbb57d0ed7dd4 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:13:13 -0700 Subject: arm64/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/5849464-7191-40c5-c55f-fba9c3802e5d@google.com Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/arm64/mm/hugetlbpage.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) (limited to 'arch') diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 95364e8bdc19..21716c940682 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -307,14 +307,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, return NULL; WARN_ON(addr & (sz - 1)); - /* - * Note that if this code were ever ported to the - * 32-bit arm platform then it will cause trouble in - * the case where CONFIG_HIGHPTE is set, since there - * will be no pte_unmap() to correspond with this - * pte_alloc_map(). - */ - ptep = pte_alloc_map(mm, pmdp, addr); + ptep = pte_alloc_huge(mm, pmdp, addr); } else if (sz == PMD_SIZE) { if (want_pmd_share(vma, addr) && pud_none(READ_ONCE(*pudp))) ptep = huge_pmd_share(mm, vma, addr, pudp); @@ -366,7 +359,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return (pte_t *)pmdp; if (sz == CONT_PTE_SIZE) - return pte_offset_kernel(pmdp, (addr & CONT_PTE_MASK)); + return pte_offset_huge(pmdp, (addr & CONT_PTE_MASK)); return NULL; } -- cgit From 0db639f768e6b6296bbc64689dd8823b7bdb765a Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:14:26 -0700 Subject: ia64/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/1c2c7837-bfea-9640-a74-985379fcc5a@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/ia64/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c index 78a02e026164..adc49f2d22e8 100644 --- a/arch/ia64/mm/hugetlbpage.c +++ b/arch/ia64/mm/hugetlbpage.c @@ -41,7 +41,7 @@ huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, if (pud) { pmd = pmd_alloc(mm, pud, taddr); if (pmd) - pte = pte_alloc_map(mm, pmd, taddr); + pte = pte_alloc_huge(mm, pmd, taddr); } return pte; } @@ -64,7 +64,7 @@ huge_pte_offset (struct mm_struct *mm, unsigned long addr, unsigned long sz) if (pud_present(*pud)) { pmd = pmd_offset(pud, taddr); if (pmd_present(*pmd)) - pte = pte_offset_map(pmd, taddr); + pte = pte_offset_huge(pmd, taddr); } } } -- cgit From e67b37c368b7cc24b8c0fe5ab6c44422312eab37 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:15:16 -0700 Subject: m68k: allow pte_offset_map[_lock]() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Restructure cf_tlb_miss() with a pte_unmap() (previously omitted) at label out, followed by one local_irq_restore() for all. Link: https://lkml.kernel.org/r/795f6a7-bcca-cdf-ad2a-fbdaa232998c@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/m68k/include/asm/mmu_context.h | 6 +++-- arch/m68k/kernel/sys_m68k.c | 2 ++ arch/m68k/mm/mcfmmu.c | 52 +++++++++++++++---------------------- 3 files changed, 27 insertions(+), 33 deletions(-) (limited to 'arch') diff --git a/arch/m68k/include/asm/mmu_context.h b/arch/m68k/include/asm/mmu_context.h index 8ed6ac14d99f..141bbdfad960 100644 --- a/arch/m68k/include/asm/mmu_context.h +++ b/arch/m68k/include/asm/mmu_context.h @@ -99,7 +99,7 @@ static inline void load_ksp_mmu(struct task_struct *task) p4d_t *p4d; pud_t *pud; pmd_t *pmd; - pte_t *pte; + pte_t *pte = NULL; unsigned long mmuar; local_irq_save(flags); @@ -139,7 +139,7 @@ static inline void load_ksp_mmu(struct task_struct *task) pte = (mmuar >= PAGE_OFFSET) ? pte_offset_kernel(pmd, mmuar) : pte_offset_map(pmd, mmuar); - if (pte_none(*pte) || !pte_present(*pte)) + if (!pte || pte_none(*pte) || !pte_present(*pte)) goto bug; set_pte(pte, pte_mkyoung(*pte)); @@ -161,6 +161,8 @@ static inline void load_ksp_mmu(struct task_struct *task) bug: pr_info("ksp load failed: mm=0x%p ksp=0x08%lx\n", mm, mmuar); end: + if (pte && mmuar < PAGE_OFFSET) + pte_unmap(pte); local_irq_restore(flags); } diff --git a/arch/m68k/kernel/sys_m68k.c b/arch/m68k/kernel/sys_m68k.c index bd0274c7592e..c586034d2a7a 100644 --- a/arch/m68k/kernel/sys_m68k.c +++ b/arch/m68k/kernel/sys_m68k.c @@ -488,6 +488,8 @@ sys_atomic_cmpxchg_32(unsigned long newval, int oldval, int d3, int d4, int d5, if (!pmd_present(*pmd)) goto bad_access; pte = pte_offset_map_lock(mm, pmd, (unsigned long)mem, &ptl); + if (!pte) + goto bad_access; if (!pte_present(*pte) || !pte_dirty(*pte) || !pte_write(*pte)) { pte_unmap_unlock(pte, ptl); diff --git a/arch/m68k/mm/mcfmmu.c b/arch/m68k/mm/mcfmmu.c index 70aa0979e027..42f45abea37a 100644 --- a/arch/m68k/mm/mcfmmu.c +++ b/arch/m68k/mm/mcfmmu.c @@ -91,7 +91,8 @@ int cf_tlb_miss(struct pt_regs *regs, int write, int dtlb, int extension_word) p4d_t *p4d; pud_t *pud; pmd_t *pmd; - pte_t *pte; + pte_t *pte = NULL; + int ret = -1; int asid; local_irq_save(flags); @@ -100,47 +101,33 @@ int cf_tlb_miss(struct pt_regs *regs, int write, int dtlb, int extension_word) regs->pc + (extension_word * sizeof(long)); mm = (!user_mode(regs) && KMAPAREA(mmuar)) ? &init_mm : current->mm; - if (!mm) { - local_irq_restore(flags); - return -1; - } + if (!mm) + goto out; pgd = pgd_offset(mm, mmuar); - if (pgd_none(*pgd)) { - local_irq_restore(flags); - return -1; - } + if (pgd_none(*pgd)) + goto out; p4d = p4d_offset(pgd, mmuar); - if (p4d_none(*p4d)) { - local_irq_restore(flags); - return -1; - } + if (p4d_none(*p4d)) + goto out; pud = pud_offset(p4d, mmuar); - if (pud_none(*pud)) { - local_irq_restore(flags); - return -1; - } + if (pud_none(*pud)) + goto out; pmd = pmd_offset(pud, mmuar); - if (pmd_none(*pmd)) { - local_irq_restore(flags); - return -1; - } + if (pmd_none(*pmd)) + goto out; pte = (KMAPAREA(mmuar)) ? pte_offset_kernel(pmd, mmuar) : pte_offset_map(pmd, mmuar); - if (pte_none(*pte) || !pte_present(*pte)) { - local_irq_restore(flags); - return -1; - } + if (!pte || pte_none(*pte) || !pte_present(*pte)) + goto out; if (write) { - if (!pte_write(*pte)) { - local_irq_restore(flags); - return -1; - } + if (!pte_write(*pte)) + goto out; set_pte(pte, pte_mkdirty(*pte)); } @@ -161,9 +148,12 @@ int cf_tlb_miss(struct pt_regs *regs, int write, int dtlb, int extension_word) mmu_write(MMUOR, MMUOR_ACC | MMUOR_UAA); else mmu_write(MMUOR, MMUOR_ITLB | MMUOR_ACC | MMUOR_UAA); - + ret = 0; +out: + if (pte && !KMAPAREA(mmuar)) + pte_unmap(pte); local_irq_restore(flags); - return 0; + return ret; } void __init cf_bootmem_alloc(void) -- cgit From 505a23a5f8934175d2b66fb5ea03a8c519a28e44 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:16:14 -0700 Subject: microblaze: allow pte_offset_map() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/eab66faf-c0ab-3a8f-47bf-8a7c5af638f@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/microblaze/kernel/signal.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c index c3aebec71c0c..c78a0ff48066 100644 --- a/arch/microblaze/kernel/signal.c +++ b/arch/microblaze/kernel/signal.c @@ -194,7 +194,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, preempt_disable(); ptep = pte_offset_map(pmdp, address); - if (pte_present(*ptep)) { + if (ptep && pte_present(*ptep)) { address = (unsigned long) page_address(pte_page(*ptep)); /* MS: I need add offset in page */ address += ((unsigned long)frame->tramp) & ~PAGE_MASK; @@ -203,7 +203,8 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, invalidate_icache_range(address, address + 8); flush_dcache_range(address, address + 8); } - pte_unmap(ptep); + if (ptep) + pte_unmap(ptep); preempt_enable(); if (err) return -EFAULT; -- cgit From 17b25a3801d11c28374a37cc672823cedb0f10d3 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 15 Jun 2023 16:02:43 -0700 Subject: mips: add pte_unmap() to balance pte_offset_map() To keep balance in future, __update_tlb() remember to pte_unmap() after pte_offset_map(). This is an odd case, since the caller has already done pte_offset_map_lock(), then mips forgets the address and recalculates it; but my two naive attempts to clean that up did more harm than good. Link: https://lkml.kernel.org/r/addfcb3-b5f4-976e-e050-a2508e589cfe@google.com Signed-off-by: Hugh Dickins Tested-by: Nathan Chancellor Tested-by: Yu Zhao Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/mips/mm/tlb-r4k.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c index 1b939abbe4ca..93c2d695588a 100644 --- a/arch/mips/mm/tlb-r4k.c +++ b/arch/mips/mm/tlb-r4k.c @@ -297,7 +297,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte) p4d_t *p4dp; pud_t *pudp; pmd_t *pmdp; - pte_t *ptep; + pte_t *ptep, *ptemap = NULL; int idx, pid; /* @@ -344,7 +344,12 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte) } else #endif { - ptep = pte_offset_map(pmdp, address); + ptemap = ptep = pte_offset_map(pmdp, address); + /* + * update_mmu_cache() is called between pte_offset_map_lock() + * and pte_unmap_unlock(), so we can assume that ptep is not + * NULL here: and what should be done below if it were NULL? + */ #if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32) #ifdef CONFIG_XPA @@ -373,6 +378,9 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte) tlbw_use_hazard(); htw_start(); flush_micro_tlb_vm(vma); + + if (ptemap) + pte_unmap(ptemap); local_irq_restore(flags); } -- cgit From 6a2561f92e7d74536b6eba3ffdcf66d60d44031c Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:18:35 -0700 Subject: parisc: add pte_unmap() to balance get_ptep() To keep balance in future, remember to pte_unmap() after a successful get_ptep(). And act as if flush_cache_pages() really needs a map there, to read the pfn before "unmapping", to be sure page table is not removed. Link: https://lkml.kernel.org/r/653369-95ef-acd2-d6ea-e95f5a997493@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/parisc/kernel/cache.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) (limited to 'arch') diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c index ca4a302d4365..501160250bb7 100644 --- a/arch/parisc/kernel/cache.c +++ b/arch/parisc/kernel/cache.c @@ -426,10 +426,15 @@ void flush_dcache_page(struct page *page) offset = (pgoff - mpnt->vm_pgoff) << PAGE_SHIFT; addr = mpnt->vm_start + offset; if (parisc_requires_coherency()) { + bool needs_flush = false; pte_t *ptep; ptep = get_ptep(mpnt->vm_mm, addr); - if (ptep && pte_needs_flush(*ptep)) + if (ptep) { + needs_flush = pte_needs_flush(*ptep); + pte_unmap(ptep); + } + if (needs_flush) flush_user_cache_page(mpnt, addr); } else { /* @@ -561,14 +566,20 @@ EXPORT_SYMBOL(flush_kernel_dcache_page_addr); static void flush_cache_page_if_present(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn) { - pte_t *ptep = get_ptep(vma->vm_mm, vmaddr); + bool needs_flush = false; + pte_t *ptep; /* * The pte check is racy and sometimes the flush will trigger * a non-access TLB miss. Hopefully, the page has already been * flushed. */ - if (ptep && pte_needs_flush(*ptep)) + ptep = get_ptep(vma->vm_mm, vmaddr); + if (ptep) { + needs_flush = pte_needs_flush(*ptep); + pte_unmap(ptep); + } + if (needs_flush) flush_cache_page(vma, vmaddr, pfn); } @@ -635,17 +646,22 @@ static void flush_cache_pages(struct vm_area_struct *vma, unsigned long start, u pte_t *ptep; for (addr = start; addr < end; addr += PAGE_SIZE) { + bool needs_flush = false; /* * The vma can contain pages that aren't present. Although * the pte search is expensive, we need the pte to find the * page pfn and to check whether the page should be flushed. */ ptep = get_ptep(vma->vm_mm, addr); - if (ptep && pte_needs_flush(*ptep)) { + if (ptep) { + needs_flush = pte_needs_flush(*ptep); + pfn = pte_pfn(*ptep); + pte_unmap(ptep); + } + if (needs_flush) { if (parisc_requires_coherency()) { flush_user_cache_page(vma, addr); } else { - pfn = pte_pfn(*ptep); if (WARN_ON(!pfn_valid(pfn))) return; __flush_cache_page(vma, addr, PFN_PHYS(pfn)); -- cgit From ffd3e90a8fda600e37f60e6b648e9252927c5e8b Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:20:00 -0700 Subject: parisc: unmap_uncached_pte() use pte_offset_kernel() unmap_uncached_pte() is working from pgd_offset_k(vaddr), so it should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Link: https://lkml.kernel.org/r/358dfe21-a47f-9d3-bf21-9c454735944@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/parisc/kernel/pci-dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c index 71ed5391f29d..415f12d5bab3 100644 --- a/arch/parisc/kernel/pci-dma.c +++ b/arch/parisc/kernel/pci-dma.c @@ -164,7 +164,7 @@ static inline void unmap_uncached_pte(pmd_t * pmd, unsigned long vaddr, pmd_clear(pmd); return; } - pte = pte_offset_map(pmd, vaddr); + pte = pte_offset_kernel(pmd, vaddr); vaddr &= ~PMD_MASK; end = vaddr + size; if (end > PMD_SIZE) -- cgit From def1cd433f8a83d7fe0bef8fa3bafdb3756ba1b2 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:21:07 -0700 Subject: parisc/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/7963aeed-f7d2-e0-f3c6-3680c5572444@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/parisc/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/parisc/mm/hugetlbpage.c b/arch/parisc/mm/hugetlbpage.c index d1d3990b83f6..a8a1a7c1e16e 100644 --- a/arch/parisc/mm/hugetlbpage.c +++ b/arch/parisc/mm/hugetlbpage.c @@ -66,7 +66,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, if (pud) { pmd = pmd_alloc(mm, pud, addr); if (pmd) - pte = pte_alloc_map(mm, pmd, addr); + pte = pte_alloc_huge(mm, pmd, addr); } return pte; } @@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, if (!pud_none(*pud)) { pmd = pmd_offset(pud, addr); if (!pmd_none(*pmd)) - pte = pte_offset_map(pmd, addr); + pte = pte_offset_huge(pmd, addr); } } } -- cgit From d00ae31fa2fc7ce5ed8d60b9fb10adb5c99a792b Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:22:39 -0700 Subject: powerpc: kvmppc_unmap_free_pmd() pte_offset_kernel() kvmppc_unmap_free_pmd() use pte_offset_kernel(), like everywhere else in book3s_64_mmu_radix.c: instead of pte_offset_map(), which will come to need a pte_unmap() to balance it. But note that this is a more complex case than most: see those -EAGAINs in kvmppc_create_pte(), which is coping with kvmppc races beween page table and huge entry, of the kind which we are expecting to address in pte_offset_map() - this might want to be revisited in future. Link: https://lkml.kernel.org/r/c76aa421-aec3-4cc8-cc61-4130f2e27e1@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 461307b89c3a..572707858d65 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -509,7 +509,7 @@ static void kvmppc_unmap_free_pmd(struct kvm *kvm, pmd_t *pmd, bool full, } else { pte_t *pte; - pte = pte_offset_map(p, 0); + pte = pte_offset_kernel(p, 0); kvmppc_unmap_free_pte(kvm, pte, full, lpid); pmd_clear(p); } -- cgit From 0c31f29b0cbc11e5bef73681e7e9cbf03ce1acbe Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:23:35 -0700 Subject: powerpc: allow pte_offset_map[_lock]() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Balance successful pte_offset_map() with pte_unmap() where omitted. Link: https://lkml.kernel.org/r/54c8b578-ca9-a0f-bfd2-d72976f8d73a@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/powerpc/mm/book3s64/hash_tlb.c | 4 ++++ arch/powerpc/mm/book3s64/subpage_prot.c | 2 ++ arch/powerpc/xmon/xmon.c | 5 ++++- 3 files changed, 10 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/powerpc/mm/book3s64/hash_tlb.c b/arch/powerpc/mm/book3s64/hash_tlb.c index a64ea0a7ef96..21fcad97ae80 100644 --- a/arch/powerpc/mm/book3s64/hash_tlb.c +++ b/arch/powerpc/mm/book3s64/hash_tlb.c @@ -239,12 +239,16 @@ void flush_hash_table_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long local_irq_save(flags); arch_enter_lazy_mmu_mode(); start_pte = pte_offset_map(pmd, addr); + if (!start_pte) + goto out; for (pte = start_pte; pte < start_pte + PTRS_PER_PTE; pte++) { unsigned long pteval = pte_val(*pte); if (pteval & H_PAGE_HASHPTE) hpte_need_flush(mm, addr, pte, pteval, 0); addr += PAGE_SIZE; } + pte_unmap(start_pte); +out: arch_leave_lazy_mmu_mode(); local_irq_restore(flags); } diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c index b75a9fb99599..0dc85556dec5 100644 --- a/arch/powerpc/mm/book3s64/subpage_prot.c +++ b/arch/powerpc/mm/book3s64/subpage_prot.c @@ -71,6 +71,8 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr, if (pmd_none(*pmd)) return; pte = pte_offset_map_lock(mm, pmd, addr, &ptl); + if (!pte) + return; arch_enter_lazy_mmu_mode(); for (; npages > 0; --npages) { pte_update(mm, addr, pte, 0, 0, 0); diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index 728d3c257e4a..69447bdf0bcf 100644 --- a/arch/powerpc/xmon/xmon.c +++ b/arch/powerpc/xmon/xmon.c @@ -3376,12 +3376,15 @@ static void show_pte(unsigned long addr) printf("pmdp @ 0x%px = 0x%016lx\n", pmdp, pmd_val(*pmdp)); ptep = pte_offset_map(pmdp, addr); - if (pte_none(*ptep)) { + if (!ptep || pte_none(*ptep)) { + if (ptep) + pte_unmap(ptep); printf("no valid PTE\n"); return; } format_pte(ptep, pte_val(*ptep)); + pte_unmap(ptep); sync(); __delay(200); -- cgit From 5d991378d1e5b5d4c5b8c0f72e426a94ff340a88 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:24:32 -0700 Subject: powerpc/hugetlb: pte_alloc_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead. huge_pte_offset() is using __find_linux_pte(), which is using pte_offset_kernel() - don't rename that to _huge, it's more complicated. Link: https://lkml.kernel.org/r/36b4e5d-954b-8569-4fe2-bd1797362441@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/powerpc/mm/hugetlbpage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index b900933507da..f7c683b672c1 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -183,7 +183,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, return NULL; if (IS_ENABLED(CONFIG_PPC_8xx) && pshift < PMD_SHIFT) - return pte_alloc_map(mm, (pmd_t *)hpdp, addr); + return pte_alloc_huge(mm, (pmd_t *)hpdp, addr); BUG_ON(!hugepd_none(*hpdp) && !hugepd_ok(*hpdp)); -- cgit From 893f667f744014ebb10d0609c17bee729b0eac57 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:25:42 -0700 Subject: riscv/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/291f20-5947-9f5f-ec7f-96a18df336d9@google.com Signed-off-by: Hugh Dickins Reviewed-by: Alexandre Ghiti Acked-by: Palmer Dabbelt Cc: Alexander Gordeev Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/riscv/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c index a163a3e0f0d4..80926946759f 100644 --- a/arch/riscv/mm/hugetlbpage.c +++ b/arch/riscv/mm/hugetlbpage.c @@ -43,7 +43,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_alloc_map(mm, pmd, addr & napot_cont_mask(order)); + pte = pte_alloc_huge(mm, pmd, addr & napot_cont_mask(order)); break; } } @@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_offset_kernel(pmd, addr & napot_cont_mask(order)); + pte = pte_offset_huge(pmd, addr & napot_cont_mask(order)); break; } } -- cgit From 5c7f3bf04a6cf266567fdea1ae4987875e92619f Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:27:22 -0700 Subject: s390: allow pte_offset_map_lock() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Add comment on mm's contract with s390 above __zap_zero_pages(), and fix old comment there: must be called after THP was disabled. Link: https://lkml.kernel.org/r/3ff29363-336a-9733-12a1-5c31a45c8aeb@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/s390/kernel/uv.c | 2 ++ arch/s390/mm/gmap.c | 9 ++++++++- arch/s390/mm/pgtable.c | 12 +++++++++--- 3 files changed, 19 insertions(+), 4 deletions(-) (limited to 'arch') diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c index cb2ee06df286..3c62d1b218b1 100644 --- a/arch/s390/kernel/uv.c +++ b/arch/s390/kernel/uv.c @@ -294,6 +294,8 @@ again: rc = -ENXIO; ptep = get_locked_pte(gmap->mm, uaddr, &ptelock); + if (!ptep) + goto out; if (pte_present(*ptep) && !(pte_val(*ptep) & _PAGE_INVALID) && pte_write(*ptep)) { page = pte_page(*ptep); rc = -EAGAIN; diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index dc90d1eb0d55..3a2a31a15ea8 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2537,7 +2537,12 @@ static inline void thp_split_mm(struct mm_struct *mm) * Remove all empty zero pages from the mapping for lazy refaulting * - This must be called after mm->context.has_pgste is set, to avoid * future creation of zero pages - * - This must be called after THP was enabled + * - This must be called after THP was disabled. + * + * mm contracts with s390, that even if mm were to remove a page table, + * racing with the loop below and so causing pte_offset_map_lock() to fail, + * it will never insert a page table containing empty zero pages once + * mm_forbids_zeropage(mm) i.e. mm->context.has_pgste is set. */ static int __zap_zero_pages(pmd_t *pmd, unsigned long start, unsigned long end, struct mm_walk *walk) @@ -2549,6 +2554,8 @@ static int __zap_zero_pages(pmd_t *pmd, unsigned long start, spinlock_t *ptl; ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); + if (!ptep) + break; if (is_zero_pfn(pte_pfn(*ptep))) ptep_xchg_direct(walk->mm, addr, ptep, __pte(_PAGE_INVALID)); pte_unmap_unlock(ptep, ptl); diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 6effb24de6d9..3bd2ab2a9a34 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -829,7 +829,7 @@ int set_guest_storage_key(struct mm_struct *mm, unsigned long addr, default: return -EFAULT; } - +again: ptl = pmd_lock(mm, pmdp); if (!pmd_present(*pmdp)) { spin_unlock(ptl); @@ -850,6 +850,8 @@ int set_guest_storage_key(struct mm_struct *mm, unsigned long addr, spin_unlock(ptl); ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); + if (!ptep) + goto again; new = old = pgste_get_lock(ptep); pgste_val(new) &= ~(PGSTE_GR_BIT | PGSTE_GC_BIT | PGSTE_ACC_BITS | PGSTE_FP_BIT); @@ -938,7 +940,7 @@ int reset_guest_reference_bit(struct mm_struct *mm, unsigned long addr) default: return -EFAULT; } - +again: ptl = pmd_lock(mm, pmdp); if (!pmd_present(*pmdp)) { spin_unlock(ptl); @@ -955,6 +957,8 @@ int reset_guest_reference_bit(struct mm_struct *mm, unsigned long addr) spin_unlock(ptl); ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); + if (!ptep) + goto again; new = old = pgste_get_lock(ptep); /* Reset guest reference bit only */ pgste_val(new) &= ~PGSTE_GR_BIT; @@ -1000,7 +1004,7 @@ int get_guest_storage_key(struct mm_struct *mm, unsigned long addr, default: return -EFAULT; } - +again: ptl = pmd_lock(mm, pmdp); if (!pmd_present(*pmdp)) { spin_unlock(ptl); @@ -1017,6 +1021,8 @@ int get_guest_storage_key(struct mm_struct *mm, unsigned long addr, spin_unlock(ptl); ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl); + if (!ptep) + goto again; pgste = pgste_get_lock(ptep); *key = (pgste_val(pgste) & (PGSTE_ACC_BITS | PGSTE_FP_BIT)) >> 56; paddr = pte_val(*ptep) & PAGE_MASK; -- cgit From b2f58941adcbdb736df1686289619719961691ff Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:29:13 -0700 Subject: s390: gmap use pte_unmap_unlock() not spin_unlock() pte_alloc_map_lock() expects to be followed by pte_unmap_unlock(): to keep balance in future, pass ptep as well as ptl to gmap_pte_op_end(), and use pte_unmap_unlock() instead of direct spin_unlock() (even though ptep ends up unused inside the macro). Link: https://lkml.kernel.org/r/78873af-e1ec-4f9-47ac-483940ac6daa@google.com Signed-off-by: Hugh Dickins Acked-by: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/s390/mm/gmap.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) (limited to 'arch') diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 3a2a31a15ea8..f4b6fc746fce 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -895,12 +895,12 @@ static int gmap_pte_op_fixup(struct gmap *gmap, unsigned long gaddr, /** * gmap_pte_op_end - release the page table lock - * @ptl: pointer to the spinlock pointer + * @ptep: pointer to the locked pte + * @ptl: pointer to the page table spinlock */ -static void gmap_pte_op_end(spinlock_t *ptl) +static void gmap_pte_op_end(pte_t *ptep, spinlock_t *ptl) { - if (ptl) - spin_unlock(ptl); + pte_unmap_unlock(ptep, ptl); } /** @@ -1011,7 +1011,7 @@ static int gmap_protect_pte(struct gmap *gmap, unsigned long gaddr, { int rc; pte_t *ptep; - spinlock_t *ptl = NULL; + spinlock_t *ptl; unsigned long pbits = 0; if (pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID) @@ -1025,7 +1025,7 @@ static int gmap_protect_pte(struct gmap *gmap, unsigned long gaddr, pbits |= (bits & GMAP_NOTIFY_SHADOW) ? PGSTE_VSIE_BIT : 0; /* Protect and unlock. */ rc = ptep_force_prot(gmap->mm, gaddr, ptep, prot, pbits); - gmap_pte_op_end(ptl); + gmap_pte_op_end(ptep, ptl); return rc; } @@ -1154,7 +1154,7 @@ int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val) /* Do *NOT* clear the _PAGE_INVALID bit! */ rc = 0; } - gmap_pte_op_end(ptl); + gmap_pte_op_end(ptep, ptl); } if (!rc) break; @@ -1248,7 +1248,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr, if (!rc) gmap_insert_rmap(sg, vmaddr, rmap); spin_unlock(&sg->guest_table_lock); - gmap_pte_op_end(ptl); + gmap_pte_op_end(ptep, ptl); } radix_tree_preload_end(); if (rc) { @@ -2156,7 +2156,7 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) tptep = (pte_t *) gmap_table_walk(sg, saddr, 0); if (!tptep) { spin_unlock(&sg->guest_table_lock); - gmap_pte_op_end(ptl); + gmap_pte_op_end(sptep, ptl); radix_tree_preload_end(); break; } @@ -2167,7 +2167,7 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) rmap = NULL; rc = 0; } - gmap_pte_op_end(ptl); + gmap_pte_op_end(sptep, ptl); spin_unlock(&sg->guest_table_lock); } radix_tree_preload_end(); @@ -2495,7 +2495,7 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4], continue; if (ptep_test_and_clear_uc(gmap->mm, vmaddr, ptep)) set_bit(i, bitmap); - spin_unlock(ptl); + pte_unmap_unlock(ptep, ptl); } } gmap_pmd_op_end(gmap, pmdp); -- cgit From b7b7ef6b448513e41796a160efab320755b835c2 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:30:14 -0700 Subject: sh/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/ee885978-7355-624b-cfe2-c3d75672b842@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/sh/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c index 999ab5916e69..6cb0ad73dbb9 100644 --- a/arch/sh/mm/hugetlbpage.c +++ b/arch/sh/mm/hugetlbpage.c @@ -38,7 +38,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, if (pud) { pmd = pmd_alloc(mm, pud, addr); if (pmd) - pte = pte_alloc_map(mm, pmd, addr); + pte = pte_alloc_huge(mm, pmd, addr); } } } @@ -63,7 +63,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, if (pud) { pmd = pmd_offset(pud, addr); if (pmd) - pte = pte_offset_map(pmd, addr); + pte = pte_offset_huge(pmd, addr); } } } -- cgit From c65d09fd2c280f8b8e79655f7c011e25e5b45568 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:31:10 -0700 Subject: sparc/hugetlb: pte_alloc_huge() pte_offset_huge() pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Link: https://lkml.kernel.org/r/c2aeb62f-58f9-d014-ddcd-266267bd97b@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/sparc/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index d8e0e3c7038d..d7018823206c 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -298,7 +298,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, return NULL; if (sz >= PMD_SIZE) return (pte_t *)pmd; - return pte_alloc_map(mm, pmd, addr); + return pte_alloc_huge(mm, pmd, addr); } pte_t *huge_pte_offset(struct mm_struct *mm, @@ -325,7 +325,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, return NULL; if (is_hugetlb_pmd(*pmd)) return (pte_t *)pmd; - return pte_offset_map(pmd, addr); + return pte_offset_huge(pmd, addr); } void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, -- cgit From 4be14ec02ee12b807f5f66ee165df948409465bb Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:32:36 -0700 Subject: sparc: allow pte_offset_map() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/22165adb-581c-9ce1-8aa6-a3385cd39055@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/sparc/kernel/signal32.c | 2 ++ arch/sparc/mm/fault_64.c | 3 +++ arch/sparc/mm/tlb.c | 2 ++ 3 files changed, 7 insertions(+) (limited to 'arch') diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c index dad38960d1a8..ca450c7bc53f 100644 --- a/arch/sparc/kernel/signal32.c +++ b/arch/sparc/kernel/signal32.c @@ -328,6 +328,8 @@ static void flush_signal_insns(unsigned long address) goto out_irqs_on; ptep = pte_offset_map(pmdp, address); + if (!ptep) + goto out_irqs_on; pte = *ptep; if (!pte_present(pte)) goto out_unmap; diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index d91305de694c..d8a407fbe350 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -99,6 +99,7 @@ static unsigned int get_user_insn(unsigned long tpc) local_irq_disable(); pmdp = pmd_offset(pudp, tpc); +again: if (pmd_none(*pmdp) || unlikely(pmd_bad(*pmdp))) goto out_irq_enable; @@ -115,6 +116,8 @@ static unsigned int get_user_insn(unsigned long tpc) #endif { ptep = pte_offset_map(pmdp, tpc); + if (!ptep) + goto again; pte = *ptep; if (pte_present(pte)) { pa = (pte_pfn(pte) << PAGE_SHIFT); diff --git a/arch/sparc/mm/tlb.c b/arch/sparc/mm/tlb.c index 9a725547578e..7ecf8556947a 100644 --- a/arch/sparc/mm/tlb.c +++ b/arch/sparc/mm/tlb.c @@ -149,6 +149,8 @@ static void tlb_batch_pmd_scan(struct mm_struct *mm, unsigned long vaddr, pte_t *pte; pte = pte_offset_map(&pmd, vaddr); + if (!pte) + return; end = vaddr + HPAGE_SIZE; while (vaddr < end) { if (pte_val(*pte) & _PAGE_VALID) { -- cgit From 7a19c361b1fac7fbe2d21fc0268902150edb059a Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:33:40 -0700 Subject: sparc: iounit and iommu use pte_offset_kernel() iounit_alloc() and sbus_iommu_alloc() are working from pmd_off_k(), so should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Link: https://lkml.kernel.org/r/99962272-12ff-975d-bf7f-7fd5d95a2df5@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/sparc/mm/io-unit.c | 2 +- arch/sparc/mm/iommu.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/sparc/mm/io-unit.c b/arch/sparc/mm/io-unit.c index bf3e6d2fe5d9..133dd42570d6 100644 --- a/arch/sparc/mm/io-unit.c +++ b/arch/sparc/mm/io-unit.c @@ -244,7 +244,7 @@ static void *iounit_alloc(struct device *dev, size_t len, long i; pmdp = pmd_off_k(addr); - ptep = pte_offset_map(pmdp, addr); + ptep = pte_offset_kernel(pmdp, addr); set_pte(ptep, mk_pte(virt_to_page(page), dvma_prot)); diff --git a/arch/sparc/mm/iommu.c b/arch/sparc/mm/iommu.c index 9e3f6933ca13..3a6caef68348 100644 --- a/arch/sparc/mm/iommu.c +++ b/arch/sparc/mm/iommu.c @@ -358,7 +358,7 @@ static void *sbus_iommu_alloc(struct device *dev, size_t len, __flush_page_to_ram(page); pmdp = pmd_off_k(addr); - ptep = pte_offset_map(pmdp, addr); + ptep = pte_offset_kernel(pmdp, addr); set_pte(ptep, mk_pte(virt_to_page(page), dvma_prot)); } -- cgit From 975ca3986bec8ebd6d8b45f4a7f77c730e424ac4 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:35:09 -0700 Subject: x86: allow get_locked_pte() to fail In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Link: https://lkml.kernel.org/r/b7fa8547-4f28-ec82-9893-1b2eb58e40b4@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/x86/kernel/ldt.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c index 525876e7b9f4..adc67f98819a 100644 --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt) va = (unsigned long)ldt_slot_va(ldt->slot) + offset; ptep = get_locked_pte(mm, va, &ptl); - pte_clear(mm, va, ptep); - pte_unmap_unlock(ptep, ptl); + if (!WARN_ON_ONCE(!ptep)) { + pte_clear(mm, va, ptep); + pte_unmap_unlock(ptep, ptl); + } } va = (unsigned long)ldt_slot_va(ldt->slot); -- cgit From 653ba8108debd54c318afbf738cf7d231011380d Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:36:26 -0700 Subject: x86: sme_populate_pgd() use pte_offset_kernel() sme_populate_pgd() is an __init function for sme_encrypt_kernel(): it should use pte_offset_kernel() instead of pte_offset_map(), to avoid the question of whether a pte_unmap() will be needed to balance. Link: https://lkml.kernel.org/r/497d7777-736e-85f2-c37-aa6bcf155e4@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/x86/mm/mem_encrypt_identity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c index c6efcf559d88..a1ab542bdfd6 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -188,7 +188,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd) if (pmd_large(*pmd)) return; - pte = pte_offset_map(pmd, ppd->vaddr); + pte = pte_offset_kernel(pmd, ppd->vaddr); if (pte_none(*pte)) set_pte(pte, __pte(ppd->paddr | ppd->pte_flags)); } -- cgit From 56e0d1cb1689da0db8cc383dc3c25e569437bec1 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 8 Jun 2023 12:37:48 -0700 Subject: xtensa: add pte_unmap() to balance pte_offset_map() To keep balance in future, remember to pte_unmap() after a successful pte_offset_map(). And act as if get_pte_for_vaddr() really needs a map there, to read the pteval before "unmapping", to be sure page table is not removed. Link: https://lkml.kernel.org/r/ab2581eb-daa6-894e-4aa6-97c81de3b8c@google.com Signed-off-by: Hugh Dickins Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Chris Zankel Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greg Ungerer Cc: Heiko Carstens Cc: Helge Deller Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: John David Anglin Cc: John Paul Adrian Glaubitz Cc: Kirill A. Shutemov Cc: Matthew Wilcox (Oracle) Cc: Max Filippov Cc: Michael Ellerman Cc: Michal Simek Cc: Mike Kravetz Cc: Mike Rapoport (IBM) Cc: Palmer Dabbelt Cc: Peter Zijlstra Cc: Qi Zheng Cc: Russell King Cc: Suren Baghdasaryan Cc: Thomas Bogendoerfer Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton --- arch/xtensa/mm/tlb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/xtensa/mm/tlb.c b/arch/xtensa/mm/tlb.c index 27a477dae232..0a11fc5f185b 100644 --- a/arch/xtensa/mm/tlb.c +++ b/arch/xtensa/mm/tlb.c @@ -179,6 +179,7 @@ static unsigned get_pte_for_vaddr(unsigned vaddr) pud_t *pud; pmd_t *pmd; pte_t *pte; + unsigned int pteval; if (!mm) mm = task->active_mm; @@ -197,7 +198,9 @@ static unsigned get_pte_for_vaddr(unsigned vaddr) pte = pte_offset_map(pmd, vaddr); if (!pte) return 0; - return pte_val(*pte); + pteval = pte_val(*pte); + pte_unmap(pte); + return pteval; } enum { -- cgit From 9382bc44b5f58ccee375f08f518e53c0280051dc Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Mon, 12 Jun 2023 16:31:55 +0100 Subject: arm64: allow kmalloc() caches aligned to the smaller cache_line_size() On arm64, ARCH_DMA_MINALIGN is 128, larger than the cache line size on most of the current platforms (typically 64). Define ARCH_KMALLOC_MINALIGN to 8 (the default for architectures without their own ARCH_DMA_MINALIGN) and override dma_get_cache_alignment() to return cache_line_size(), probed at run-time. The kmalloc() caches will be limited to the cache line size. This will allow the additional kmalloc-{64,192} caches on most arm64 platforms. Link: https://lkml.kernel.org/r/20230612153201.554742-12-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Tested-by: Isaac J. Manjarres Cc: Will Deacon Cc: Alasdair Kergon Cc: Ard Biesheuvel Cc: Arnd Bergmann Cc: Christoph Hellwig Cc: Daniel Vetter Cc: Greg Kroah-Hartman Cc: Herbert Xu Cc: Jerry Snitselaar Cc: Joerg Roedel Cc: Jonathan Cameron Cc: Jonathan Cameron Cc: Lars-Peter Clausen Cc: Logan Gunthorpe Cc: Marc Zyngier Cc: Mark Brown Cc: Mike Snitzer Cc: "Rafael J. Wysocki" Cc: Robin Murphy Cc: Saravana Kannan Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- arch/arm64/include/asm/cache.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'arch') diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index a51e6e8f3171..ceb368d33bf4 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -33,6 +33,7 @@ * the CPU. */ #define ARCH_DMA_MINALIGN (128) +#define ARCH_KMALLOC_MINALIGN (8) #ifndef __ASSEMBLY__ @@ -90,6 +91,8 @@ static inline int cache_line_size_of_cpu(void) int cache_line_size(void); +#define dma_get_cache_alignment cache_line_size + /* * Read the effective value of CTR_EL0. * -- cgit From 1c1a429efd4ee8ca244cc2401365c983cda4ed76 Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Mon, 12 Jun 2023 16:32:01 +0100 Subject: arm64: enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64 With the DMA bouncing of unaligned kmalloc() buffers now in place, enable it for arm64 to allow the kmalloc-{8,16,32,48,96} caches. In addition, always create the swiotlb buffer even when the end of RAM is within the 32-bit physical address range (the swiotlb buffer can still be disabled on the kernel command line). Link: https://lkml.kernel.org/r/20230612153201.554742-18-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Tested-by: Isaac J. Manjarres Cc: Will Deacon Cc: Alasdair Kergon Cc: Ard Biesheuvel Cc: Arnd Bergmann Cc: Christoph Hellwig Cc: Daniel Vetter Cc: Greg Kroah-Hartman Cc: Herbert Xu Cc: Jerry Snitselaar Cc: Joerg Roedel Cc: Jonathan Cameron Cc: Jonathan Cameron Cc: Lars-Peter Clausen Cc: Logan Gunthorpe Cc: Marc Zyngier Cc: Mark Brown Cc: Mike Snitzer Cc: "Rafael J. Wysocki" Cc: Robin Murphy Cc: Saravana Kannan Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- arch/arm64/Kconfig | 1 + arch/arm64/mm/init.c | 7 ++++++- 2 files changed, 7 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index b1201d25a8a4..af42871431c0 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -120,6 +120,7 @@ config ARM64 select CRC32 select DCACHE_WORD_ACCESS select DYNAMIC_FTRACE if FUNCTION_TRACER + select DMA_BOUNCE_UNALIGNED_KMALLOC select DMA_DIRECT_REMAP select EDAC_SUPPORT select FRAME_POINTER diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 66e70ca47680..3ac2e9d79ce4 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -442,7 +442,12 @@ void __init bootmem_init(void) */ void __init mem_init(void) { - swiotlb_init(max_pfn > PFN_DOWN(arm64_dma_phys_limit), SWIOTLB_VERBOSE); + bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit); + + if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC)) + swiotlb = true; + + swiotlb_init(swiotlb, SWIOTLB_VERBOSE); /* this will put all unused low memory onto the freelists */ memblock_free_all(); -- cgit From 78615c4ddb73bd4a7f13ec4bab60b974b8fc6faa Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Tue, 13 Jun 2023 16:52:43 +0100 Subject: powerpc: move the ARCH_DMA_MINALIGN definition to asm/cache.h Patch series "Move the ARCH_DMA_MINALIGN definition to asm/cache.h". The ARCH_KMALLOC_MINALIGN reduction series defines a generic ARCH_DMA_MINALIGN in linux/cache.h: https://lore.kernel.org/r/20230612153201.554742-2-catalin.marinas@arm.com/ Unfortunately, this causes a duplicate definition warning for microblaze, powerpc (32-bit only) and sh as these architectures define ARCH_DMA_MINALIGN in a different file than asm/cache.h. Move the macro to asm/cache.h to avoid this issue and also bring them in line with the other architectures. This patch (of 3): The powerpc architecture defines ARCH_DMA_MINALIGN in asm/page_32.h and only if CONFIG_NOT_COHERENT_CACHE is enabled (32-bit platforms only). Move this macro to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. Link: https://lkml.kernel.org/r/20230613155245.1228274-1-catalin.marinas@arm.com Link: https://lkml.kernel.org/r/20230613155245.1228274-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306131053.1ybvRRhO-lkp@intel.com/ Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: John Paul Adrian Glaubitz Cc: Michal Simek Cc: Rich Felker Cc: Vlastimil Babka Cc: Yoshinori Sato Signed-off-by: Andrew Morton --- arch/powerpc/include/asm/cache.h | 4 ++++ arch/powerpc/include/asm/page_32.h | 4 ---- 2 files changed, 4 insertions(+), 4 deletions(-) (limited to 'arch') diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h index ae0a68a838e8..69232231d270 100644 --- a/arch/powerpc/include/asm/cache.h +++ b/arch/powerpc/include/asm/cache.h @@ -33,6 +33,10 @@ #define IFETCH_ALIGN_BYTES (1 << IFETCH_ALIGN_SHIFT) +#ifdef CONFIG_NOT_COHERENT_CACHE +#define ARCH_DMA_MINALIGN L1_CACHE_BYTES +#endif + #if !defined(__ASSEMBLY__) #ifdef CONFIG_PPC64 diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h index 56f217606327..b9ac9e3a771c 100644 --- a/arch/powerpc/include/asm/page_32.h +++ b/arch/powerpc/include/asm/page_32.h @@ -12,10 +12,6 @@ #define VM_DATA_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS32 -#ifdef CONFIG_NOT_COHERENT_CACHE -#define ARCH_DMA_MINALIGN L1_CACHE_BYTES -#endif - #if defined(CONFIG_PPC_256K_PAGES) || \ (defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)) #define PTE_SHIFT (PAGE_SHIFT - PTE_T_LOG2 - 2) /* 1/4 of a page */ -- cgit From 4ea57ce42886adb0129605f23a0cf0809271a524 Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Tue, 13 Jun 2023 16:52:44 +0100 Subject: microblaze: move the ARCH_{DMA,SLAB}_MINALIGN definitions to asm/cache.h The microblaze architecture defines ARCH_DMA_MINALIGN in asm/page.h. Move it to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. While at it, also move ARCH_SLAB_MINALIGN to asm/cache.h for consistency. Link: https://lkml.kernel.org/r/20230613155245.1228274-3-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Cc: Michal Simek Cc: Christophe Leroy Cc: John Paul Adrian Glaubitz Cc: kernel test robot Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Rich Felker Cc: Vlastimil Babka Cc: Yoshinori Sato Signed-off-by: Andrew Morton --- arch/microblaze/include/asm/cache.h | 5 +++++ arch/microblaze/include/asm/page.h | 5 ----- 2 files changed, 5 insertions(+), 5 deletions(-) (limited to 'arch') diff --git a/arch/microblaze/include/asm/cache.h b/arch/microblaze/include/asm/cache.h index a149b3e711ec..1903988b9e23 100644 --- a/arch/microblaze/include/asm/cache.h +++ b/arch/microblaze/include/asm/cache.h @@ -18,4 +18,9 @@ #define SMP_CACHE_BYTES L1_CACHE_BYTES +/* MS be sure that SLAB allocates aligned objects */ +#define ARCH_DMA_MINALIGN L1_CACHE_BYTES + +#define ARCH_SLAB_MINALIGN L1_CACHE_BYTES + #endif /* _ASM_MICROBLAZE_CACHE_H */ diff --git a/arch/microblaze/include/asm/page.h b/arch/microblaze/include/asm/page.h index 7b9861bcd458..337f23eabc71 100644 --- a/arch/microblaze/include/asm/page.h +++ b/arch/microblaze/include/asm/page.h @@ -30,11 +30,6 @@ #ifndef __ASSEMBLY__ -/* MS be sure that SLAB allocates aligned objects */ -#define ARCH_DMA_MINALIGN L1_CACHE_BYTES - -#define ARCH_SLAB_MINALIGN L1_CACHE_BYTES - /* * PAGE_OFFSET -- the first address of the first page of memory. With MMU * it is set to the kernel start address (aligned on a page boundary). -- cgit From e6926a4d1c9bfe12814c7c81acd57346285cd7ba Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Tue, 13 Jun 2023 16:52:45 +0100 Subject: sh: move the ARCH_DMA_MINALIGN definition to asm/cache.h The sh architecture defines ARCH_DMA_MINALIGN in asm/page.h. Move it to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. Link: https://lkml.kernel.org/r/20230613155245.1228274-4-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Cc: Yoshinori Sato Cc: Rich Felker Cc: John Paul Adrian Glaubitz Cc: Christophe Leroy Cc: kernel test robot Cc: Michael Ellerman Cc: Michal Simek Cc: Nicholas Piggin Cc: Vlastimil Babka Signed-off-by: Andrew Morton --- arch/sh/include/asm/cache.h | 6 ++++++ arch/sh/include/asm/page.h | 6 ------ 2 files changed, 6 insertions(+), 6 deletions(-) (limited to 'arch') diff --git a/arch/sh/include/asm/cache.h b/arch/sh/include/asm/cache.h index 32dfa6b82ec6..b38dbc975581 100644 --- a/arch/sh/include/asm/cache.h +++ b/arch/sh/include/asm/cache.h @@ -14,6 +14,12 @@ #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT) +/* + * Some drivers need to perform DMA into kmalloc'ed buffers + * and so we have to increase the kmalloc minalign for this. + */ +#define ARCH_DMA_MINALIGN L1_CACHE_BYTES + #define __read_mostly __section(".data..read_mostly") #ifndef __ASSEMBLY__ diff --git a/arch/sh/include/asm/page.h b/arch/sh/include/asm/page.h index 09ac6c7faee0..62f4b9edcb98 100644 --- a/arch/sh/include/asm/page.h +++ b/arch/sh/include/asm/page.h @@ -174,10 +174,4 @@ typedef struct page *pgtable_t; #include #include -/* - * Some drivers need to perform DMA into kmalloc'ed buffers - * and so we have to increase the kmalloc minalign for this. - */ -#define ARCH_DMA_MINALIGN L1_CACHE_BYTES - #endif /* __ASM_SH_PAGE_H */ -- cgit