summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2022-12-02seccomp: document the "filter_count" fieldRandy Dunlap
Add missing kernel-doc for the 'filter_count' field in struct seccomp. seccomp.h:40: warning: Function parameter or member 'filter_count' not described in 'seccomp' Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221202073953.14677-1-rdunlap@infradead.org
2022-12-02regulator: add mt6357 regulatorFabien Parent
Add regulator driver for the MT6357 PMIC. Signed-off-by: Fabien Parent <fparent@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20221005-mt6357-support-v7-7-477e60126749@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-02printk, xen: fbfront: create/use safe function for forcing preferredJohn Ogness
With commit 9e124fe16ff2("xen: Enable console tty by default in domU if it's not a dummy") a hack was implemented to make sure that the tty console remains the console behind the /dev/console device. The main problem with the hack is that, after getting the console pointer to the tty console, it is assumed the pointer is still valid after releasing the console_sem. This assumption is incorrect and unsafe. Make the hack safe by introducing a new function console_force_preferred_locked() and perform the full operation under the console_list_lock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-33-john.ogness@linutronix.de
2022-12-02serial_core: replace uart_console_enabled() with uart_console_registered()John Ogness
All users of uart_console_enabled() really want to know if a console is registered. It is not reliable to check for CON_ENABLED in order to identify if a console is registered. Use console_is_registered() instead. A _locked() variant is provided because uart_set_options() is always called with the console_list_lock held and must check if a console is registered in order to synchronize with kgdboc. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-23-john.ogness@linutronix.de
2022-12-02console: introduce console_is_registered()John Ogness
Currently it is not possible for drivers to detect if they have already successfully registered their console. Several drivers have multiple paths that lead to console registration. To avoid attempting a 2nd registration (which leads to a WARN), drivers are implementing their own solution. Introduce console_is_registered() so drivers can easily identify if their console is currently registered. A _locked() variant is also provided if the caller is already holding the console_list_lock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-22-john.ogness@linutronix.de
2022-12-02console: introduce wrappers to read/write console flagsJohn Ogness
After switching to SRCU for console list iteration, some readers will begin readings console->flags as a data race. Locklessly reading console->flags provides a consistent value because there is at most one CPU modifying console->flags and that CPU is using only read-modify-write operations. Introduce a wrapper for SRCU iterators to read console flags. Introduce a matching wrapper to write to flags of registered consoles. Writing to flags of registered consoles is synchronized by the console_list_lock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-13-john.ogness@linutronix.de
2022-12-02printk: introduce console_list_lockJohn Ogness
Currently there exist races in register_console(), where the types of registered consoles are checked (without holding the console_lock) and then after acquiring the console_lock, it is assumed that the list has not changed. Also, some code that performs console_unregister() make similar assumptions. It might be possible to fix these races using the console_lock. But it would require a complex analysis of all console drivers to make sure that the console_lock is not taken in match() and setup() callbacks. And we really prefer to split up and reduce the responsibilities of console_lock rather than expand its complexity. Therefore, introduce a new console_list_lock to provide full synchronization for any console list changes. In addition, also use console_list_lock for synchronization of console->flags updates. All flags are either static or modified only during the console registration. There are only two exceptions. The first exception is CON_ENABLED, which is also modified by console_start()/console_stop(). Therefore, these functions must also take the console_list_lock. The second exception is when the flags are modified by the console driver init code before the console is registered. These will be ignored because they are not visible to the rest of the system via the console_drivers list. Note that one of the various responsibilities of the console_lock is also intended to provide console list and console->flags synchronization. Later changes will update call sites relying on the console_lock for these purposes. Once all call sites have been updated, the console_lock will be relieved of synchronizing console_list and console->flags updates. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87sficwokr.fsf@jogness.linutronix.de
2022-12-02printk: Prepare for SRCU console list protectionJohn Ogness
Provide an NMI-safe SRCU protected variant to walk the console list. Note that all console fields are now set before adding the console to the list to avoid the console becoming visible by SCRU readers before being fully initialized. This is a preparatory change for a new console infrastructure which operates independent of the console BKL. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-4-john.ogness@linutronix.de
2022-12-02printk: Convert console_drivers list to hlistThomas Gleixner
Replace the open coded single linked list with a hlist so a conversion to SRCU protected list walks can reuse the existing primitives. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20221116162152.193147-3-john.ogness@linutronix.de
2022-12-02configfs: remove mentions of committable itemsBartosz Golaszewski
A proposition of implementation of committable items has been rejected due to the gpio-sim module being the only user and configfs not getting much development in general. In that case, let's remove the notion of committable items from docs and headers. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-12-01nvdimm/cxl/pmem: Add support for master passphrase disable security commandDave Jiang
The original nvdimm_security_ops ->disable() only supports user passphrase for security disable. The CXL spec introduced the disabling of master passphrase. Add a ->disable_master() callback to support this new operation and leaving the old ->disable() mechanism alone. A "disable_master" command is added for the sysfs attribute in order to allow command to be issued from userspace. ndctl will need enabling in order to utilize this new operation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983616454.2734609.14204031148234398086.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01blk-crypto: Add support for SM4-XTS blk crypto modeTianjia Zhang
SM4 is a symmetric cipher algorithm widely used in China. The SM4-XTS variant is used to encrypt length-preserving data. This is the mandatory algorithm in some special scenarios. Add support for the algorithm to block inline encryption. This is needed for the inlinecrypt mount option to be supported via blk-crypto-fallback, as it is for the other fscrypt modes. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20221201125819.36932-2-tianjia.zhang@linux.alibaba.com
2022-12-01firmware: arm_ffa: Move comment before the field it is documentingWill Deacon
This is consistent with the other comments in the struct. Co-developed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20221116170335.2341003-3-qperret@google.com Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01firmware: arm_ffa: Move constants to header fileWill Deacon
FF-A function IDs and error codes will be needed in the hypervisor too, so move to them to the header file where they can be shared. Rename the version constants with an "FFA_" prefix so that they are less likely to clash with other code in the tree. Co-developed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20221116170335.2341003-2-qperret@google.com Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01ext4: journal_path mount options should follow linksLukas Czerner
Before the commit 461c3af045d3 ("ext4: Change handle_mount_opt() to use fs_parameter") ext4 mount option journal_path did follow links in the provided path. Bring this behavior back by allowing to pass pathwalk flags to fs_lookup_param(). Fixes: 461c3af045d3 ("ext4: Change handle_mount_opt() to use fs_parameter") Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20221004135803.32283-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2022-12-01vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copyJann Horn
find_timens_vvar_page() is not architecture-specific, as can be seen from how all five per-architecture versions of it are the same. (arm64, powerpc and riscv are exactly the same; x86 and s390 have two characters difference inside a comment, less blank lines, and mark the !CONFIG_TIME_NS version as inline.) Refactor the five copies into a central copy in kernel/time/namespace.c. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221130115320.2918447-1-jannh@google.com
2022-11-30linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>Masahiro Yamada
With CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y, the following code fails to build: ---------------->8---------------- #include <linux/init.h> int foo(void) { return 0; } core_initcall(foo); ---------------->8---------------- Include <linux/build_bug.h> for static_assert() and <linux/stringify.h> for __stringify(). Link: https://lkml.kernel.org/r/20221113110802.3760705-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Jiangshan Yi <yijiangshan@kylinos.cn> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> # build-tested Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30debugfs: fix error when writing negative value to atomic_t debugfs fileAkinobu Mita
The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), so we have to use a 64-bit value to write a negative value for a debugfs file created by debugfs_create_atomic_t(). This restores the previous behaviour by introducing DEFINE_DEBUGFS_ATTRIBUTE_SIGNED for a signed value. Link: https://lkml.kernel.org/r/20220919172418.45257-4-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed valueAkinobu Mita
Patch series "fix error when writing negative value to simple attribute files". The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), but some attribute files want to accept a negative value. This patch (of 3): The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), so we have to use a 64-bit value to write a negative value. This adds DEFINE_SIMPLE_ATTRIBUTE_SIGNED for a signed value. Link: https://lkml.kernel.org/r/20220919172418.45257-1-akinobu.mita@gmail.com Link: https://lkml.kernel.org/r/20220919172418.45257-2-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-01Merge branch 'slub-tiny-v1r6' into slab/for-nextVlastimil Babka
Merge my series [1] to deprecate the SLOB allocator. - Renames CONFIG_SLOB to CONFIG_SLOB_DEPRECATED with deprecation notice. - The recommended replacement is CONFIG_SLUB, optionally with the new CONFIG_SLUB_TINY tweaks for systems with 16MB or less RAM. - Use cases that stopped working with CONFIG_SLUB_TINY instead of SLOB should be reported to linux-mm@kvack.org and slab maintainers, otherwise SLOB will be removed in few cycles. [1] https://lore.kernel.org/all/20221121171202.22080-1-vbabka@suse.cz/
2022-12-01mm, slub: remove percpu slabs with CONFIG_SLUB_TINYVlastimil Babka
SLUB gets most of its scalability by percpu slabs. However for CONFIG_SLUB_TINY the goal is minimal memory overhead, not scalability. Thus, #ifdef out the whole kmem_cache_cpu percpu structure and associated code. Additionally to the slab page savings, this reduces percpu allocator usage, and code size. This change builds on recent commit c7323a5ad078 ("mm/slub: restrict sysfs validation to debug caches and make it safe"), as caches with enabled debugging also avoid percpu slabs and all allocations and freeing ends up working with the partial list. With a bit more refactoring by the preceding patches, use the same code paths with CONFIG_SLUB_TINY. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com>
2022-11-30revert "kbuild: fix -Wimplicit-function-declaration in ↵Andrew Morton
license_is_gpl_compatible" It causes build failures with unusual CC/HOSTCC combinations. Quoting https://lkml.kernel.org/r/A222B1E6-69B8-4085-AD1B-27BDB72CA971@goldelico.com: HOSTCC scripts/mod/modpost.o - due to target missing In file included from include/linux/string.h:5, from scripts/mod/../../include/linux/license.h:5, from scripts/mod/modpost.c:24: include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory 246 | #include <asm/rwonce.h> | ^~~~~~~~~~~~~~ compilation terminated. ... The problem is that HOSTCC is not necessarily the same compiler or even architecture as CC and pulling in <linux/compiler.h> or <asm/rwonce.h> files indirectly isn't a good idea then. My toolchain is providing HOSTCC = gcc (MacPorts) and CC = arm-linux-gnueabihf (built from gcc source) and all running on Darwin. If I change the include to <string.h> I can then "HOSTCC scripts/mod/modpost.c" but then it fails for "CC kernel/module/main.c" not finding <string.h>: CC kernel/module/main.o - due to target missing In file included from kernel/module/main.c:43:0: ./include/linux/license.h:5:20: fatal error: string.h: No such file or directory #include <string.h> ^ compilation terminated. Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com> Cc: Sam James <sam@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: introduce arch_has_hw_nonleaf_pmd_young()Juergen Gross
When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in pmdp_test_and_clear_young(): BUG: unable to handle page fault for address: ffff8880083374d0 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1 RIP: e030:pmdp_test_and_clear_young+0x25/0x40 This happens because the Xen hypervisor can't emulate direct writes to page table entries other than PTEs. This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young() similar to arch_has_hw_pte_young() and test that instead of CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG. Link: https://lkml.kernel.org/r/20221123064510.16225-1-jgross@suse.com Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") Signed-off-by: Juergen Gross <jgross@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: David Hildenbrand <david@redhat.com> [core changes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: add dummy pmd_young() for architectures not having itJuergen Gross
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processingMike Kravetz
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page tables associated with the address range. For hugetlb vmas, zap_page_range will call __unmap_hugepage_range_final. However, __unmap_hugepage_range_final assumes the passed vma is about to be removed and deletes the vma_lock to prevent pmd sharing as the vma is on the way out. In the case of madvise(MADV_DONTNEED) the vma remains, but the missing vma_lock prevents pmd sharing and could potentially lead to issues with truncation/fault races. This issue was originally reported here [1] as a BUG triggered in page_try_dup_anon_rmap. Prior to the introduction of the hugetlb vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to prevent pmd sharing. Subsequent faults on this vma were confused as VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was not set in new pages added to the page table. This resulted in pages that appeared anonymous in a VM_SHARED vma and triggered the BUG. Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap call from unmap_vmas(). This is used to indicate the 'final' unmapping of a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and the vm_lock is not deleted. [1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20221114235507.294320-3-mike.kravetz@oracle.com Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Wei Chen <harperchen1110@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mina Almasry <almasrymina@google.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30madvise: use zap_page_range_single for madvise dontneedMike Kravetz
This series addresses the issue first reported in [1], and fully described in patch 2. Patches 1 and 2 address the user visible issue and are tagged for stable backports. While exploring solutions to this issue, related problems with mmu notification calls were discovered. This is addressed in the patch "hugetlb: remove duplicate mmu notifications:". Since there are no user visible effects, this third is not tagged for stable backports. Previous discussions suggested further cleanup by removing the routine zap_page_range. This is possible because zap_page_range_single is now exported, and all callers of zap_page_range pass ranges entirely within a single vma. This work will be done in a later patch so as not to distract from this bug fix. [1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/ This patch (of 2): Expose the routine zap_page_range_single to zap a range within a single vma. The madvise routine madvise_dontneed_single_vma can use this routine as it explicitly operates on a single vma. Also, update the mmu notification range in zap_page_range_single to take hugetlb pmd sharing into account. This is required as MADV_DONTNEED supports hugetlb vmas. Link: https://lkml.kernel.org/r/20221114235507.294320-1-mike.kravetz@oracle.com Link: https://lkml.kernel.org/r/20221114235507.294320-2-mike.kravetz@oracle.com Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Wei Chen <harperchen1110@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mina Almasry <almasrymina@google.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODEYang Shi
Syzbot reported the below splat: WARNING: CPU: 1 PID: 3646 at include/linux/gfp.h:221 __alloc_pages_node include/linux/gfp.h:221 [inline] WARNING: CPU: 1 PID: 3646 at include/linux/gfp.h:221 hpage_collapse_alloc_page mm/khugepaged.c:807 [inline] WARNING: CPU: 1 PID: 3646 at include/linux/gfp.h:221 alloc_charge_hpage+0x802/0xaa0 mm/khugepaged.c:963 Modules linked in: CPU: 1 PID: 3646 Comm: syz-executor210 Not tainted 6.1.0-rc1-syzkaller-00454-ga70385240892 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 RIP: 0010:__alloc_pages_node include/linux/gfp.h:221 [inline] RIP: 0010:hpage_collapse_alloc_page mm/khugepaged.c:807 [inline] RIP: 0010:alloc_charge_hpage+0x802/0xaa0 mm/khugepaged.c:963 Code: e5 01 4c 89 ee e8 6e f9 ae ff 4d 85 ed 0f 84 28 fc ff ff e8 70 fc ae ff 48 8d 6b ff 4c 8d 63 07 e9 16 fc ff ff e8 5e fc ae ff <0f> 0b e9 96 fa ff ff 41 bc 1a 00 00 00 e9 86 fd ff ff e8 47 fc ae RSP: 0018:ffffc90003fdf7d8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff888077f457c0 RSI: ffffffff81cd8f42 RDI: 0000000000000001 RBP: ffff888079388c0c R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f6b48ccf700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6b48a819f0 CR3: 00000000171e7000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> collapse_file+0x1ca/0x5780 mm/khugepaged.c:1715 hpage_collapse_scan_file+0xd6c/0x17a0 mm/khugepaged.c:2156 madvise_collapse+0x53a/0xb40 mm/khugepaged.c:2611 madvise_vma_behavior+0xd0a/0x1cc0 mm/madvise.c:1066 madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1240 do_madvise.part.0+0x24a/0x340 mm/madvise.c:1419 do_madvise mm/madvise.c:1432 [inline] __do_sys_madvise mm/madvise.c:1432 [inline] __se_sys_madvise mm/madvise.c:1430 [inline] __x64_sys_madvise+0x113/0x150 mm/madvise.c:1430 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6b48a4eef9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f6b48ccf318 EFLAGS: 00000246 ORIG_RAX: 000000000000001c RAX: ffffffffffffffda RBX: 00007f6b48af0048 RCX: 00007f6b48a4eef9 RDX: 0000000000000019 RSI: 0000000000600003 RDI: 0000000020000000 RBP: 00007f6b48af0040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6b48aa53a4 R13: 00007f6b48bffcbf R14: 00007f6b48ccf400 R15: 0000000000022000 </TASK> It is because khugepaged allocates pages with __GFP_THISNODE, but the preferred node is bogus. The previous patch fixed the khugepaged code to avoid allocating page from non-existing node. But it is still racy against memory hotremove. There is no synchronization with the memory hotplug so it is possible that memory gets offline during a longer taking scanning. So this warning still seems not quite helpful because: * There is no guarantee the node is online for __GFP_THISNODE context for all the callsites. * Kernel just fails the allocation regardless the warning, and it looks all callsites handle the allocation failure gracefully. Although while the warning has helped to identify a buggy code, it is not safe in general and this warning could panic the system with panic-on-warn configuration which tends to be used surprisingly often. So replace VM_WARN_ON to pr_warn(). And the warning will be triggered if __GFP_NOWARN is set since the allocator would print out warning for such case if __GFP_NOWARN is not set. [shy828301@gmail.com: rename nid to this_node and gfp to warn_gfp] Link: https://lkml.kernel.org/r/20221123193014.153983-1-shy828301@gmail.com [akpm@linux-foundation.org: fix whitespace] [akpm@linux-foundation.org: print gfp_mask instead of warn_gfp, per Michel] Link: https://lkml.kernel.org/r/20221108184357.55614-3-shy828301@gmail.com Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") Signed-off-by: Yang Shi <shy828301@gmail.com> Reported-by: <syzbot+0044b22d177870ee974f@syzkaller.appspotmail.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30Merge branches 'doc.2022.10.20a', 'fixes.2022.10.21a', 'lazy.2022.11.30a', ↵Paul E. McKenney
'srcunmisafe.2022.11.09a', 'torture.2022.10.18c' and 'torturescript.2022.10.20a' into HEAD doc.2022.10.20a: Documentation updates. fixes.2022.10.21a: Miscellaneous fixes. lazy.2022.11.30a: Lazy call_rcu() and NOCB updates. srcunmisafe.2022.11.09a: NMI-safe SRCU readers. torture.2022.10.18c: Torture-test updates. torturescript.2022.10.20a: Torture-test scripting updates.
2022-11-30block: untangle request_queue refcounting from sysfsChristoph Hellwig
The kobject embedded into the request_queue is used for the queue directory in sysfs, but that is a child of the gendisks directory and is intimately tied to it. Move this kobject to the gendisk and use a refcount_t in the request_queue for the actual request_queue refcounting that is completely unrelated to the device model. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221114042637.1009333-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-30io_uring: reshuffle issue_flagsPavel Begunkov
Reshuffle issue flags to keep normal flags separate from the uring_cmd ctx-setup like flags. Shift the second type to the second byte so it's easier to add new ones in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d6e4696c883943082d248716f4cd568f37b17a74.1669821213.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-30Merge tag 'arm-soc/for-6.2/drivers' of https://github.com/Broadcom/stblinux ↵Arnd Bergmann
into soc/drivers This pull request contains Broadcom SoCs driver changes for 6.2, please pull the following: - Yuan uses dev_err_probe() in the Raspberry Pi firmware provider to simplify the error handling code - Rafal adds support for initialiazing the BCM47xx NVMEM/NVRAM firmware provider out of memory-mapped flash devices. * tag 'arm-soc/for-6.2/drivers' of https://github.com/Broadcom/stblinux: firmware/nvram: bcm47xx: support init from IO memory firmware: raspberrypi: Use dev_err_probe() to simplify code Link: https://lore.kernel.org/r/20221129191755.542584-3-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Revert "blk-cgroup: Flush stats at blkgs destruction path"Jens Axboe
This reverts commit dae590a6c96c799434e0ff8156ef29b88c257e60. We've had a few reports on this causing a crash at boot time, because of a reference issue. While this problem seemginly did exist before the patch and needs solving separately, this patch makes it a lot easier to trigger. Link: https://lore.kernel.org/linux-block/CA+QYu4oxiRKC6hJ7F27whXy-PRBx=Tvb+-7TQTONN8qTtV3aDA@mail.gmail.com/ Link: https://lore.kernel.org/linux-block/69af7ccb-6901-c84c-0e95-5682ccfb750c@acm.org/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-30filelock: add a new locks_inode_context accessor functionJeff Layton
There are a number of places in the kernel that are accessing the inode->i_flctx field without smp_load_acquire. This is required to ensure that the caller doesn't see a partially-initialized structure. Add a new accessor function for it to make this clear and convert all of the relevant accesses in locks.c to use it. Also, convert locks_free_lock_context to use the helper as well instead of just doing a "bare" assignment. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2022-11-30filelock: new helper: vfs_inode_has_locksJeff Layton
Ceph has a need to know whether a particular inode has any locks set on it. It's currently tracking that by a num_locks field in its filp->private_data, but that's problematic as it tries to decrement this field when releasing locks and that can race with the file being torn down. Add a new vfs_inode_has_locks helper that just returns whether any locks are currently held on the inode. Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2022-11-30Merge tag 'drm/tegra/for-6.2-rc1' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/tegra into drm-next drm/tegra: Changes for v6.2-rc1 This contains a bunch of cleanups across the board as well as support for the NVDEC hardware found on the Tegra234 SoC. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125155219.3352952-1-thierry.reding@gmail.com
2022-11-29rcu: Make SRCU mandatoryPaul E. McKenney
Kernels configured with CONFIG_PRINTK=n and CONFIG_SRCU=n get build failures. This causes trouble for deep embedded systems. But given that there are more than 25 instances of "select SRCU" in the kernel, it is hard to believe that there are many kernels running in production without SRCU. This commit therefore makes SRCU mandatory. The SRCU Kconfig option remains for backwards compatibility, and will be removed when it is no longer used. [ paulmck: Update per kernel test robot feedback. ] Reported-by: John Ogness <john.ogness@linutronix.de> Reported-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: <linux-arch@vger.kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: John Ogness <john.ogness@linutronix.de>
2022-11-29rcu: Make call_rcu() lazy to save powerJoel Fernandes (Google)
Implement timer-based RCU callback batching (also known as lazy callbacks). With this we save about 5-10% of power consumed due to RCU requests that happen when system is lightly loaded or idle. By default, all async callbacks (queued via call_rcu) are marked lazy. An alternate API call_rcu_hurry() is provided for the few users, for example synchronize_rcu(), that need the old behavior. The batch is flushed whenever a certain amount of time has passed, or the batch on a particular CPU grows too big. Also memory pressure will flush it in a future patch. To handle several corner cases automagically (such as rcu_barrier() and hotplug), we re-use bypass lists which were originally introduced to address lock contention, to handle lazy CBs as well. The bypass list length has the lazy CB length included in it. A separate lazy CB length counter is also introduced to keep track of the number of lazy CBs. [ paulmck: Fix formatting of inline call_rcu_lazy() definition. ] [ paulmck: Apply Zqiang feedback. ] [ paulmck: Apply s/call_rcu_flush/call_rcu_hurry/ feedback from Tejun Heo. ] Suggested-by: Paul McKenney <paulmck@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-11-29Merge branch 'for-6.2/io_uring' into for-6.2/io_uring-nextJens Axboe
* for-6.2/io_uring: (41 commits) io_uring: keep unlock_post inlined in hot path io_uring: don't use complete_post in kbuf io_uring: spelling fix io_uring: remove io_req_complete_post_tw io_uring: allow multishot polled reqs to defer completion io_uring: remove overflow param from io_post_aux_cqe io_uring: add lockdep assertion in io_fill_cqe_aux io_uring: make io_fill_cqe_aux static io_uring: add io_aux_cqe which allows deferred completion io_uring: allow defer completion for aux posted cqes io_uring: defer all io_req_complete_failed io_uring: always lock in io_apoll_task_func io_uring: remove iopoll spinlock io_uring: iopoll protect complete_post io_uring: inline __io_req_complete_put() io_uring: remove io_req_tw_post_queue io_uring: use io_req_task_complete() in timeout io_uring: hold locks for io_req_complete_failed io_uring: add completion locking for iopoll io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post() ...
2022-11-29Merge tag 'net-6.1-rc8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and wifi. Current release - new code bugs: - eth: mlx5e: - use kvfree() in mlx5e_accel_fs_tcp_create() - MACsec, fix RX data path 16 RX security channel limit - MACsec, fix memory leak when MACsec device is deleted - MACsec, fix update Rx secure channel active field - MACsec, fix add Rx security association (SA) rule memory leak Previous releases - regressions: - wifi: cfg80211: don't allow multi-BSSID in S1G - stmmac: set MAC's flow control register to reflect current settings - eth: mlx5: - E-switch, fix duplicate lag creation - fix use-after-free when reverting termination table Previous releases - always broken: - ipv4: fix route deletion when nexthop info is not specified - bpf: fix a local storage BPF map bug where the value's spin lock field can get initialized incorrectly - tipc: re-fetch skb cb after tipc_msg_validate - wifi: wilc1000: fix Information Element parsing - packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE - sctp: fix memory leak in sctp_stream_outq_migrate() - can: can327: fix potential skb leak when netdev is down - can: add number of missing netdev freeing on error paths - aquantia: do not purge addresses when setting the number of rings - wwan: iosm: - fix incorrect skb length leading to truncated packet - fix crash in peek throughput test due to skb UAF" * tag 'net-6.1-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed MAINTAINERS: Update maintainer list for chelsio drivers ionic: update MAINTAINERS entry sctp: fix memory leak in sctp_stream_outq_migrate() packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE net/mlx5: Lag, Fix for loop when checking lag Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path" net: marvell: prestera: Fix a NULL vs IS_ERR() check in some functions net: tun: Fix use-after-free in tun_detach() net: mdiobus: fix unbalanced node reference count net: hsr: Fix potential use-after-free tipc: re-fetch skb cb after tipc_msg_validate mptcp: fix sleep in atomic at close time mptcp: don't orphan ssk in mptcp_close() dsa: lan9303: Correct stat name ipv4: Fix route deletion when nexthop info is not specified net: wwan: iosm: fix incorrect skb length net: wwan: iosm: fix crash in peek throughput test net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type net: wwan: iosm: fix kernel test robot reported error ...
2022-11-29Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path"Saeed Mahameed
This reverts commit c0071be0e16c461680d87b763ba1ee5e46548fde. The cited commit removed the validity checks which initialized the window_sz and never removed the use of the now uninitialized variable, so now we are left with wrong value in the window size and the following clang warning: [-Wuninitialized] drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c:232:45: warning: variable 'window_sz' is uninitialized when used here MLX5_SET(macsec_aso, aso_ctx, window_size, window_sz); Revet at this time to address the clang issue due to lack of time to test the proper solution. Fixes: c0071be0e16c ("net/mlx5e: MACsec, remove replay window size limitation in offload path") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221129093006.378840-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge tag 'nvme-6.2-2022-11-29' of git://git.infradead.org/nvme into ↵Jens Axboe
for-6.2/block Pull NVMe updates from Christoph: "nvme updates for Linux 6.2 - support some passthrough commands without CAP_SYS_ADMIN (Kanchan Joshi) - refactor PCIe probing and reset (Christoph Hellwig) - various fabrics authentication fixes and improvements (Sagi Grimberg) - avoid fallback to sequential scan due to transient issues (Uday Shankar) - implement support for the DEAC bit in Write Zeroes (Christoph Hellwig) - allow overriding the IEEE OUI and firmware revision in configfs for nvmet (Aleksandr Miloserdov) - force reconnect when number of queue changes in nvmet (Daniel Wagner) - minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi Grimberg, Christoph Hellwig, Christophe JAILLET)" * tag 'nvme-6.2-2022-11-29' of git://git.infradead.org/nvme: (45 commits) nvmet: expose firmware revision to configfs nvmet: expose IEEE OUI to configfs nvme: rename the queue quiescing helpers nvmet: fix a memory leak in nvmet_auth_set_key nvme: return err on nvme_init_non_mdts_limits fail nvme: avoid fallback to sequential scan due to transient issues nvme-rdma: stop auth work after tearing down queues in error recovery nvme-tcp: stop auth work after tearing down queues in error recovery nvme-auth: have dhchap_auth_work wait for queues auth to complete nvme-auth: remove redundant auth_work flush nvme-auth: convert dhchap_auth_list to an array nvme-auth: check chap ctrl_key once constructed nvme-auth: no need to reset chap contexts on re-authentication nvme-auth: remove redundant deallocations nvme-auth: clear sensitive info right after authentication completes nvme-auth: guarantee dhchap buffers under memory pressure nvme-auth: don't keep long lived 4k dhchap buffer nvme-auth: remove redundant if statement nvme-auth: don't override ctrl keys before validation nvme-auth: don't ignore key generation failures when initializing ctrl keys ...
2022-11-28Merge tag 'intel-pinctrl-v6.2-2' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into devel intel-pinctrl for v6.2-2 * Enable PWM feature on Intel pin control IPs The following is an automated git shortlog grouped by driver: intel: - Enumerate PWM device when community has a capability pwm: - lpss: Rename pwm_lpss_probe() --> devm_pwm_lpss_probe() - lpss: Allow other drivers to enable PWM LPSS - lpss: Include headers we are the direct user of - lpss: Rename MAX_PWMS --> LPSS_MAX_PWMS - Add a stub for devm_pwmchip_add()
2022-11-28SUNRPC: Remove unused svc_rqst::rq_lock fieldChuck Lever
Clean up after commit 22700f3c6df5 ("SUNRPC: Improve ordering of transport processing"). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28hv: fix comment typo in vmbus_channel/low_latencyOlaf Hering
morev vs. more. Signed-off-by: Olaf Hering <olaf@aepfle.de> Link: https://lore.kernel.org/r/20221105115401.21592-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-28video: hyperv_fb: Avoid taking busy spinlock on panic pathGuilherme G. Piccoli
The Hyper-V framebuffer code registers a panic notifier in order to try updating its fbdev if the kernel crashed. The notifier callback is straightforward, but it calls the vmbus_sendpacket() routine eventually, and such function takes a spinlock for the ring buffer operations. Panic path runs in atomic context, with local interrupts and preemption disabled, and all secondary CPUs shutdown. That said, taking a spinlock might cause a lockup if a secondary CPU was disabled with such lock taken. Fix it here by checking if the ring buffer spinlock is busy on Hyper-V framebuffer panic notifier; if so, bail-out avoiding the potential lockup scenario. Cc: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Tianyu Lan <Tianyu.Lan@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Tested-by: Fabio A M Martins <fabiomirmar@gmail.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220819221731.480795-10-gpiccoli@igalia.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-28Merge tag 'mlx5-fixes-2022-11-24' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2022-11-24 This series provides bug fixes to mlx5 driver. Focusing on error handling and proper memory management in mlx5, in general and in the newly added macsec module. I still have few fixes left in my queue and I hope those will be the last ones for mlx5 for this cycle. Please pull and let me know if there is any problem. Happy thanksgiving. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-27NFS: Clear the file access cache upon loginTrond Myklebust
POSIX typically only refreshes the user's supplementary group information upon login. Since NFS servers may often refresh their concept of the user supplementary group membership at their own cadence, it is possible for the NFS client's access cache to become stale due to the user's group membership changing on the server after the user has already logged in on the client. While it is reasonable to expect that such group membership changes are rare, and that we do not want to optimise the cache to accommodate them, it is also not unreasonable for the user to expect that if they log out and log back in again, that the staleness would clear up. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-11-27mm, slab: ignore SLAB_RECLAIM_ACCOUNT with CONFIG_SLUB_TINYVlastimil Babka
SLAB_RECLAIM_ACCOUNT caches allocate their slab pages with __GFP_RECLAIMABLE and can help against fragmentation by grouping pages by mobility, but on tiny systems mobility grouping is likely disabled anyway and ignoring SLAB_RECLAIM_ACCOUNT might instead lead to merging of caches that are made incompatible just by the flag. Thus with CONFIG_SLUB_TINY, make SLAB_RECLAIM_ACCOUNT ineffective. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com>
2022-11-27mm, slub: don't create kmalloc-rcl caches with CONFIG_SLUB_TINYVlastimil Babka
Distinguishing kmalloc(__GFP_RECLAIMABLE) can help against fragmentation by grouping pages by mobility, but on tiny systems the extra memory overhead of separate set of kmalloc-rcl caches will probably be worse, and mobility grouping likely disabled anyway. Thus with CONFIG_SLUB_TINY, don't create kmalloc-rcl caches and use the regular ones. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com>
2022-11-27mm, slub: disable SYSFS support with CONFIG_SLUB_TINYVlastimil Babka
Currently SLUB enables its sysfs support depending unconditionally on the general CONFIG_SYSFS setting. To reduce the configuration combination space, make CONFIG_SLUB_TINY disable SLUB's sysfs support by reusing the existing SLAB_SUPPORTS_SYSFS define. It is unlikely that real tiny systems would combine CONFIG_SLUB_TINY with CONFIG_SYSFS, but a randconfig might. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com>