git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-04-18	io_uring/rsrc: merge nodes and io_rsrc_put	Pavel Begunkov
	struct io_rsrc_node carries a number of resources represented by struct io_rsrc_put. That was handy before for sync overhead ammortisation, but all complexity is gone and nodes are simple and lightweight. Let's allocate a separate node for each resource. Nodes and io_rsrc_put and not much different in size, and former are cached, so node allocation should work better. That also removes some overhead for nested iteration in io_rsrc_node_ref_zero() / __io_rsrc_put_work(). Another reason for the patch is that it greatly reduces complexity by moving io_rsrc_node_switch[_start]() inside io_queue_rsrc_removal(), so users don't have to care about it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c7d3a45b30cc14cd93700a710dd112edc703db98.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18	io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal	Pavel Begunkov
	For io_queue_rsrc_removal() we should always use the current active rsrc node, don't pass it directly but let the function grab it from the context. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d15939b4afea730978b4925685c2577538b823bb.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18	io_uring/rsrc: remove unused io_rsrc_node::llist	Pavel Begunkov
	->llist was needed for rsrc node destruction offload, which is removed now. Get rid of the unused field. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8e7d764c3f947489fde88d0927c3060d2e1bb599.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18	x86: set FSRS automatically on AMD CPUs that have FSRM	Linus Torvalds
	So Intel introduced the FSRS ("Fast Short REP STOS") CPU capability bit, because they seem to have done the (much simpler) REP STOS optimizations separately and later than the REP MOVS one. In contrast, when AMD introduced support for FSRM ("Fast Short REP MOVS"), in the Zen 3 core, it appears to have improved the REP STOS case at the same time, and since the FSRS bit was added by Intel later, it doesn't show up on those AMD Zen 3 cores. And now that we made use of FSRS for the "rep stos" conditional, that made those AMD machines unnecessarily slower. The Intel situation where "rep movs" is fast, but "rep stos" isn't, is just odd. The 'stos' case is a lot simpler with no aliasing, no mutual alignment issues, no complicated cases. So this just sets FSRS automatically when FSRM is available on AMD machines, to get back all the nice REP STOS goodness in Zen 3. Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: improve on the non-rep 'copy_user' function	Linus Torvalds
	The old 'copy_user_generic_unrolled' function was oddly implemented for largely historical reasons: it had been largely based on the uncached copy case, which has some other concerns. For example, the __copy_user_nocache() function uses 'movnti' for the destination stores, and those want the destination to be aligned. In contrast, the regular copy function doesn't really care, and trying to align things only complicates matters. Also, like the clear_user function, the copy function had some odd handling of the repeat counts, complicating the exception handling for no really good reason. So as with clear_user, just write it to keep all the byte counts in the %rcx register, exactly like the 'rep movs' functionality that this replaces. Unlike a real 'rep movs', we do allow for this to trash a few temporary registers to not have to unnecessarily save/restore registers on the stack. And like the clearing case, rename this to what it now clearly is: 'rep_movs_alternative', and make it one coherent function, so that it shows up as such in profiles (instead of the odd split between "copy_user_generic_unrolled" and "copy_user_short_string", the latter of which was not about strings at all, and which was shared with the uncached case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: improve on the non-rep 'clear_user' function	Linus Torvalds
	The old version was oddly written to have the repeat count in multiple registers. So instead of taking advantage of %rax being zero, it had some sub-counts in it. All just for a "single word clearing" loop, which isn't even efficient to begin with. So get rid of those games, and just keep all the state in the same registers we got it in (and that we should return things in). That not only makes this act much more like 'rep stos' (which this function is replacing), but makes it much easier to actually do the obvious loop unrolling. Also rename the function from the now nonsensical 'clear_user_original' to what it now clearly is: 'rep_stos_alternative'. End result: if we don't have a fast 'rep stosb', at least we can have a fast fallback for it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: inline the 'rep movs' in user copies for the FSRM case	Linus Torvalds
	This does the same thing for the user copies as commit 0db7058e8e23 ("x86/clear_user: Make it faster") did for clear_user(). In other words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set, avoiding the function call entirely. In order to do that, it makes the calling convention for the out-of-line case ("copy_user_generic_unrolled") match the 'rep movs' calling convention, although it does also end up clobbering a number of additional registers. Also, to simplify code sharing in the low-level assembly with the __copy_user_nocache() function (that uses the normal C calling convention), we end up with a kind of mixed return value for the low-level asm code: it will return the result in both %rcx (to work as an alternative for the 'rep movs' case), _and_ in %rax (for the nocache case). We could avoid this by wrapping __copy_user_nocache() callers in an inline asm, but since the cost is just an extra register copy, it's probably not worth it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: move stac/clac from user copy routines into callers	Linus Torvalds
	This is preparatory work for inlining the 'rep movs' case, but also a cleanup. The __copy_user_nocache() function was mis-used by the rdma code to do uncached kernel copies that don't actually want user copies at all, and as a result doesn't want the stac/clac either. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: don't use REP_GOOD or ERMS for user memory clearing	Linus Torvalds
	The modern target to use is FSRS (Fast Short REP STOS), and the other cases should only be used for bigger areas (ie mainly things like page clearing). Note! This changes the conditional for the inlining from FSRM ("fast short rep movs") to FSRS ("fast short rep stos"). We'll have a separate fixup for AMD microarchitectures that have a good 'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM was there first). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: don't use REP_GOOD or ERMS for user memory copies	Linus Torvalds
	The modern target to use is FSRM (Fast Short REP MOVS), and the other cases should only be used for bigger areas (ie mainly things like page clearing). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: don't use REP_GOOD or ERMS for small memory clearing	Linus Torvalds
	The modern target to use is FSRS (Fast Short REP STOS), and the other cases should only be used for bigger areas (ie mainly things like page clearing). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	x86: don't use REP_GOOD or ERMS for small memory copies	Linus Torvalds
	The modern target to use is FSRM (Fast Short REP MOVS), and the other cases should only be used for bigger areas (ie mainly things like page copying and clearing). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18	mm: ksm: support hwpoison for ksm page	Longlong Xia
	hwpoison_user_mappings() is updated to support ksm pages, and add collect_procs_ksm() to collect processes when the error hit an ksm page. The difference from collect_procs_anon() is that it also needs to traverse the rmap-item list on the stable node of the ksm page. At the same time, add_to_kill_ksm() is added to handle ksm pages. And task_in_to_kill_list() is added to avoid duplicate addition of tsk to the to_kill list. This is because when scanning the list, if the pages that make up the ksm page all come from the same process, they may be added repeatedly. Link: https://lkml.kernel.org/r/20230414021741.2597273-3-xialonglong1@huawei.com Signed-off-by: Longlong Xia <xialonglong1@huawei.com> Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	mm: memory-failure: refactor add_to_kill()	Longlong Xia
	Patch series "mm: ksm: support hwpoison for ksm page", v2. Currently, ksm does not support hwpoison. As ksm is being used more widely for deduplication at the system level, container level, and process level, supporting hwpoison for ksm has become increasingly important. However, ksm pages were not processed by hwpoison in 2009 [1]. The main method of implementation: 1. Refactor add_to_kill() and add new add_to_kill_*() to better accommodate the handling of different types of pages. 2. Add collect_procs_ksm() to collect processes when the error hit an ksm page. 3. Add task_in_to_kill_list() to avoid duplicate addition of tsk to the to_kill list. 4. Try_to_unmap ksm page (already supported). 5. Handle related processes such as sending SIGBUS. Tested with poisoning to ksm page from 1) different process 2) one process and with/without memory_failure_early_kill set, the processes are killed as expected with the patchset. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?h=01e00f880ca700376e1845cf7a2524ebe68e47d6 This patch (of 2): The page_address_in_vma() is used to find the user virtual address of page in add_to_kill(), but it doesn't support ksm due to the ksm page->index unusable, add an ksm_addr as parameter to add_to_kill(), let's the caller to pass it, also rename the function to __add_to_kill(), and adding add_to_kill_anon_file() for handling anonymous pages and file pages, adding add_to_kill_fsdax() for handling fsdax pages. Link: https://lkml.kernel.org/r/20230414021741.2597273-1-xialonglong1@huawei.com Link: https://lkml.kernel.org/r/20230414021741.2597273-2-xialonglong1@huawei.com Signed-off-by: Longlong Xia <xialonglong1@huawei.com> Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/memfd: fix test_sysctl	Jeff Xu
	sysctl memfd_noexec is pid-namespaced, non-reservable, and inherent to the child process. Move the inherence test from init ns to child ns, so init ns can keep the default value. Link: https://lkml.kernel.org/r/20230414022801.2545257-1-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@google.com> Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202303312259.441e35db-yujie.liu@intel.com Tested-by: Yujie Liu <yujie.liu@intel.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/mm: run hugetlb testcases of va switch	Chaitanya S Prakash
	The va_high_addr_switch selftest is used to test mmap across 128TB boundary. It divides the selftest cases into two main categories on the basis of size. One set is used to create mappings that are multiples of PAGE_SIZE while the other creates mappings that are multiples of HUGETLB_SIZE. In order to run the hugetlb testcases the binary must be appended with "--run-hugetlb" but the file that used to run the test only invokes the binary, thereby completely skipping the hugetlb testcases. Hence, the required statement has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-6-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/mm: configure nr_hugepages for arm64	Chaitanya S Prakash
	Arm64 has a default hugepage size of 512MB when CONFIG_ARM64_64K_PAGES=y is enabled. While testing on arm64 platforms having up to 4PB of virtual address space, a minimum of 6 hugepages were required for all test cases to pass. Support for this requirement has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-5-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/mm: add platform independent in code comments	Chaitanya S Prakash
	The in code comments for the selftest were made on the basis of 128TB switch, an architecture feature specific to PowerPc and x86 platforms. Keeping in mind the support added for arm64 platforms which implements a 256TB switch, a more generic explanation has been provided. Link: https://lkml.kernel.org/r/20230323105243.2807166-4-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/mm: rename va_128TBswitch to va_high_addr_switch	Chaitanya S Prakash
	As the initial selftest only took into consideration PowperPC and x86 architectures, on adding support for arm64, a platform independent naming convention is chosen. Link: https://lkml.kernel.org/r/20230323105243.2807166-3-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	selftests/mm: add support for arm64 platform on va switch	Chaitanya S Prakash
	Patch series "selftests/mm: Implement support for arm64 on va". The va_128TBswitch selftest is designed and implemented for PowerPC and x86 architectures which support a 128TB switch, up to 256TB of virtual address space and hugepage sizes of 16MB and 2MB respectively. Arm64 platforms on the other hand support a 256Tb switch, up to 4PB of virtual address space and a default hugepage size of 512MB when 64k pagesize is enabled. These architectural differences require introducing support for arm64 platforms, after which a more generic naming convention is suggested. The in code comments are amended to provide a more platform independent explanation of the working of the code and nr_hugepages are configured as required. Finally, the file running the testcase is modified in order to prevent skipping of hugetlb testcases of va_high_addr_switch. This patch (of 5): Arm64 platforms have the ability to support 64kb pagesize, 512MB default hugepage size and up to 4PB of virtual address space. The address switch occurs at 256TB as opposed to 128TB. Hence, the necessary support has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-1-chaitanyas.prakash@arm.com Link: https://lkml.kernel.org/r/20230323105243.2807166-2-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	dt-bindings: display: simplify compatibles syntax	Krzysztof Kozlowski
	Lists (items) with one item should be just const or enum because it is shorter and simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230414104230.23165-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-18	dt-bindings: display: mediatek: simplify compatibles syntax	Krzysztof Kozlowski
	Lists (items) with one item should be just enum because it is shorter, simpler and does not confuse, if one wants to add new entry with a fallback. Convert all of them to enums. OTOH, leave unused "oneOf" entries in anticipation of further growth of the entire binding. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20230414083311.12197-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-18	dt-bindings: drm/bridge: ti-sn65dsi86: Fix the video-interfaces.yaml references	Fabio Estevam
	video-interface.txt does not exist anymore, as it has been converted to video-interfaces.yaml. Instead of referencing video-interfaces.yaml multiple times, pass it as a $ref to the schema. Signed-off-by: Fabio Estevam <festevam@denx.de> Link: https://lore.kernel.org/r/20230412175800.2537812-1-festevam@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-18	dt-bindings: timer: Drop unneeded quotes	Rob Herring
	Cleanup bindings dropping unneeded quotes. Once all these are fixed, checking for this can be enabled in yamllint. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230327170146.4104556-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-18	dt-bindings: interrupt-controller: qcom,pdc: document qcom,qdu1000-pdc	Krzysztof Kozlowski
	Add QDU1000 PDC, already used in upstreamed DTS. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230416102831.105136-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-04-18	checkpatch: introduce proper bindings license check	Dmitry Rokosov
	All headers from 'include/dt-bindings/' must be verified by checkpatch together with Documentation bindings, because all of them are part of the whole DT bindings system. The requirement is dual licensed and matching patterns: * Schemas: /GPL-2\.0(?:-only)? OR BSD-2-Clause/ * Headers: /GPL-2\.0(?:-only)? OR \S+/ Above patterns suggested by Rob at: https://lore.kernel.org/all/CAL_Jsq+-YJsBO+LuPJ=ZQ=eb-monrwzuCppvReH+af7hYZzNaQ@mail.gmail.com The issue was found during patch review: https://lore.kernel.org/all/20230313201259.19998-4-ddrokosov@sberdevices.ru/ Link: https://lkml.kernel.org/r/20230404191715.7319-1-ddrokosov@sberdevices.ru Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	epoll: rename global epmutex	Davidlohr Bueso
	As of 4f04cbaf128 ("epoll: use refcount to reduce ep_mutex contention"), this lock is now specific to nesting cases - inserting an epoll fd onto another epoll fd. Rename the lock to be less generic. Link: https://lkml.kernel.org/r/20230411234159.20421-1-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: add GDB convenience functions $lx_dentry_name() and $lx_i_dentry()	Glenn Washburn
	$lx_dentry_name() generates a full VFS path from a given dentry pointer, and $lx_i_dentry() returns the dentry pointer associated with the given inode pointer, if there is one. Link: https://lkml.kernel.org/r/c9a5ad8efbfbd2cc6559e082734eed7628f43a16.1677631565.git.development@efficientek.com Signed-off-by: Glenn Washburn <development@efficientek.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Antonio Borneo <antonio.borneo@foss.st.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: create linux/vfs.py for VFS related GDB helpers	Glenn Washburn
	Patch series "GDB VFS utils". I've created a couple GDB convenience functions that I found useful when debugging some VFS issues and figure others might find them useful. For instance, they are useful in setting conditional breakpoints on VFS functions where you only care if the dentry path is a certain value. I took the opportunity to create a new "vfs" python module to give VFS related utilities a home. This patch (of 2): This will allow for more VFS specific GDB helpers to be collected in one place. Move utils.dentry_name into the vfs modules. Also a local variable in proc.py was changed from vfs to mnt to prevent a naming collision with the new vfs module. [akpm@linux-foundation.org: add SPDX-License-Identifier] Link: https://lkml.kernel.org/r/cover.1677631565.git.development@efficientek.com Link: https://lkml.kernel.org/r/7bba4c065a8c2c47f1fc5b03a7278005b04db251.1677631565.git.development@efficientek.com Signed-off-by: Glenn Washburn <development@efficientek.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Antonio Borneo <antonio.borneo@foss.st.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	uapi/linux/const.h: prefer ISO-friendly __typeof__	Kevin Brodsky
	typeof is (still) a GNU extension, which means that it cannot be used when building ISO C (e.g. -std=c99). It should therefore be avoided in uapi headers in favour of the ISO-friendly __typeof__. Unfortunately this issue could not be detected by CONFIG_UAPI_HEADER_TEST=y as the __ALIGN_KERNEL() macro is not expanded in any uapi header. This matters from a userspace perspective, not a kernel one. uapi headers and their contents are expected to be usable in a variety of situations, and in particular when building ISO C applications (with -std=c99 or similar). This particular problem can be reproduced by trying to use the __ALIGN_KERNEL macro directly in application code, say: #include <linux/const.h> int align(int x, int a) { return __KERNEL_ALIGN(x, a); } and trying to build that with -std=c99. Link: https://lkml.kernel.org/r/20230411092747.3759032-1-kevin.brodsky@arm.com Fixes: a79ff731a1b2 ("netfilter: xtables: make XT_ALIGN() usable in exported headers by exporting __ALIGN_KERNEL()") Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reported-by: Ruben Ayrapetyan <ruben.ayrapetyan@arm.com> Tested-by: Ruben Ayrapetyan <ruben.ayrapetyan@arm.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Tested-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	delayacct: track delays from IRQ/SOFTIRQ	Yang Yang
	Delay accounting does not track the delay of IRQ/SOFTIRQ. While IRQ/SOFTIRQ could have obvious impact on some workloads productivity, such as when workloads are running on system which is busy handling network IRQ/SOFTIRQ. Get the delay of IRQ/SOFTIRQ could help users to reduce such delay. Such as setting interrupt affinity or task affinity, using kernel thread for NAPI etc. This is inspired by "sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure"[1]. Also fix some code indent problems of older code. And update tools/accounting/getdelays.c: / # ./getdelays -p 156 -di print delayacct stats ON printing IO accounting PID 156 CPU count real total virtual total delay total delay average 15 15836008 16218149 275700790 18.380ms IO count delay total delay average 0 0 0.000ms SWAP count delay total delay average 0 0 0.000ms RECLAIM count delay total delay average 0 0 0.000ms THRASHING count delay total delay average 0 0 0.000ms COMPACT count delay total delay average 0 0 0.000ms WPCOPY count delay total delay average 36 7586118 0.211ms IRQ count delay total delay average 42 929161 0.022ms [1] commit 52b1364ba0b1("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure") Link: https://lkml.kernel.org/r/202304081728353557233@zte.com.cn Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Cc: Jiang Xuexin <jiang.xuexin@zte.com.cn> Cc: wangyong <wang.yong12@zte.com.cn> Cc: junhua huang <huang.junhua@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: timerlist: convert int chunks to str	Amjad Ouled-Ameur
	join() expects strings but integers are given. Convert chunks list to strings before passing it to join() Link: https://lkml.kernel.org/r/20230406221217.1585486-4-f.fainelli@gmail.com Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com> Signed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: print interrupts	Florian Fainelli
	This GDB script prints the interrupts in the system in the same way that /proc/interrupts does. This does include the architecture specific part done by arch_show_interrupts() for x86, ARM, ARM64 and MIPS. Example output from an ARM64 system: (gdb) lx-interruptlist CPU0 CPU1 CPU2 CPU3 10: 3167 1225 1276 2629 GICv2 30 Level arch_timer 13: 0 0 0 0 GICv2 36 Level arm-pmu 14: 0 0 0 0 GICv2 37 Level arm-pmu 15: 0 0 0 0 GICv2 38 Level arm-pmu 16: 0 0 0 0 GICv2 39 Level arm-pmu 28: 0 0 0 0 interrupt-controller@8410640 5 Edge brcmstb-gpio-wake 30: 125 0 0 0 GICv2 128 Level ttyS0 31: 0 0 0 0 interrupt-controller@8416000 0 Level mspi_done 32: 0 0 0 0 interrupt-controller@8410640 3 Edge brcmstb-waketimer 33: 0 0 0 0 interrupt-controller@8418580 8 Edge brcmstb-waketimer-rtc 34: 872 0 0 0 GICv2 230 Level brcm_scmi@0 35: 0 0 0 0 interrupt-controller@8410640 10 Edge 8d0f200.usb-phy 37: 0 0 0 0 GICv2 97 Level PCIe PME 42: 0 0 0 0 GICv2 145 Level xhci-hcd:usb1 43: 94 0 0 0 GICv2 71 Level mmc1 44: 0 0 0 0 GICv2 70 Level mmc0 IPI0: 23 666 154 98 Rescheduling interrupts IPI1: 247 1053 1701 634 Function call interrupts IPI2: 0 0 0 0 CPU stop interrupts IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts IPI4: 0 0 0 0 Timer broadcast interrupts IPI5: 7 9 5 0 IRQ work interrupts IPI6: 0 0 0 0 CPU wake-up interrupts ERR: 0 Link: https://lkml.kernel.org/r/20230406220451.1583239-1-f.fainelli@gmail.com Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: raise error with reduced debugging information	Florian Fainelli
	If CONFIG_DEBUG_INFO_REDUCED is enabled in the kernel configuration, we will typically not be able to load vmlinux-gdb.py and will fail with: Traceback (most recent call last): File "/home/fainelli/work/buildroot/output/arm64/build/linux-custom/vmlinux-gdb.py", line 25, in <module> import linux.utils File "/home/fainelli/work/buildroot/output/arm64/build/linux-custom/scripts/gdb/linux/utils.py", line 131, in <module> atomic_long_counter_offset = atomic_long_type.get_type()['counter'].bitpos KeyError: 'counter' Rather be left wondering what is happening only to find out that reduced debug information is the cause, raise an eror. This was not typically a problem until e3c8d33e0d62 ("scripts/gdb: fix 'lx-dmesg' on 32 bits arch") but it has since then. Link: https://lkml.kernel.org/r/20230406215252.1580538-1-f.fainelli@gmail.com Fixes: e3c8d33e0d62 ("scripts/gdb: fix 'lx-dmesg' on 32 bits arch") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Antonio Borneo <antonio.borneo@foss.st.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: add a Radix Tree Parser	Kieran Bingham
	Linux makes use of the Radix Tree data structure to store pointers indexed by integer values. This structure is utilised across many structures in the kernel including the IRQ descriptor tables, and several filesystems. This module provides a method to lookup values from a structure given its head node. Usage: The function lx_radix_tree_lookup, must be given a symbol of type struct radix_tree_root, and an index into that tree. The object returned is a generic integer value, and must be cast correctly to the type based on the storage in the data structure. For example, to print the irq descriptor in the sparse irq_desc_tree at index 18, try the following: (gdb) print (struct irq_desc)$lx_radix_tree_lookup(irq_desc_tree, 18) This script previously existed under commit e127a73d41ac471d7e3ba950cf128f42d6ee3448 ("scripts/gdb: add a Radix Tree Parser") and was later reverted with b447e02548a3304c47b78b5e2d75a4312a8f17e1i (Revert "scripts/gdb: add a Radix Tree Parser"). This version expects the XArray based radix tree implementation and has been verified using QEMU/x86 on Linux 6.3-rc5. [f.fainelli@gmail.com: revive and update for xarray implementation] [f.fainelli@gmail.com: guard against a NULL node in the while loop] Link: https://lkml.kernel.org/r/20230405222743.1191674-1-f.fainelli@gmail.com Link: https://lkml.kernel.org/r/20230404214049.1016811-1-f.fainelli@gmail.com Signed-off-by: Kieran Bingham <kieran.bingham@linaro.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	lib/rbtree: use '+' instead of '\|' for setting color.	Noah Goldstein
	This has a slight benefit for x86 and has no effect on other targets. The benefit to x86 is it change the codegen for setting a node to block from `mov %r0, %r1; or $RB_BLACK, %r1` to `lea RB_BLACK(%r0), %r1` which saves an instructions. In all other cases it just replace ALU with ALU (or -> and) which perform the same on all machines I am aware of. Total instructions in rbtree.o: Before - 802 After - 782 so it saves about 20 `mov` instructions. Link: https://lkml.kernel.org/r/20230404221350.3806566-1-goldstein.w.n@gmail.com Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Cc: Michel Lespinasse <michel@lespinasse.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	proc/stat: remove arch_idle_time()	Heiko Carstens
	The last (only) architecture specific arch_idle_time() implementation was removed with commit be76ea614460 ("s390/idle: remove arch_cpu_idle_time() and corresponding code"). Therefore remove the now dead code in fs/proc/stat.c as well. Link: https://lkml.kernel.org/r/20230405143452.2677172-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	checkpatch: check for misuse of the link tags	Matthieu Baerts
	"Link:" and "Closes:" tags have to be used with public URLs. It is difficult to make sure the link is public but at least we can verify the tag is followed by 'http(s)://'. With that, we avoid such a tag that is not allowed [1]: Closes: <number> Now that we check the "link" tags are followed by a URL, we can relax the check linked to "Reported-by being followed by a link tag" to only verify if a "link" tag is present after the "Reported-by" one. Link: https://lore.kernel.org/linux-doc/CAHk-=wh0v1EeDV3v8TzK81nDC40=XuTdY2MCr0xy3m3FiBV3+Q@mail.gmail.com/ [1] Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-5-d26d1fa66f9f@tessares.net Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	checkpatch: allow Closes tags with links	Matthieu Baerts
	As a follow-up of a previous patch modifying the documentation to allow using the "Closes:" tag, checkpatch.pl is updated accordingly. checkpatch.pl now no longer complain when the "Closes:" tag is used by itself: commit 76f381bb77a0 ("checkpatch: warn when unknown tags are used for links") ... or after the "Reported-by:" tag: commit d7f1d71e5ef6 ("checkpatch: warn when Reported-by: is not followed by Link:") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/373 Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-4-d26d1fa66f9f@tessares.net Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	checkpatch: use a list of "link" tags	Matthieu Baerts
	The following commit will allow the use of a similar "link" tag. Because there is a possibility that other similar tags will be added in the future and to reduce the number of places where the code will be modified to allow this new tag, a list with all these "link" tags is now used. Two variables are created from it: one to search for such tags and one to print all tags in a warning message. Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-3-d26d1fa66f9f@tessares.net Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Joe Perches <joe@perches.com> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	checkpatch: don't print the next line if not defined	Matthieu Baerts
	When checking if "Reported-by" tag is followed by "Link:", there is no need to print the next line if there is no next line. While at it, also mention in this case that the "Link:" tag should be followed by a URL, similar to the next warning. By doing that, the code is now similar to what is done above when checking if the Co-developed-by tag is properly used. Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-2-d26d1fa66f9f@tessares.net Fixes: d7f1d71e5ef6 ("checkpatch: warn when Reported-by: is not followed by Link:") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	docs: process: allow Closes tags with links	Matthieu Baerts
	Since v6.3, checkpatch.pl now complains about the use of "Closes:" tags followed by a link [1]. It also complains if a "Reported-by:" tag is followed by a "Closes:" one [2]. As detailed in the first patch, this "Closes:" tag is used for a bit of time, mainly by DRM and MPTCP subsystems. It is used by some bug trackers to automate the closure of issues when a patch is accepted. It is even planned to use this tag with bugzilla.kernel.org [3]. The first patch updates the documentation to explain what is this "Closes:" tag and how/when to use it. The second patch modifies checkpatch.pl to stop complaining about it. The DRM maintainers and their mailing list have been added in Cc as they are probably interested by these two patches as well. [1] https://lore.kernel.org/all/3b036087d80b8c0e07a46a1dbaaf4ad0d018f8d5.1674217480.git.linux@leemhuis.info/ [2] https://lore.kernel.org/all/bb5dfd55ea2026303ab2296f4a6df3da7dd64006.1674217480.git.linux@leemhuis.info/ [3] https://lore.kernel.org/linux-doc/20230315181205.f3av7h6owqzzw64p@meerkat.local/ This patch (of 5): Making sure a bug tracker is up to date is not an easy task. For example, a first version of a patch fixing a tracked issue can be sent a long time after having created the issue. But also, it can take some time to have this patch accepted upstream in its final form. When it is done, someone -- probably not the person who accepted the patch -- has to remember about closing the corresponding issue. This task of closing and tracking the patch can be done automatically by bug trackers like GitLab [1], GitHub [2] and hopefully soon [3] bugzilla.kernel.org when the appropriated tag is used. The two first ones accept multiple tags but it is probably better to pick one. According to commit 76f381bb77a0 ("checkpatch: warn when unknown tags are used for links"), the "Closes" tag seems to have been used in the past by a few people and it is supported by popular bug trackers. Here is how it has been used in the past: $ git log --no-merges --format=email -P --grep='^Closes: http' \| \ grep '^Closes: http' \| cut -d/ -f3-5 \| sort \| uniq -c \| sort -rn 391 gitlab.freedesktop.org/drm/intel 79 github.com/multipath-tcp/mptcp_net-next 8 gitlab.freedesktop.org/drm/msm 3 gitlab.freedesktop.org/drm/amd 2 gitlab.freedesktop.org/mesa/mesa 1 patchwork.freedesktop.org/series/73320 1 gitlab.freedesktop.org/lima/linux 1 gitlab.freedesktop.org/drm/nouveau 1 github.com/ClangBuiltLinux/linux 1 bugzilla.netfilter.org/show_bug.cgi?id=1579 1 bugzilla.netfilter.org/show_bug.cgi?id=1543 1 bugzilla.netfilter.org/show_bug.cgi?id=1436 1 bugzilla.netfilter.org/show_bug.cgi?id=1427 1 bugs.debian.org/625804 Likely here, the "Closes" tag was only properly used with GitLab and GitHub. We can also see that it has been used quite a few times (and still used recently) and this is then not a "random tag that makes no sense" like it was the case with "BugLink" recently [4]. It has also been misused but that was a long time ago, when it was common to use many different random tags. checkpatch.pl script should then stop complaining about this "Closes" tag. As suggested by Thorsten [5], if this tag is accepted, it should first be described in the documentation. This is what is done here in this patch. To avoid confusion, the "Closes" should be used with any public bug report. No need to check if the underlying bug tracker supports automations. Having this tag with any kind of public bug reports allows bots like regzbot to clearly identify patches fixing a specific bug and avoid false-positives, e.g. patches mentioning it is related to an issue but not fixing it. As suggested by Thorsten [6] again, if we follow the same logic, the "Closes" tag should then be used after a "Reported-by" one. Note that thanks to this "Closes" tag, the mentioned bug trackers can also locate where a patch has been applied in different branches and repositories. If only the "Link" tag is used, the tracking can also be done but the ticket will not be closed and a manual operation will be needed. Also, these bug trackers have some safeguards: the closure is only done if a commit having the "Closes:" tag is applied in a specific branch. It will then not be closed if a random commit having the same tag is published elsewhere. Also in case of closure, a notification is sent to the owners. Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-0-d26d1fa66f9f@tessares.net Link: https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#default-closing-pattern [1] Link: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests [2] Link: https://lore.kernel.org/linux-doc/20230315181205.f3av7h6owqzzw64p@meerkat.local/ [3] Link: https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@mail.gmail.com/ [4] Link: https://lore.kernel.org/all/688cd6cb-90ab-6834-a6f5-97080e39ca8e@leemhuis.info/ [5] Link: https://lore.kernel.org/linux-doc/2194d19d-f195-1a1e-41fc-7827ae569351@leemhuis.info/ [6] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/373 Link: https://lkml.kernel.org/r/20230314-doc-checkpatch-closes-tag-v4-1-d26d1fa66f9f@tessares.net Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Suggested-by: Thorsten Leemhuis <linux@leemhuis.info> Acked-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Acked-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: fix lx-timerlist for HRTIMER_MAX_CLOCK_BASES printing	Peng Liu
	HRTIMER_MAX_CLOCK_BASES is of enum type hrtimer_base_type. To print it as an integer, HRTIMER_MAX_CLOCK_BASES should be converted first. Link: https://lkml.kernel.org/r/TYCP286MB214640FF0E7F04AC3926A39EC6819@TYCP286MB2146.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Peng Liu <liupeng17@lenovo.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Kieran Bingham <kbingham@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: fix lx-timerlist for Python3	Peng Liu
	Below incompatibilities between Python2 and Python3 made lx-timerlist fail to run under Python3. o xrange() is replaced by range() in Python3 o bytes and str are different types in Python3 o the return value of Inferior.read_memory() is memoryview object in Python3 akpm: cc stable so that older kernels are properly debuggable under newer Python. Link: https://lkml.kernel.org/r/TYCP286MB2146EE1180A4D5176CBA8AB2C6819@TYCP286MB2146.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Peng Liu <liupeng17@lenovo.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	scripts/gdb: fix lx-timerlist for struct timequeue_head change	Peng Liu
	commit 511885d7061e ("lib/timerqueue: Rely on rbtree semantics for next timer") changed struct timerqueue_head, and so print_active_timers() should be changed accordingly with its way to interpret the structure. Link: https://lkml.kernel.org/r/TYCP286MB21463BD277330B26DDC18903C6819@TYCP286MB2146.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Peng Liu <liupeng17@lenovo.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	memfd: pass argument of memfd_fcntl as int	Luca Vizzarro
	The interface for fcntl expects the argument passed for the command F_ADD_SEALS to be of type int. The current code wrongly treats it as a long. In order to avoid access to undefined bits, we should explicitly cast the argument to int. This commit changes the signature of all the related and helper functions so that they treat the argument as int instead of long. Link: https://lkml.kernel.org/r/20230414152459.816046-5-Luca.Vizzarro@arm.com Signed-off-by: Luca Vizzarro <Luca.Vizzarro@arm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Kevin Brodsky <Kevin.Brodsky@arm.com> Cc: Vincenzo Frascino <Vincenzo.Frascino@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: David Laight <David.Laight@ACULAB.com> Cc: Mark Rutland <Mark.Rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	mm: Multi-gen LRU: remove wait_event_killable()	Kalesh Singh
	Android 14 and later default to MGLRU [1] and field telemetry showed occasional long tail latency (>100ms) in the reclaim path. Tracing revealed priority inversion in the reclaim path. In try_to_inc_max_seq(), when high priority tasks were blocked on wait_event_killable(), the preemption of the low priority task to call wake_up_all() caused those high priority tasks to wait longer than necessary. In general, this problem is not different from others of its kind, e.g., one caused by mutex_lock(). However, it is specific to MGLRU because it introduced the new wait queue lruvec->mm_state.wait. The purpose of this new wait queue is to avoid the thundering herd problem. If many direct reclaimers rush into try_to_inc_max_seq(), only one can succeed, i.e., the one to wake up the rest, and the rest who failed might cause premature OOM kills if they do not wait. So far there is no evidence supporting this scenario, based on how often the wait has been hit. And this begs the question how useful the wait queue is in practice. Based on Minchan's recommendation, which is in line with his commit 6d4675e60135 ("mm: don't be stuck to rmap lock on reclaim path") and the rest of the MGLRU code which also uses trylock when possible, remove the wait queue. [1] https://android-review.googlesource.com/q/I7ed7fbfd6ef9ce10053347528125dd98c39e50bf Link: https://lkml.kernel.org/r/20230413214326.2147568-1-kaleshsingh@google.com Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Suggested-by: Minchan Kim <minchan@kernel.org> Reported-by: Wei Wang <wvw@google.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	mm: workingset: update description of the source file	Yang Yang
	The calculation of workingset size is the core logic of handling refault, it had been updated several times[1][2] after workingset.c was created[3]. But the description hadn't been updated accordingly, this mismatch may confuse the readers. So we update the description to make it consistent to the code. [1] commit 34e58cac6d8f ("mm: workingset: let cache workingset challenge anon") [2] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU") [3] commit a528910e12ec ("mm: thrash detection-based file cache sizing") Link: https://lkml.kernel.org/r/202304131634494948454@zte.com.cn Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	printk: export console trace point for kcsan/kasan/kfence/kmsan	Pavankumar Kondeti
	The console tracepoint is used by kcsan/kasan/kfence/kmsan test modules. Since this tracepoint is not exported, these modules iterate over all available tracepoints to find the console trace point. Export the trace point so that it can be directly used. Link: https://lkml.kernel.org/r/20230413100859.1492323-1-quic_pkondeti@quicinc.com Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Marco Elver <elver@google.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18	mm: vmscan: refactor updating current->reclaim_state	Yosry Ahmed
	During reclaim, we keep track of pages reclaimed from other means than LRU-based reclaim through scan_control->reclaim_state->reclaimed_slab, which we stash a pointer to in current task_struct. However, we keep track of more than just reclaimed slab pages through this. We also use it for clean file pages dropped through pruned inodes, and xfs buffer pages freed. Rename reclaimed_slab to reclaimed, and add a helper function that wraps updating it through current, so that future changes to this logic are contained within include/linux/swap.h. Link: https://lkml.kernel.org/r/20230413104034.1086717-4-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Lameter <cl@linux.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: NeilBrown <neilb@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>