summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-10mm/page_alloc: remove prefetchw() on freeing page to buddy systemWei Yang
The prefetchw() is introduced from an ancient patch[1]. The change log says: The basic idea is to free higher order pages instead of going through every single one. Also, some unnecessary atomic operations are done away with and replaced with non-atomic equivalents, and prefetching is done where it helps the most. For a more in-depth discusion of this patch, please see the linux-ia64 archives (topic is "free bootmem feedback patch"). So there are several changes improve the bootmem freeing, in which the most basic idea is freeing higher order pages. And as Matthew says, "Itanium CPUs of this era had no prefetchers." I did 10 round bootup tests before and after this change, the data doesn't prove prefetchw() help speeding up bootmem freeing. The sum of the 10 round bootmem freeing time after prefetchw() removal even 5.2% faster than before. [1]: https://lore.kernel.org/linux-ia64/40F46962.4090604@sgi.com/ Link: https://lkml.kernel.org/r/20240702020931.7061-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10kernel/fork.c: put set_max_threads()/task_struct_whitelist() in __init sectionWei Yang
The functions set_max_threads() and task_struct_whitelist() are only used by fork_init() during bootup. Let's add __init tag to them. Link: https://lkml.kernel.org/r/20240701013410.17260-2-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10kernel/fork.c: get totalram_pages from memblock to calculate max_threadsWei Yang
Since we plan to move the accounting into __free_pages_core(), totalram_pages may not represent the total usable pages on system at this point when defer_init is enabled. Instead we can get the total usable pages from memblock directly. Link: https://lkml.kernel.org/r/20240701013410.17260-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10mm: remove CONFIG_MEMCG_KMEMJohannes Weiner
CONFIG_MEMCG_KMEM used to be a user-visible option for whether slab tracking is enabled. It has been default-enabled and equivalent to CONFIG_MEMCG for almost a decade. We've only grown more kernel memory accounting sites since, and there is no imaginable cgroup usecase going forward that wants to track user pages but not the multitude of user-drivable kernel allocations. Link: https://lkml.kernel.org/r/20240701153148.452230-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10mm: memcg: add cache line padding to mem_cgroup_per_nodeRoman Gushchin
Memcg v1-specific fields serve a buffer function between read-mostly and update often parts of the mem_cgroup_per_node structure. If CONFIG_MEMCG_V1 is not set and these fields are not present, an explicit cacheline padding is needed. Link: https://lkml.kernel.org/r/20240701185932.704807-2-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10mm: memcg: drop obsolete cache line padding in struct mem_cgroupRoman Gushchin
After the grouping of the cgroup v1-related fields and the corresponding reorganization of the struct mem_cgroup, the existing cache line padding doesn't make much sense anymore. Let's drop it for now and put back to new places, if necessary. Link: https://lkml.kernel.org/r/20240701185932.704807-1-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/index: add links to admin-guide docSeongJae Park
Readers of DAMON subsystem documents index would want to further learn how they can use DAMON from the user-space. Add the link to the admin guide. Link: https://lkml.kernel.org/r/20240701192706.51415-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/index: add links to designSeongJae Park
DAMON subsystem documents index page provides a short intro of DAMON core concepts. Add links to sections of the design document to let users easily browse to the details. Link: https://lkml.kernel.org/r/20240701192706.51415-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: add links to sections of DAMON sysfs interface usage docSeongJae Park
Readers of the design document would wonder how they can configure and use specific DAMON features. Add links to sections of DAMON sysfs interface usage document that provides the answers for easier browsing. Link: https://lkml.kernel.org/r/20240701192706.51415-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: remove 'Programmable Modules' section in favor of ↵SeongJae Park
'Modules' section 'Programmable Modules' section provides high level descriptions of the DAMON API-based kernel modules layer. But 'Modules' section, which is at the end of the document, provides every detail about the layer including that of 'Programmable Modules' section. Since the brief summary of the layers at the beginning of the document has a link to the 'Modules' section, browsing to the section is not that difficult. Remove 'Programmable Modules' section in favor of 'Modules' section and reducing duplicates. Link: https://lkml.kernel.org/r/20240701192706.51415-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: move 'Configurable Operations Set' section into ↵SeongJae Park
'Operations Set Layer' section 'Configurable Operations Set' section is for providing a description of the pluggability of the operations set layer. Just after that, 'Operations Set Layer' section, which is dedicated for the entire things of the layer, follows. The layout is odd, and some descriptions are duplicated. Move 'Configurable Operations Set' section into 'Operations Set Layer' and re-write some of the detailed descriptions. Link: https://lkml.kernel.org/r/20240701192706.51415-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: add links from overall architecture to sections of detailsSeongJae Park
DAMON design document briefly explains the overall layers architecture first, and then provides detailed explanations of each layer with dedicated sections. Letting readers go directly to the detailed sections for specific layers could help easy browsing of the not-very-short document. Add links from the overall summary to the sections of details. Link: https://lkml.kernel.org/r/20240701192706.51415-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/admin-guide/mm/damon/start: add access pattern snapshot exampleSeongJae Park
DAMON user-space tool (damo) provides access pattern snapshot feature, which is expected to be frequently used for real time access pattern analysis. The snapshot output is also showing what DAMON provides on its own, including the 'age' information. In contrast, the recorded access patterns, which is shown as an example usage on the quick start section, shows what users can make from what DAMON provided. It includes information that generated outside of DAMON and makes the 'age' concept bit unclear. Hence snapshot output is easier at understanding the raw realtime output of DAMON. Add the snapshot usage example on the quick start section. Link: https://lkml.kernel.org/r/20240701192706.51415-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: clarify regions merging operationSeongJae Park
DAMON design document is not explaining how min_nr_regions limit is kept, and what happens if the number of regions exceeds max_nr_regions. Add more clarification for those. Link: https://lkml.kernel.org/r/20240701192706.51415-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Docs/mm/damon/design: fix two typosSeongJae Park
Patch series "Docs/damon: minor fixups and improvements". Fixup typos, clarify regions merging operation design with recent change, add access pattern snapshot example use case, and improve readability of the design document and subsystem documents index by reorganizing/wordsmithing and adding links to other sections and/or documents for easy browsing. This patch (of 9): Fix two typos. The first one is just a simple typo: s/accurach/accuracy/ The second one is made by the author being out of their mind. 'Region Based Sampling' section of the doc is mistakenly calling the access frequency counter of region as 'nr_regions'. Fix it with the correct name, 'nr_accesses'. Link: https://lkml.kernel.org/r/20240701192706.51415-1-sj@kernel.org Link: https://lkml.kernel.org/r/20240701192706.51415-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10mm/shmem: fix input and output inconsistenciesBang Li
Commit 19eaf44954df ("mm: thp: support allocation of anonymous multi-size THP") added mTHP support for anonymous shmem. We can configure different policies through the multi-size THP sysfs interface for anonymous shmem. But when we configure the "advise" policy of /sys/kernel/mm/transparent_hugepage/hugepages-xxxkB/shmem_enabled, we cannot write the "advise", but write the "madvise", which is unreasonable. We should keep the output and input values consistent, which is more convenient for users. Link: https://lkml.kernel.org/r/20240628032327.16987-1-libang.li@antgroup.com Fixes: 61a57f1b1da9 ("mm: shmem: add multi-size THP sysfs interface for anonymous shmem") Signed-off-by: Bang Li <libang.li@antgroup.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Bang Li <libang.li@antgroup.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10selftests: centralize -D_GNU_SOURCE= to CFLAGS in lib.mkEdward Liaw
Centralize the _GNU_SOURCE definition to CFLAGS in lib.mk. Remove redundant defines from Makefiles that import lib.mk. Convert any usage of "#define _GNU_SOURCE 1" to "#define _GNU_SOURCE". This uses the form "-D_GNU_SOURCE=", which is equivalent to "#define _GNU_SOURCE". Otherwise using "-D_GNU_SOURCE" is equivalent to "-D_GNU_SOURCE=1" and "#define _GNU_SOURCE 1", which is less commonly seen in source code and would require many changes in selftests to avoid redefinition warnings. Link: https://lkml.kernel.org/r/20240625223454.1586259-2-edliaw@google.com Signed-off-by: Edward Liaw <edliaw@google.com> Suggested-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: André Almeida <andrealmeid@igalia.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <kees@kernel.org> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10tools/mm: introduce a tool to assess swap entry allocation for thp_swapoutBarry Song
Both Ryan and Chris have been utilizing the small test program to aid in debugging and identifying issues with swap entry allocation. While a real or intricate workload might be more suitable for assessing the correctness and effectiveness of the swap allocation policy, a small test program presents a simpler means of understanding the problem and initially verifying the improvements being made. Let's endeavor to integrate it into tools/mm. Although it presently only accommodates 64KB and 4KB, I'm optimistic that we can expand its capabilities to support multiple sizes and simulate more complex systems in the future as required. Basically, we have 1. Use MADV_PAGEPUT for rapid swap-out, putting the swap allocation code under high exercise in a short time. 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in freeing memory, as well as for munmap, app exits, or OOM killer scenarios. This ensures new mTHP is always generated, released or swapped out, similar to the behavior on a PC or Android phone where many applications are frequently started and terminated. 3. Swap in with or without the "-a" option to observe how fragments due to swap-in and the incoming swap-in of large folios will impact swap-out fallback. Due to 2, we ensure a certain proportion of mTHP. Similarly, because of 3, we maintain a certain proportion of small folios, as we don't support large folios swap-in, meaning any swap-in will immediately result in small folios. Therefore, with both 2 and 3, we automatically achieve a system containing both mTHP and small folios. Additionally, 1 provides the ability to continuously swap them out. We can also use "-s" to add a dedicated small folios memory area. [akpm@linux-foundation.org: thp_swap_allocator_test.c needs mman.h, per Kairui Song] Link: https://lkml.kernel.org/r/20240622071231.576056-2-21cnbao@gmail.com Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: Chris Li <chrisl@kernel.org> Tested-by: Chris Li <chrisl@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Ryan Roberts <ryan.roberts@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-10Merge tag 'vfio-v6.10' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fix from Alex Williamson: - Recent stable backports are exposing a bug introduced in the v6.10 development cycle where a counter value is uninitialized. This leads to regressions in userspace drivers like QEMU where where the kernel might ask for an arbitrary buffer size or return out of memory itself based on a bogus value. Zero initialize the counter. (Yi Liu) * tag 'vfio-v6.10' of https://github.com/awilliam/linux-vfio: vfio/pci: Init the count variable in collecting hot-reset devices
2024-07-10Merge branch 'use network helpers, part 8'Martin KaFai Lau
Geliang Tang says: ==================== v11: - new patches 2, 4, 6. - drop expect_errno from network_helper_opts as Eduard and Martin suggested. - drop sockmap_ktls patches from this set. v10: - a new patch 10 is added. - patches 1-6, 8-9 unchanged, only commit logs updated. - "err = -errno" is used in patches 7, 11, 12 to get the real error number before checking value of "err". v9: - new patches 5-7, new struct member expect_errno for network_helper_opts. - patches 1-4, 8-9 unchanged. - update patches 10-11 to make sure all tests pass. v8: - only patch 8 updated, to fix errors reported by CI. v7: - address Martin's comments in v6. (thanks) - use MAX(opts->backlog, 0) instead of opts->backlog. - use connect_to_fd_opts instead connect_to_fd. - more ASSERT_* to check errors. v6: - update patch 6 as Daniel suggested. (thanks) v5: - keep make_server and make_client as Eduard suggested. v4: - a new patch to use make_sockaddr in sockmap_ktls. - a new patch to close fd in error path in drop_on_reuseport. - drop make_server() in patch 7. - drop make_client() too in patch 9. v3: - a new patch to add backlog for network_helper_opts. - use start_server_str in sockmap_ktls now, not start_server. v2: - address Eduard's comments in v1. (thanks) - fix errors reported by CI. This patch set uses network helpers in sk_lookup, and drop the local helpers inetaddr_len() and make_socket(). ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Use connect_fd_to_fd in sk_lookupGeliang Tang
This patch uses public helper connect_fd_to_fd() exported in network_helpers.h instead of using getsockname() + connect() in run_lookup_prog() in prog_tests/sk_lookup.c. This can simplify the code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/7077c277cde5a1864cdc244727162fb75c8bb9c5.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Use start_server_addr in sk_lookupGeliang Tang
This patch uses public helper start_server_addr() in udp_recv_send() in prog_tests/sk_lookup.c to simplify the code. And use ASSERT_OK_FD() to check fd returned by start_server_addr(). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/f11cabfef4a2170ecb66a1e8e2e72116d8f621b3.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Use start_server_str in sk_lookupGeliang Tang
This patch uses public helper start_server_str() to simplify make_server() in prog_tests/sk_lookup.c. Add a callback setsockopts() to do all sockopts, set it to post_socket_cb pointer of struct network_helper_opts. And add a new struct cb_opts to save the data needed to pass to the callback. Then pass this network_helper_opts to start_server_str(). Also use ASSERT_OK_FD() to check fd returned by start_server_str(). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/5981539f5591d2c4998c962ef2bf45f34c940548.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Close fd in error path in drop_on_reuseportGeliang Tang
In the error path when update_lookup_map() fails in drop_on_reuseport in prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed. This patch fixes this by using "goto close_srv1" lable instead of "detach" to close "server1" in this case. Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Add ASSERT_OK_FD macroGeliang Tang
Add a new dedicated ASSERT macro ASSERT_OK_FD to test whether a socket FD is valid or not. It can be used to replace macros ASSERT_GT(fd, 0, ""), ASSERT_NEQ(fd, -1, "") or statements (fd < 0), (fd != -1). Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/ded75be86ac630a3a5099739431854c1ec33f0ea.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10selftests/bpf: Add backlog for network_helper_optsGeliang Tang
Some callers expect __start_server() helper to pass their own "backlog" value to listen() instead of the default of 1. So this patch adds struct member "backlog" for network_helper_opts to allow callers to set "backlog" value via start_server_str() helper. listen(fd, 0 /* backlog */) can be used to enforce syncookie. Meaning backlog 0 is a legit value. Using 0 as a default and changing it to 1 here is fine. It makes the test program easier to write for the common case. Enforcing syncookie mode by using backlog 0 is a niche use case but it should at least have a way for the caller to do that. Thus, -ve backlog value is used here for the syncookie use case. Please see the comment in network_helpers.h for the details. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/1660229659b66eaad07aa2126e9c9fe217eba0dd.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-10Merge tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: - Switch some asserts to WARN() - Fix a few "transaction not locked" asserts in the data read retry paths and backpointers gc - Fix a race that would cause the journal to get stuck on a flush commit - Add missing fsck checks for the fragmentation LRU - The usual assorted ssorted syzbot fixes * tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs: (22 commits) bcachefs: Add missing bch2_trans_begin() bcachefs: Fix missing error check in journal_entry_btree_keys_validate() bcachefs: Warn on attempting a move with no replicas bcachefs: bch2_data_update_to_text() bcachefs: Log mount failure error code bcachefs: Fix undefined behaviour in eytzinger1_first() bcachefs: Mark bch_inode_info as SLAB_ACCOUNT bcachefs: Fix bch2_inode_insert() race path for tmpfiles closures: fix closure_sync + closure debugging bcachefs: Fix journal getting stuck on a flush commit bcachefs: io clock: run timer fns under clock lock bcachefs: Repair fragmentation_lru in alloc_write_key() bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref() bcachefs: bch2_btree_write_buffer_maybe_flush() bcachefs: Add missing printbuf_tabstops_reset() calls bcachefs: Fix loop restart in bch2_btree_transactions_read() bcachefs: Fix bch2_read_retry_nodecode() bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs bcachefs: Fix shift greater than integer size bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning ...
2024-07-10selftests/bpf: fix compilation failure when CONFIG_NF_FLOW_TABLE=mAlan Maguire
In many cases, kernel netfilter functionality is built as modules. If CONFIG_NF_FLOW_TABLE=m in particular, progs/xdp_flowtable.c (and hence selftests) will fail to compile, so add a ___local version of "struct flow_ports". Fixes: c77e572d3a8c ("selftests/bpf: Add selftest for bpf_xdp_flow_lookup kfunc") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20240710150051.192598-1-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-10mailbox: Add support for QTI CPUCP mailbox controllerSibi Sankar
Add support for CPUSS Control Processor (CPUCP) mailbox controller, this driver enables communication between AP and CPUCP by acting as a doorbell between them. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10dt-bindings: mailbox: qcom: Add CPUCP mailbox controller bindingsSibi Sankar
Add devicetree binding for CPUSS Control Processor (CPUCP) mailbox controller. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10dt-bindings: remoteproc: qcom,sa8775p-pas: Document the SA8775p ADSP, CDSP ↵Bartosz Golaszewski
and GPDSP Document the components used to boot the ADSP, CDSP0, CDSP1, GPDSP0 and GPDSP1 on the SA8775p SoC. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: mtk-cmdq: add missing MODULE_DESCRIPTION() macroJeff Johnson
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mailbox/mtk-cmdq-mailbox.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: bcm-pdc: remove unused struct 'pdc_dma_map'Dr. David Alan Gilbert
'pdf_dma_map' has been unused since the original commit a24532f8d17b ("mailbox: Add Broadcom PDC mailbox driver"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: imx: fix TXDB_V2 channel race conditionPeng Fan
Two TXDB_V2 channels are used between Linux and System Manager(SM). Channel0 for normal TX, Channel 1 for notification completion. The TXDB_V2 trigger logic is using imx_mu_xcr_rmw which uses read/modify/update logic. Note: clear MUB GSR BITs, the MUA side GCR BITs will also got cleared per hardware design. Channel0 Linux read GCR->modify GCR->write GCR->M33 SM->read GSR----->clear GSR |-(1)-| Channel1 Linux start in time slot(1) read GCR->modify GCR->write GCR->M33 SM->read GSR->clear GSR So Channel1 read GCR will read back the GCR that Channel0 wrote, because M33 has not finish clear GSR, this means Channel1 GCR writing will trigger Channel1 and Channel0 interrupt both which is wrong. Channel0 will be freed(SCMI channel status set to FREE) in M33 SM when processing the 1st Channel0 interrupt. So when 2nd interrupt trigger (channel 0/1 trigger together), SM will see a freed Channel0, and report protocol error. To address the issue, not using read/modify/update logic, just use write, because write 0 to GCR will be ignored. And after write MUA GCR, wait the SM to clear MUB GSR by looping MUA GCR value. Fixes: 5bfe4067d350 ("mailbox: imx: support channel type tx doorbell v2") Reviewed-by: Ranjani Vaidyanathan <ranjani.vaidyanathan@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: omap: Fix mailbox interrupt sharingAndrew Davis
Multiple mailbox users can share one interrupt line. This flag was mistakenly dropped as part of the FIFO removal. Mark the IRQ as shared. Reported-by: Beleswar Padhi <b-padhi@ti.com> Fixes: 3f58c1f4206f ("mailbox: omap: Remove kernel FIFO message queuing") Signed-off-by: Andrew Davis <afd@ti.com> Tested-by: Beleswar Padhi <b-padhi@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: mtk-cmdq: Dynamically allocate clk_bulk_data structureAngeloGioacchino Del Regno
Now that the clock probing code uses devm_kasprintf(), there is no more restriction on the number of GCEs: dynamically allocate the clk_bulk_data clocks array to improve flexibility and also to get a slight memory saving on platforms featuring only one CMDQ mailbox (and consequently only one Global Command Engine). Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: mtk-cmdq: Move and partially refactor clocks probeAngeloGioacchino Del Regno
Move the clocks probe to a new cmdq_get_clocks() function; while at it, partially refactor the code: Drop the clk_names[] array and assign clock names to the array of clk_bulk_data with devm_kasprintf() instead, slightly reduce the indentation for the multi-gce clock probe path and add a comment describing the reason why we get clocks of other GCE instance instead of just the clock from the one that it is getting probed. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10mailbox: mtk-cmdq: Stop requiring name for GCE clockAngeloGioacchino Del Regno
The Global Command Engine mailbox has only one clock hence requiring clock-names is useless. Get the first (and only) clock instead, without name checks. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-10dt-bindings: mailbox: Add mediatek,gce-props.yamlJason-JH.Lin
Add mediatek,gce-props.yaml for common GCE properties that is used for both mailbox providers and consumers. We place the common property "mediatek,gce-events" in this binding currently. The property "mediatek,gce-events" is used for GCE event ID corresponding to a hardware event signal sent by the hardware or a software driver. If the mailbox providers or consumers want to manipulate the value of the event ID, they need to know the specific event ID. Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-07-11bus: sunxi-rsb: Constify struct regmap_busJavier Carrasco
`regmap_sunxi_rsb` is not modified and can be declared as const to move its data to a read-only section. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20240705-sunxi-rsb-bus-const-regmap_bus-v1-1-129094960ce9@gmail.com Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-07-10dt-bindings: watchdog: renesas,wdt: Document RZ/G3S supportClaudiu Beznea
Document the support for the watchdog IP available on RZ/G3S SoC. The watchdog IP available on RZ/G3S SoC is identical to the one found on RZ/G2L SoC. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-10-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Add suspend/resume supportClaudiu Beznea
The RZ/G3S supports deep sleep states where power to most of the IP blocks is cut off. To ensure proper working of the watchdog when resuming from such states, the suspend function is stopping the watchdog and the resume function is starting it. There is no need to configure the watchdog in case the watchdog was stopped prior to starting suspend. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-9-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Rely on the reset driver for doing proper resetClaudiu Beznea
The reset driver has been adapted in commit da235d2fac21 ("clk: renesas: rzg2l: Check reset monitor registers") to check the reset monitor bits before declaring reset asserts/de-asserts as successful/failure operations. With that, there is no need to keep the reset workaround for RZ/V2M in place in the watchdog driver. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-8-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Remove comparison with zeroClaudiu Beznea
devm_add_action_or_reset() could return -ENOMEM or zero. Thus, remove comparison with zero of the returning value to make code simpler. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-7-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Remove reset de-assert from probeClaudiu Beznea
There is no need to de-assert the reset signal on probe as the watchdog is not used prior executing start. Also, the clocks are not enabled in probe (pm_runtime_enable() doesn't do that), thus this is another indicator that the watchdog wasn't used previously like this. Instead, keep the watchdog hardware in its previous state at probe (by default it is in reset state), enable it when it is started and move it to reset state when it is stopped. This saves some extra power when the watchdog is unused. Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-6-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Check return status of pm_runtime_put()Claudiu Beznea
pm_runtime_put() may return an error code. Check its return status. Along with it the rzg2l_wdt_set_timeout() function was updated to propagate the result of rzg2l_wdt_stop() to its caller. Fixes: 2cbc5cd0b55f ("watchdog: Add Watchdog Timer driver for RZ/G2L") Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-5-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Use pm_runtime_resume_and_get()Claudiu Beznea
pm_runtime_get_sync() may return with error. In case it returns with error dev->power.usage_count needs to be decremented. pm_runtime_resume_and_get() takes care of this. Thus use it. Along with it the rzg2l_wdt_set_timeout() function was updated to propagate the result of rzg2l_wdt_start() to its caller. Fixes: 2cbc5cd0b55f ("watchdog: Add Watchdog Timer driver for RZ/G2L") Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-4-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Make the driver depend on PMClaudiu Beznea
The rzg2l_wdt watchdog driver cannot work w/o CONFIG_PM=y (e.g. the clocks are enabled though pm_runtime_* specific APIs). To avoid building a driver that doesn't work make explicit the dependency on CONFIG_PM. Suggested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-3-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: rzg2l_wdt: Restrict the driver to ARCH_RZG2L and ARCH_R9A09G011Claudiu Beznea
The rzg2l_wdt driver is used only by ARCH_RZG2L and ARCH_R9A09G011 micro-architectures of Renesas. Thus, limit it's usage only to these. Suggested-by: Biju Das <biju.das.jz@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240531065723.1085423-2-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-07-10watchdog: imx7ulp_wdt: keep already running watchdog enabledSascha Hauer
When the bootloader enabled the watchdog before Kernel started then keep it enabled during initialization. Otherwise the time between the watchdog probing and the userspace taking over the watchdog won't be covered by the watchdog. When keeping the watchdog enabled inform the Kernel about this by setting the WDOG_HW_RUNNING so that the periodic watchdog feeder is started when desired. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240703111603.1096424-1-s.hauer@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>