Age | Commit message (Collapse) | Author |
|
Patch series "mm/gup: Unify hugetlb, speed up thp", v4.
Hugetlb has a special path for slow gup in which follow_page_mask() is
skipped completely, along with faultin_page(). This is not only
confusing, but also duplicates a lot of logic that generic gup already
has, making hugetlb slightly special.
This patchset tries to dedup the logic. It first touches up the slow gup
code so that it can handle hugetlb pages correctly with the current follow
page and faultin routines (we are mostly there already, because about 10
years ago we tried to optimize thp but stopped halfway; more below). The
last patch then drops the special path, so that hugetlb gup always goes
through the generic routine too, via faultin_page().
Note that hugetlb is still special for gup, mostly due to the pgtable
walking (hugetlb_walk()) that we rely on, which is currently per-arch. But
this is still one small step forward, and the diffstat suggests that it
is worthwhile.
Then for the "speed up thp" side: as a side effect, while looking at
this chunk of code, I found that thp support is only partially done.
That doesn't mean thp won't work for gup, but whenever a **pages
pointer is passed in, the optimization is skipped. Patch 6 should
address that, so for thp we now get full speed gup.
For a quick number, "chrt -f 1 ./gup_test -m 512 -t -L -n 1024 -r 10"
gives me 13992.50us -> 378.50us. Gup_test is an extreme case, but just to
show how it affects thp gups.
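As a quick check of what those two numbers mean, the ratio can be worked out as below. This is only a sketch of the arithmetic; the helper name speedup() is made up for illustration and is not part of gup_test.

```c
#include <assert.h>

/* Ratio of the "before" latency to the "after" latency. */
double speedup(double before_us, double after_us)
{
	assert(after_us > 0.0);
	return before_us / after_us;
}
```

Calling speedup(13992.50, 378.50) gives a factor of roughly 37 for this particular gup_test run.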
This patch (of 8):
Firstly, no_page_table() is meaningless for hugetlb, where it is a no-op,
because a hugetlb page always satisfies:
- vma_is_anonymous() == false
- vma->vm_ops->fault != NULL
So we can already safely remove it in hugetlb_follow_page_mask(), along
with the page* variable.
Meanwhile, what we do in follow_hugetlb_page() actually makes sense for a
dump: we try to fault in the page only if the page cache is already
allocated. Let's do the same here for follow_page_mask() on hugetlb.
So far it should have zero effect on real dumps, because those still go
into follow_hugetlb_page(). It may start to affect follow_page() users
who mimic a "dump page" scenario, but hopefully in a good way. This also
paves the way for unifying the hugetlb gup-slow.
Link: https://lkml.kernel.org/r/20230628215310.73782-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20230628215310.73782-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kirill A . Shutemov <kirill@shutemov.name>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
As a result of the patches "mm: Call arch_swap_restore() from
do_swap_page()" and "mm: Call arch_swap_restore() from unuse_pte()", there
are no circumstances in which a swapped-in page is installed in a page
table without first having arch_swap_restore() called on it. Therefore,
we no longer need the logic in set_pte_at() that restores the tags, so
remove it.
Link: https://lkml.kernel.org/r/20230523004312.1807357-4-pcc@google.com
Link: https://linux-review.googlesource.com/id/I8ad54476f3b2d0144ccd8ce0c1d7a2963e5ff6f3
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: kasan-dev@googlegroups.com
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: "Kuan-Ying Lee (李冠穎)" <Kuan-Ying.Lee@mediatek.com>
Cc: Qun-Wei Lin <qun-wei.lin@mediatek.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We would like to move away from requiring architectures to restore
metadata from swap in the set_pte_at() implementation, as this is not only
error-prone but adds complexity to the arch-specific code. This requires
us to call arch_swap_restore() before calling swap_free() whenever pages
are restored from swap. We are currently doing so everywhere except in
unuse_pte(); do so there as well.
Link: https://lkml.kernel.org/r/20230523004312.1807357-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/I68276653e612d64cde271ce1b5a99ae05d6bbc4f
Signed-off-by: Peter Collingbourne <pcc@google.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: "Kuan-Ying Lee (李冠穎)" <Kuan-Ying.Lee@mediatek.com>
Cc: Qun-Wei Lin <qun-wei.lin@mediatek.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
All callers of show_free_areas() pass 0 and NULL, so we can directly use
show_mem() instead of show_free_areas(0, NULL), which lets
show_free_areas() become a static function.
Link: https://lkml.kernel.org/r/20230630062253.189440-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
All callers of show_mem() pass 0 and NULL, so we can remove the two
arguments by directly calling __show_mem(0, NULL, MAX_NR_ZONES - 1) in
show_mem().
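A minimal userspace sketch of the resulting wrapper shape follows. The toy_ names, the stand-in nodemask type and the MAX_NR_ZONES value are assumptions for illustration; the real value depends on the kernel configuration and the real __show_mem() prints memory state rather than recording its arguments.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NR_ZONES 4	/* placeholder, config dependent in the kernel */

typedef struct { unsigned long bits; } toy_nodemask_t;	/* stand-in type */

/* Arguments seen by the last call, recorded so the wrapper can be checked. */
unsigned int toy_last_flags = 1;
toy_nodemask_t *toy_last_nodemask;
int toy_last_max_zone_idx = -1;

/* Stand-in for __show_mem(): only records what it was called with. */
void toy_show_mem_full(unsigned int flags, toy_nodemask_t *nodemask,
		       int max_zone_idx)
{
	assert(max_zone_idx >= 0);
	toy_last_flags = flags;
	toy_last_nodemask = nodemask;
	toy_last_max_zone_idx = max_zone_idx;
}

/* Argument-free wrapper: every caller passed 0 and NULL anyway. */
void toy_show_mem(void)
{
	toy_show_mem_full(0, NULL, TOY_MAX_NR_ZONES - 1);
}
```

Callers then simply write toy_show_mem(); the constant arguments live in one place.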
Link: https://lkml.kernel.org/r/20230630062253.189440-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The memfd_create() syscall, enabled by CONFIG_MEMFD_CREATE, is useful on
its own even when not required by CONFIG_TMPFS or CONFIG_HUGETLBFS.
Split it into its own proper bool option that can be enabled by users.
Move that option into mm/ where the code itself also lies. Also add
"select" statements to CONFIG_TMPFS and CONFIG_HUGETLBFS so they
automatically enable CONFIG_MEMFD_CREATE as before.
Link: https://lkml.kernel.org/r/20230630-config-memfd-v1-1-9acc3ae38b5a@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Tested-by: Zhangjin Wu <falcon@tinylab.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
After converting the last user to folio_raw_mapping(), we can safely
remove the function.
Link: https://lkml.kernel.org/r/20230701032853.258697-3-zhangpeng362@huawei.com
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We can replace four implicit calls to compound_head() with one by using
folio.
Link: https://lkml.kernel.org/r/20230701032853.258697-2-zhangpeng362@huawei.com
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Our test finds a WARN_ON in add_to_avail_list. During add_to_avail_list,
avail_lists is already in swap_avail_heads, which leads to this WARN_ON.
Here is the simplified calltrace:
------------[ cut here ]------------
Call trace:
add_to_avail_list+0xb8/0xc0
swap_range_free+0x110/0x138
swapcache_free_entries+0x100/0x1c0
free_swap_slot+0xbc/0xe0
put_swap_folio+0x1f0/0x2ec
delete_from_swap_cache+0x6c/0xd0
folio_free_swap+0xa4/0xe4
__try_to_reclaim_swap+0x9c/0x190
free_swap_and_cache+0x84/0x88
unmap_page_range+0x31c/0x934
unmap_single_vma.isra.0+0x48/0x84
unmap_vmas+0x98/0x10c
exit_mmap+0xa4/0x210
mmput+0x88/0x158
do_exit+0x284/0x970
do_group_exit+0x34/0x90
post_copy_siginfo_from_user32+0x0/0x1cc
do_notify_resume+0x15c/0x470
el0_svc+0x74/0x84
el0t_64_sync_handler+0xb8/0xbc
el0t_64_sync+0x190/0x194
During swapoff, try_to_unuse fails to allocate memory because of a memory
limit. This makes swapoff fail and causes the swap space to be re-inserted
into swap_list. During _enable_swap_info, this swap device is added to the
avail list even if it is full. At the same time, one entry in this full
swap device is released; we then try to add the device to the avail list
and find that it is already there. This causes the WARN_ON.
To fix this, don't add the device to the avail list if swap is full.
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20230627120833.2230766-3-mawupeng1@huawei.com
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "fix WARN_ON in add_to_avail_list".
The empty check for plist_node is performed in both add_to_avail_list and
plist_add. Drop the duplicate one in add_to_avail_list.
Link: https://lkml.kernel.org/r/20230627120833.2230766-1-mawupeng1@huawei.com
Link: https://lkml.kernel.org/r/20230627120833.2230766-2-mawupeng1@huawei.com
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using
the existing helper folio_next_index().
Link: https://lkml.kernel.org/r/20230627174349.491803-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since commit 633c0666b5a5 ("Memoryless nodes: drop one memoryless node boot
warning"), the warning for a node with no available memory is removed.
Update the corresponding comment.
Link: https://lkml.kernel.org/r/20230625033340.1054103-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The documentation of mt_next() claims that it starts the search at the
provided index. That's incorrect as it starts the search after the
provided index.
The documentation of mt_find() is slightly confusing. "Handles locking"
is not really helpful as it does not explain how the "locking" works.
Also the documentation of index talks about a range, while in reality the
index is updated on a successful search to the index of the found entry
plus one.
Fix similar issues for mt_find_after() and mt_prev().
Reword the confusing "Note: Will not return the zero entry." comment on
mt_for_each() and document @__index correctly.
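The index behaviour described for mt_find() can be illustrated with a toy lookup over a plain array. This is not the maple tree API: toy_find() and its slot array are hypothetical, entries occupy a single index, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the first non-NULL entry at or after *index, up to max.
 * On success, *index is updated to the found index plus one, so a
 * repeated call continues after the entry that was just returned.
 */
void *toy_find(void *const *slots, size_t nr_slots,
	       unsigned long *index, unsigned long max)
{
	assert(slots != NULL && index != NULL);
	for (unsigned long i = *index; i <= max && i < nr_slots; i++) {
		if (slots[i] != NULL) {
			*index = i + 1;
			return slots[i];
		}
	}
	return NULL;
}
```

Calling toy_find() in a loop with the same index variable therefore walks all entries once, which is the iteration pattern the updated index is meant to support.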
Link: https://lkml.kernel.org/r/87ttw2n556.ffs@tglx
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A folio turns into a Workingset folio during:
1) shrink_active_list(), which moves the folio from the active to the
inactive list.
2) A workingset transition that happens during the folio refault.
And when Workingset is set on a folio, PSI for memory can be accounted
a) when that folio is being reclaimed and b) on refault of that folio,
for usual reclaims.
This accounting of PSI for memory is not consistent for the reclaim +
refault operation between usual reclaim and madvise(COLD/PAGEOUT), which
deactivate or proactively reclaim a folio:
a) A folio starts on the inactive list and moves to active as a result of
accesses. Workingset is absent on the folio, so its refault after being
reclaimed through the MADV_PAGEOUT operation doesn't account for PSI.
b) The same folio transitions from inactive->active and then back to
inactive through shrink_active_list(). Workingset is set on the folio,
so its refault after being reclaimed through the MADV_PAGEOUT operation
accounts for PSI.
c) The same folio is on the active list directly as a result of folio
refault, and it was a workingset folio prior to eviction. Workingset is
set on the folio, so its refault after being reclaimed through the
MADV_PAGEOUT/MADV_COLD operation accounts for PSI.
d) MADV_COLD transfers the folio from the active list to the inactive
list. Such folios may not have Workingset set, so a refault of such a
folio doesn't account for PSI.
As said above, a refault caused by MADV_PAGEOUT on a folio accounts for
memory PSI in b) and c) but not in a). A refault caused by the reclaim
of a folio on which MADV_COLD is performed accounts for memory PSI in c)
but not in d). These behaviours are inconsistent w.r.t. the usual
reclaim + refault operation. Make this PSI accounting always consistent
by turning a folio into a workingset one whenever it leaves the active
list. Also, accounting PSI on a folio whenever it leaves the active list
as part of the MADV_COLD/PAGEOUT operation helps users judge whether
they are operating on the proper folios[1].
[1] https://lore.kernel.org/all/20230605180013.GD221380@cmpxchg.org/
Link: https://lkml.kernel.org/r/1688393201-11135-1-git-send-email-quic_charante@quicinc.com
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: Sai Manobhiram Manapragada <quic_smanapra@quicinc.com>
Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix in_flight[issue_type] value error to properly manage requests
MMC host:
- wbsd: Fix double free in the probe error path
- sunplus: Fix error path in probe
- sdhci_f_sdh30: Fix order of function calls in sdhci_f_sdh30_remove"
* tag 'mmc-v6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: f-sdh30: fix order of function calls in sdhci_f_sdh30_remove
mmc: sunplus: Fix error handling in spmmc_drv_probe()
mmc: sunplus: fix return value check of mmc_add_host()
mmc: wbsd: fix double mmc_free_host() in wbsd_init()
mmc: block: Fix in_flight[issue_type] value error
|
|
ACPI TRBE does not have a HID for identification from which a platform
device could be created and added to the platform bus. Without a platform
device, it cannot be probed and bound to a platform driver.
This creates a dummy platform device for TRBE after ascertaining that ACPI
provides required interrupts uniformly across all cpus on the system. This
device gets created inside drivers/perf/arm_pmu_acpi.c to accommodate TRBE
being built as a module.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230817055405.249630-3-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Sanity checking all the GICC tables for same interrupt number, and ensuring
a homogeneous ACPI based machine, could be used for other platform devices
as well. Hence this refactors arm_spe_acpi_register_device() into a common
helper arm_acpi_register_pmu_device().
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230817055405.249630-2-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The EFI stub is supported on RISC-V so update the documentation that
explains how the boot image header was reused to support it.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817130734.10387-3-alexghiti@rivosinc.com
|
|
This document describes the constraints and requirements of the early
boot process in a RISC-V kernel.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817130734.10387-2-alexghiti@rivosinc.com
|
|
The bootargs node is also added by the EFI stub in the function
update_fdt(), so add it to the table.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Song Shuai <songshuaishuai@tinylab.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817130734.10387-1-alexghiti@rivosinc.com
|
|
Merge series from Biju Das <biju.das.jz@bp.renesas.com>:
This patch series aims to add trivial fixes for raa215300 driver.
These issues were reported by Pavel while backporting this patch
to 6.1.y cip kernel[1].
[1] https://lore.kernel.org/all/ZN3%2FSjL50ls+3dnD@duo.ucw.cz/
v1->v2:
* Split Kconfig and add missing space for comment block as separate
patch.
Biju Das (3):
regulator: raa215300: Change rate from 32000->32768
regulator: raa215300: Add missing blank space
regulator: raa215300: Update help description
drivers/regulator/Kconfig | 6 +++++-
drivers/regulator/raa215300.c | 4 ++--
2 files changed, 7 insertions(+), 3 deletions(-)
--
2.25.1
|
|
CPU lists are parsed by bitmap_parselist(), which supports special
notations in addition to plain numbers. bitmap_parse() never
supported those and will fail if given them.
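To give a feel for list-style input, here is a toy userspace parser for the plain "N" and "N-M" forms only. toy_parselist() is hypothetical and is not the kernel's bitmap_parselist(); it ignores the special notations mentioned above and is limited to 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a comma-separated list of bit numbers and ranges ("0-3,5")
 * into a 64-bit mask. Returns 0 on success, -1 on malformed input
 * or on a bit number of 64 or more.
 */
int toy_parselist(const char *s, uint64_t *mask)
{
	assert(s != NULL && mask != NULL);
	*mask = 0;
	while (*s != '\0') {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		if (*end == '-') {		/* range form "lo-hi" */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= 64)
			return -1;
		for (unsigned long bit = lo; bit <= hi; bit++)
			*mask |= UINT64_C(1) << bit;
		if (*end == ',')
			end++;
		else if (*end != '\0')
			return -1;
		s = end;
	}
	return 0;
}
```

A parser written for one input format rejects the other, which is why the documentation has to name the right function for CPU lists.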
Fixes: b18def121f07 ("bitmap_parse: Support 'all' semantics")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817140432.507889-1-andriy.shevchenko@linux.intel.com
|
|
Commit 5f47adf762b7 ("mm/memory_hotplug: allow to specify a default
online_type") made it possible to specify a default online_type, so that
memory can be onlined to the kernel or movable zone, but failed to update
the doc. Update the doc to match this change.
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230802074312.2111074-1-mawupeng1@huawei.com
|
|
Add to process/kernel-docs.rst a book on Linux system administration
published in May 2023 (ISBN 978-1098109035).
Signed-off-by: Carlos Bilbao <carlos.bilbao@amd.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230803142417.965313-1-carlos.bilbao@amd.com
|
|
The http and git links are invalid, replace them with valid links.
Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230804112320.35592-1-minhuadotchen@gmail.com
|
|
Commit 3e3271549670 ("vfs: get rid of old '->iterate' directory operation")
removed the iterate() file_operations member, but neglected to clean up the
associated documentation. Get rid of the leftovers.
Link: https://lore.kernel.org/r/874jl945bv.fsf@meer.lwn.net
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
It is common for university researchers to want to poll the community with
online surveys, but that approach distracts developers while yielding
little in the way of useful data. Encourage alternatives instead.
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/87il9v7u55.fsf@meer.lwn.net
|
|
and fix all in-tree references.
Architecture-specific documentation is being moved into Documentation/arch/
as a way of cleaning up the top-level documentation directory and making
the docs hierarchy more closely match the source hierarchy.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230725043835.2249678-1-costa.shul@redhat.com
|
|
and fix all in-tree references.
Architecture-specific documentation is being moved into Documentation/arch/
as a way of cleaning up the top-level documentation directory and making
the docs hierarchy more closely match the source hierarchy.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230717192456.453124-1-costa.shul@redhat.com
|
|
1GB HugeTLB page consists of 262144 base pages.
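The figure follows from dividing the huge page size by the base page size. The sketch below assumes a 4 KiB base page, which the message does not state; the macro and function names are made up for illustration.

```c
#include <assert.h>

#define TOY_BASE_PAGE_SIZE 4096UL	/* assumed 4 KiB base page */
#define TOY_HUGE_PAGE_SIZE (1UL << 30)	/* 1 GiB huge page */

/* Number of base pages that make up one huge page. */
unsigned long base_pages_per_huge_page(unsigned long huge_size,
				       unsigned long base_size)
{
	assert(base_size != 0 && huge_size % base_size == 0);
	return huge_size / base_size;
}
```

With these sizes, 2^30 / 2^12 = 2^18 = 262144.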
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230207114456.2304801-1-usama.arif@bytedance.com
|
|
The code calling ima_free_kexec_buffer runs long after the memblock
allocator has already been torn down, potentially resulting in a use
after free in memblock_isolate_range.
With KASAN or KFENCE, this use after free will result in a BUG
from the idle task, and a subsequent kernel panic.
Switch ima_free_kexec_buffer over to memblock_free_late to avoid
that issue.
Fixes: fee3ff99bc67 ("powerpc: Move arch independent ima kexec functions to drivers/of/kexec.c")
Cc: stable@kernel.org
Signed-off-by: Rik van Riel <riel@surriel.com>
Suggested-by: Mike Rapoport <rppt@kernel.org>
Link: https://lore.kernel.org/r/20230817135759.0888e5ef@imladris.surriel.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Clean up the driver to use the pm_ptr() and *_PM_OPS() macros, which
make it simpler and allow the compiler to remove those functions
if built without CONFIG_PM and CONFIG_PM_SLEEP support.
lp_gpio_resume() is also assigned to the .thaw and .restore members.
This is not a problem, as the function enables input pins that
had been disabled by firmware, and repeating that doesn't change
the pin configuration, i.e. it is idempotent.
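The idea behind such a pointer macro can be sketched in userspace as follows. Everything prefixed toy_ or TOY_ is hypothetical: the macro only mimics the general shape of a "pointer if enabled, else NULL" helper, and toy_resume() stands in for an idempotent resume callback.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CONFIG_PM 0	/* pretend the build has no PM support */

/* Yields ptr when PM is enabled and NULL otherwise. */
#define toy_pm_ptr(ptr) (TOY_CONFIG_PM ? (ptr) : NULL)

struct toy_pm_ops {
	int (*resume)(int *pin_enabled);
};

/* Idempotent: enabling an already enabled pin changes nothing. */
static int toy_resume(int *pin_enabled)
{
	assert(pin_enabled != NULL);
	*pin_enabled = 1;
	return 0;
}

/*
 * Still referenced in the source below, so no #ifdef is needed, but
 * with TOY_CONFIG_PM set to 0 the reference folds away and the
 * compiler is free to discard this static object.
 */
static const struct toy_pm_ops toy_ops = { .resume = toy_resume };

struct toy_driver {
	const struct toy_pm_ops *pm;
};

const struct toy_driver toy_drv = { .pm = toy_pm_ptr(&toy_ops) };
```

Because toy_resume() always leaves the pin enabled, running it for resume, thaw and restore alike yields the same final state.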
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20230717172821.62827-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
The CS42L43 is an audio CODEC with integrated MIPI SoundWire interface
(Version 1.2.1 compliant), I2C, SPI, and I2S/TDM interfaces designed
for portable applications. It provides a high dynamic range, stereo
DAC for headphone output, two integrated Class D amplifiers for
loudspeakers, and two ADCs for wired headset microphone input or
stereo line input. PDM inputs are provided for digital microphones.
The SPI component incorporates a SPI controller interface for
communication with other peripheral components.
Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Signed-off-by: Maciej Strozek <mstrozek@opensource.cirrus.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230804104602.395892-6-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Fixes two issues with the Qualcomm SA8775P platform:
- Some minor device tree binding flunky that is nice to iron out but
more importantly:
- Support the increased interrupt targets mask from 3 to 4 bits,
making interrupts with higher (hardware) numbers work"
* tag 'pinctrl-v6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: Add intr_target_width field to support increased number of interrupt targets
dt-bindings: pinctrl: qcom,sa8775p-tlmm: add gpio function constant
|
|
Merge tag 'ib-mfd-pinctrl-soundwire-v6.6' of https://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into tmp
Immutable branch between MFD, Pinctrl and soundwire due for the v6.6 merge window
|
|
Clean up the driver to use the pm_ptr() and *_PM_OPS() macros, which
make it simpler and allow the compiler to remove those functions
if built without CONFIG_PM and CONFIG_PM_SLEEP support.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20230717172821.62827-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Fix typos in Documentation/devicetree/bindings. The changes are in
descriptions or comments where they shouldn't affect functionality.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230814212822.193684-3-helgaas@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"As usual, mostly DT fixes for the major Arm platforms from Qualcomm
and NXP, plus a bit for Rockchips and others:
The Qualcomm fixes mainly deal with their higher-end arm64 device
trees, fixing issues in L3 interconnect, crypto, thermal, UFS and a
regression for the DSI phy.
NXP i.MX has two correctness fixes for the 64-bit chips, dealing with
the imx93 "anatop" module and the CSI interface. On the 32-bit side,
there are functional fixes for RTC, display and SD card interfaces.
Rockchip fixes are for wifi support on certain boards, eMMC
stability and DT build warnings.
On TI OMAP, a regulator is described in DT to avoid problems with the
ethernet phy initialization.
The code changes include a missing MMIO serialization on OMAP, plus a
few minor fixes on ASpeed and AMD/Zynq chips"
* tag 'soc-fixes-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
ARM: dts: am335x-bone-common: Add vcc-supply for on-board eeprom
ARM: dts: am335x-bone-common: Add GPIO PHY reset on revision C3 board
soc: aspeed: socinfo: Add kfree for kstrdup
soc: aspeed: uart-routing: Use __sysfs_match_string
ARM: dts: integrator: fix PCI bus dtc warnings
arm64: dts: imx93: Fix anatop node size
arm64: dts: qcom: sc7180: Fix DSI0_PHY reg-names
ARM: dts: imx: Set default tuning step for imx6sx usdhc
arm64: dts: imx8mm: Drop CSI1 PHY reference clock configuration
arm64: dts: imx8mn: Drop CSI1 PHY reference clock configuration
ARM: dts: imx: Set default tuning step for imx7d usdhc
ARM: dts: imx6: phytec: fix RTC interrupt level
ARM: dts: imx6sx: Remove LDB endpoint
arm64: dts: rockchip: Fix Wifi/Bluetooth on ROCK Pi 4 boards
ARM: zynq: Explicitly include correct DT includes
arm64: dts: qcom: sa8775p-ride: Update L4C parameters
arm64: dts: rockchip: minor whitespace cleanup around '='
arm64: dts: rockchip: Disable HS400 for eMMC on ROCK 4C+
arm64: dts: rockchip: Disable HS400 for eMMC on ROCK Pi 4
arm64: dts: rockchip: add missing space before { on indiedroid nova
...
|
|
Remove the zynqmp-genpd.txt binding. Add the power-domain-cells
property from the zynqmp-genpd.txt binding to the firmware binding.
Signed-off-by: Naman Trivedi Manojbhai <naman.trivedimanojbhai@amd.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230816130309.1338446-1-naman.trivedimanojbhai@amd.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic regression fix from Arnd Bergmann:
"Just one partial revert for a commit from the merge window that caused
annoying behavior when building old kernels on arm64 hosts"
* tag 'asm-generic-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: partially revert "Unify uapi bitsperlong.h for arm64, riscv and loongarch"
|
|
This patch adds selftests that exercise kfunc flavor relocation
functionality added in the previous patch. The actual kfunc defined
in kernel/bpf/helpers.c is:
struct task_struct *bpf_task_acquire(struct task_struct *p)
The following relocation behaviors are checked:
struct task_struct *bpf_task_acquire___one(struct task_struct *name)
* Should succeed despite differing param name
struct task_struct *bpf_task_acquire___two(struct task_struct *p, void *ctx)
* Should fail because there is no two-param bpf_task_acquire
struct task_struct *bpf_task_acquire___three(void *ctx)
* Should fail because, despite vmlinux's bpf_task_acquire having one param,
the types don't match
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230817225353.2570845-2-davemarchevsky@fb.com
|
|
The function signature of kfuncs can change at any time due to their
intentional lack of stability guarantees. As kfuncs become more widely
used, BPF program writers will need facilities to support calling
different versions of a kfunc from a single BPF object. Consider this
simplified example based on a real scenario we ran into at Meta:
/* initial kfunc signature */
int some_kfunc(void *ptr)
/* Oops, we need to add some flag to modify behavior. No problem,
change the kfunc. flags = 0 retains original behavior */
int some_kfunc(void *ptr, long flags)
If the initial version of the kfunc is deployed on some portion of the
fleet and the new version on the rest, a fleetwide service that uses
some_kfunc will currently need to load different BPF programs depending
on which some_kfunc is available.
Luckily CO-RE provides a facility to solve a very similar problem,
struct definition changes, by allowing program writers to declare
my_struct___old and my_struct___new, with ___suffix being considered a
'flavor' of the non-suffixed name and being ignored by
bpf_core_type_exists and similar calls.
This patch extends the 'flavor' facility to the kfunc extern
relocation process. BPF program writers can now declare
extern int some_kfunc___old(void *ptr)
extern int some_kfunc___new(void *ptr, long flags)
then test which version of the kfunc exists with bpf_ksym_exists.
Relocation and verifier's dead code elimination will work in concert as
expected, allowing this pattern:
if (bpf_ksym_exists(some_kfunc___old))
some_kfunc___old(ptr);
else
some_kfunc___new(ptr, 0);
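As a rough illustration of the 'flavor' idea, the sketch below strips a
___suffix from a declared name and compares what remains with a kernel
symbol name. It is a simplified stand-in, not libbpf's actual code; the
helper names essential_name_len() and flavor_matches() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of the "essential" part of a name: everything before the last
 * "___" separator, or the whole name if no separator is found.
 * Hypothetical helper for illustration only.
 */
static size_t essential_name_len(const char *name)
{
	size_t len = strlen(name);
	size_t i;

	/* Need at least one char before and one after the separator. */
	if (len < 5)
		return len;

	/* Scan backwards so that the last "___" in the name wins. */
	for (i = len - 4; i > 0; i--) {
		if (strncmp(name + i, "___", 3) == 0)
			return i;
	}
	return len;
}

/* True if 'declared', with any flavor suffix dropped, equals 'kernel_name'. */
static bool flavor_matches(const char *declared, const char *kernel_name)
{
	size_t n = essential_name_len(declared);

	return strlen(kernel_name) == n &&
	       strncmp(declared, kernel_name, n) == 0;
}
```

With these helpers, flavor_matches("some_kfunc___old", "some_kfunc") holds
while flavor_matches("some_kfunc___new", "other_kfunc") does not. Name
matching is only the first step; the signature still has to be compatible
for the relocation to succeed.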
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com
|
|
The hwcaps selftest currently relies on the assembler being able to
assemble the crc32w instruction, but this instruction is not in the base
v8.0 architecture, so it is not accepted by the standard GCC
configurations used by many distributions. Switch to manually encoding
the instruction to fix the build.
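To show what "manually encoding" means here, the sketch below builds the
32-bit A64 instruction word for "crc32w Wd, Wn, Wm" from its register
numbers. The helper name encode_crc32w() is hypothetical, and the bit
layout is my reading of the A64 CRC32 encoding rather than something
stated in this commit. Such a word could then be emitted with an assembler
directive such as .inst instead of the mnemonic.

```c
#include <stdint.h>

/*
 * Encode "crc32w Wd, Wn, Wm" as a 32-bit A64 instruction word.
 * Assumed layout: Rm in bits 20:16, Rn in bits 9:5, Rd in bits 4:0,
 * with 0x1ac04800 as the fixed opcode bits for the 32-bit word variant.
 */
static uint32_t encode_crc32w(unsigned int rd, unsigned int rn,
			      unsigned int rm)
{
	uint32_t insn = 0x1ac04800u;

	insn |= (uint32_t)(rm & 0x1f) << 16;
	insn |= (uint32_t)(rn & 0x1f) << 5;
	insn |= (uint32_t)(rd & 0x1f);
	return insn;
}
```

For example, encode_crc32w(0, 0, 1) yields 0x1ac14800, the word for
"crc32w w0, w0, w1".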
Fixes: 09d2e95a04ad ("kselftest/arm64: add crc32 feature to hwcap test")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230816-arm64-fix-crc32-build-v1-1-40165c1290f2@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:
# bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
Attaching 1 probe...
hit
hit
[...]
^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.
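The difference between the two checks can be modelled in plain C. The
sketch below is a userspace mock, not the kernel code: struct mock_event
and every mock_* function are hypothetical stand-ins for the perf_event
structure, perf_event_output and bpf_overflow_handler().

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_event;
typedef void (*overflow_fn)(struct mock_event *ev);

struct mock_event {
	overflow_fn handler;      /* handler currently installed */
	overflow_fn orig_handler; /* handler saved when BPF was attached */
};

/* Stand-in for one of the perf_event_output functions. */
static void mock_perf_event_output(struct mock_event *ev)
{
	(void)ev;
}

/* Stand-in for the BPF handler: chains to the saved handler, if any. */
static void mock_bpf_overflow_handler(struct mock_event *ev)
{
	if (ev->orig_handler)
		ev->orig_handler(ev);
}

static bool mock_fn_is_default(overflow_fn fn)
{
	return fn == mock_perf_event_output;
}

/* Old check: only looks at the installed handler. */
static bool mock_is_default(const struct mock_event *ev)
{
	return mock_fn_is_default(ev->handler);
}

/* New check: also accepts a default handler reached indirectly. */
static bool mock_uses_default(const struct mock_event *ev)
{
	if (mock_is_default(ev))
		return true;
	return mock_fn_is_default(ev->orig_handler);
}
```

For an event whose handler is mock_bpf_overflow_handler and whose saved
handler is mock_perf_event_output, mock_is_default() is false while
mock_uses_default() is true, which is the case the patch is after.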
Signed-off-by: Tomislav Novak <tnovak@meta.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
module
It should call platform_device_unregister() instead of
platform_device_del() to unregister and free the device.
Fixes: 23a1b46f15d5 ("iommufd/selftest: Make the mock iommu driver into a real driver")
Link: https://lore.kernel.org/r/20230816081318.1232865-1-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add intel_iommu_hw_info() to report cap_reg and ecap_reg information.
Link: https://lore.kernel.org/r/20230818101033.4100-6-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add a mock_domain_hw_info function and an iommu_test_hw_info data
structure. This allows testing the IOMMU_GET_HW_INFO ioctl by passing the
test_reg value for the mock_dev.
Link: https://lore.kernel.org/r/20230818101033.4100-5-yi.l.liu@intel.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Under nested IOMMU translation, userspace owns the stage-1 translation
table (e.g. the stage-1 page table of Intel VT-d or the context table of
ARM SMMUv3, etc.). Stage-1 translation tables are vendor specific and
need to be compatible with the underlying IOMMU hardware. Hence, userspace
should know the IOMMU hardware capability before creating the stage-1
translation table and configuring it in the kernel.
This adds the IOMMU_GET_HW_INFO ioctl to query the IOMMU hardware
information (a.k.a. capability) for a given device. The returned data is
vendor specific; userspace needs to decode it with the structure
indicated by the output @out_data_type field.
As only physical devices have IOMMU hardware, this returns an error if
the given device is not a physical device.
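On the userspace side, decoding by type could look roughly like the
sketch below. It is not the real uAPI: the demo_* enum, the
demo_vendor_info layout and decode_hw_info() are hypothetical, and only
the idea of switching on the reported data type comes from the text above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags, standing in for the reported data type. */
enum demo_hw_info_type {
	DEMO_HW_INFO_TYPE_NONE = 0,
	DEMO_HW_INFO_TYPE_VENDOR = 1,
};

/* Hypothetical vendor-specific payload. */
struct demo_vendor_info {
	uint64_t cap_reg;
	uint64_t ecap_reg;
};

/*
 * Decode a returned buffer according to its type tag.
 * Returns 0 on success, -1 for an unknown type or a short buffer.
 */
static int decode_hw_info(uint32_t out_data_type, const void *data,
			  size_t len, struct demo_vendor_info *out)
{
	switch (out_data_type) {
	case DEMO_HW_INFO_TYPE_VENDOR:
		if (len < sizeof(*out))
			return -1;
		memcpy(out, data, sizeof(*out));
		return 0;
	default:
		/* No data, or a type this program does not understand. */
		return -1;
	}
}
```

The length check matters because the caller cannot assume the kernel
filled the whole structure it expects.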
Link: https://lore.kernel.org/r/20230818101033.4100-4-yi.l.liu@intel.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Introduce a new iommu op to get the IOMMU hardware capabilities for
iommufd. This information will be used by any vIOMMU driver which is owned
by userspace.
This op chooses to make the special parameters opaque to the core. This
suits the current usage model where accessing any of the IOMMU device
special parameters does require a userspace driver that matches the kernel
driver. If a need for common parameters, implemented similarly by several
drivers, arises then there's room in the design to grow a generic
parameter set as well. No wrapper API is added as it is supposed to be
used by iommufd only.
Different IOMMU hardware has different hardware information, so the
information reported differs as well. To let the external user understand
the difference, enum iommu_hw_info_type is defined. An iommu driver that
is capable of reporting hardware information should have a unique
iommu_hw_info_type and return it to the caller. For a driver that doesn't
report hardware information, the caller just uses
IOMMU_HW_INFO_TYPE_NONE if a type is required.
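The optional-op pattern with a "none" fallback can be sketched as below.
This is a userspace mock with simplified signatures: struct
mock_iommu_ops, the mock_* enum values and query_hw_info() are all
hypothetical and do not reproduce the kernel's op prototype.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags, modelled on the enum described above. */
enum mock_hw_info_type {
	MOCK_HW_INFO_TYPE_NONE = 0,
	MOCK_HW_INFO_TYPE_TEST = 1,
};

struct mock_iommu_ops {
	/* Optional: returns opaque driver data, sets *length and *type. */
	const void *(*hw_info)(uint32_t *length, uint32_t *type);
};

static const uint32_t mock_test_reg = 0xfeedbeef;

/* A driver that reports hardware information with its own type. */
static const void *mock_driver_hw_info(uint32_t *length, uint32_t *type)
{
	*length = sizeof(mock_test_reg);
	*type = MOCK_HW_INFO_TYPE_TEST;
	return &mock_test_reg;
}

/* Caller side: fall back to the "none" type if the op is missing. */
static const void *query_hw_info(const struct mock_iommu_ops *ops,
				 uint32_t *length, uint32_t *type)
{
	if (!ops->hw_info) {
		*length = 0;
		*type = MOCK_HW_INFO_TYPE_NONE;
		return NULL;
	}
	return ops->hw_info(length, type);
}
```

The caller never inspects the payload itself; it only forwards the
pointer, length and type, which keeps the driver data opaque to it.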
Link: https://lore.kernel.org/r/20230818101033.4100-3-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
dev_iommu_ops() is essentially only used in the iommu subsystem, so move
it to a private header to keep other drivers from abusing it.
Link: https://lore.kernel.org/r/20230818101033.4100-2-yi.l.liu@intel.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|