Age | Commit message (Collapse) | Author |
|
Pull documentation updates from Jonathan Corbet:
"It was a moderately busy cycle for documentation; highlights include:
- After a long period of inactivity, the Japanese translations are
seeing some much-needed maintenance and updating.
- Reworked IOMMU documentation
- Some new documentation for static-analysis tools
- A new overall structure for the memory-management documentation.
This is an LSFMM outcome that, it is hoped, will help encourage
developers to fill in the many gaps. Optimism is eternal...but
hopefully it will work.
- More Chinese translations.
Plus the usual typo fixes, updates, etc"
* tag 'docs-5.19' of git://git.lwn.net/linux: (70 commits)
docs: pdfdocs: Add space for chapter counts >= 100 in TOC
docs/zh_CN: Add dev-tools/gdb-kernel-debugging.rst Chinese translation
input: Docs: correct ntrig.rst typo
input: Docs: correct atarikbd.rst typos
MAINTAINERS: Become the docs/zh_CN maintainer
docs/zh_CN: fix devicetree usage-model translation
mm,doc: Add new documentation structure
Documentation: drop more IDE boot options and ide-cd.rst
Documentation/process: use scripts/get_maintainer.pl on patches
MAINTAINERS: Add entry for DOCUMENTATION/JAPANESE
docs/trans/ja_JP/howto: Don't mention specific kernel versions
docs/ja_JP/SubmittingPatches: Request summaries for commit references
docs/ja_JP/SubmittingPatches: Add Suggested-by as a standard signature
docs/ja_JP/SubmittingPatches: Randy has moved
docs/ja_JP/SubmittingPatches: Suggest the use of scripts/get_maintainer.pl
docs/ja_JP/SubmittingPatches: Update GregKH links
Documentation/sysctl: document max_rcu_stall_to_panic
Documentation: add missing angle bracket in cgroup-v2 doc
Documentation: dev-tools: use literal block instead of code-block
docs/zh_CN: add vm numa translation
...
|
|
Almost all other initialization of variables in f2fs_fill_super are
extraced to a single function. Also do it for write_io[], which can
make code more clean.
This patch just refactors the code, theres no functional change.
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[Jaegeuk Kim: clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Use PAGE_ALIGNED macro instead of IS_ALIGNED and passing PAGE_SIZE.
Link: https://lkml.kernel.org/r/20220520021833.121405-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The default "timeout" for one kselftest is 45 seconds, while some cases in
run_vmtests.sh require more time. This will cause testing timeout like:
not ok 4 selftests: vm: run_vmtests.sh # TIMEOUT 45 seconds
Therefore, add the "settings" file with timeout variable so users can set
the "timeout" value.
Link: https://lkml.kernel.org/r/20220521083825.319654-4-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The "test_hmm.sh" file used by run_vmtests.sh dose not be installed into
INSTALL_PATH. Thus run_vmtests.sh can not call it in INSTALL_PATH:
---------------------------
running ./test_hmm.sh smoke
---------------------------
./run_vmtests.sh: line 74: ./test_hmm.sh: No such file or directory
[FAIL]
-----------------------
Add "test_hmm.sh" to TEST_FILES so that it will be installed.
Link: https://lkml.kernel.org/r/20220521083825.319654-3-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
in ksm_tests
Patch series "selftests: vm: a few fixup patches".
This series contains three fixup patches for vm selftests. They are
independent. Please see the patches.
This patch (of 3):
Currently, ksm_tests operates "merge_across_nodes" with NUMA either
enabled or disabled. In a system with NUMA disabled, these operations
will fail and output a misleading report given "merge_across_nodes" does
not exist in sysfs:
----------------------------
running ./ksm_tests -M -p 10
----------------------------
f /sys/kernel/mm/ksm/merge_across_nodes
fopen: No such file or directory
Cannot save default tunables
[FAIL]
----------------------
So check numa_available() before those operations to skip them if NUMA is
disabled.
Link: https://lkml.kernel.org/r/20220521083825.319654-1-patrick.wang.shcn@gmail.com
Link: https://lkml.kernel.org/r/20220521083825.319654-2-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add newly added migration test object to .gitignore file.
Link: https://lkml.kernel.org/r/20220521094313.166505-1-usama.anjum@collabora.com
Fixes: 0c2d08728470 ("mm: add selftests for migration entries")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.
Link: https://lkml.kernel.org/r/20220521111145.81697-80-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.
Link: https://lkml.kernel.org/r/20220521111145.81697-94-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Introduce process_mrelease syscall sanity tests which include tests
which expect to fail:
- process_mrelease with invalid pidfd and flags inputs
- process_mrelease on a live process with no pending signals
and valid process_mrelease usage which is expected to succeed. Because
process_mrelease has to be used against a process with a pending SIGKILL,
it's possible that the process exits before process_mrelease gets called.
In such cases we retry the test with a victim that allocates twice more
memory up to 1GB. This would require the victim process to spend more
time during exit and process_mrelease has a better chance of catching the
process before it exits and succeeding.
On success the test reports the amount of memory the child had to allocate
for reaping to succeed. Sample output:
$ mrelease_test
Success reaping a child with 1MB of memory allocations
On failure the test reports the failure. Sample outputs:
$ mrelease_test
All process_mrelease attempts failed!
$ mrelease_test
process_mrelease: Invalid argument
Link: https://lkml.kernel.org/r/20220518204316.13131-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit 3a235693d3930e1276c8d9cc0ca5807ef292cf0a.
Its premise was that cgroup reclaim cares about freeing memory inside the
cgroup, and demotion just moves them around within the cgroup limit.
Hence, pages from toptier nodes should be reclaimed directly.
However, with NUMA balancing now doing tier promotions, demotion is part
of the page aging process. Global reclaim demotes the coldest toptier
pages to secondary memory, where their life continues and from which they
have a chance to get promoted back. Essentially, tiered memory systems
have an LRU order that spans multiple nodes.
When cgroup reclaims pages coming off the toptier directly, there can be
colder pages on lower tier nodes that were demoted by global reclaim.
This is an aging inversion, not unlike if cgroups were to reclaim directly
from the active lists while there are inactive pages.
Proactive reclaim is another factor. The goal of that it is to offload
colder pages from expensive RAM to cheaper storage. When lower tier
memory is available as an intermediate layer, we want offloading to take
advantage of it instead of bypassing to storage.
Revert the patch so that cgroups respect the LRU order spanning the memory
hierarchy.
Of note is a specific undercommit scenario, where all cgroup limits in the
system add up to <= available toptier memory. In that case, shuffling
pages out to lower tiers first to reclaim them from there is inefficient.
This is something could be optimized/short-circuited later on (although
care must be taken not to accidentally recreate the aging inversion).
Let's ensure correctness first.
Link: https://lkml.kernel.org/r/20220518190911.82400-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
By printing information, we can friendly prompt the status change
information of kfence by dmesg and record by syslog.
Also, set kfence_enabled to false only when needed.
Link: https://lkml.kernel.org/r/20220518073105.3160335-1-liu.yun@linux.dev
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Co-developed-by: Marco Elver <elver@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
percpu_alloc_percpu event trace"
Fix sparse warning about incorrect gfp_t cast.
Link: https://lkml.kernel.org/r/001979f3-e978-0998-cbed-61a4a2ac87b8@openvz.org
Fixes: f67bed134a05 ("percpu: improve percpu_alloc_percpu event trace")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
conversion"
Redefines __def_gfpflag_names array according to akpm@, willy@ and Joe
Perches recommendations.
Link: https://lkml.kernel.org/r/6f811e19-41c6-f3e8-fca6-23a19a62e313@openvz.org
Fixes: fe573327ffb1 ("tracing: incorrect gfp_t conversion")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In isolate_single_pageblock() called by start_isolate_page_range(), there
are some pageblock isolation issues causing a potential infinite loop when
isolating a page range. This is reported by Qian Cai.
1. the pageblock was isolated by just changing pageblock migratetype
without checking unmovable pages. Calling set_migratetype_isolate() to
isolate pageblock properly.
2. an off-by-one error caused migrating pages unnecessarily, since the page
is not crossing pageblock boundary.
3. migrating a compound page across pageblock boundary then splitting the
free page later has a small race window that the free page might be
allocated again, so that the code will try again, causing an potential
infinite loop. Temporarily set the to-be-migrated page's pageblock to
MIGRATE_ISOLATE to prevent that and bail out early if no free page is
found after page migration.
An additional fix to split_free_page() aims to avoid crashing in
__free_one_page(). When the free page is split at the specified
split_pfn_offset, free_page_order should check both the first bit of
free_page_pfn and the last bit of split_pfn_offset and use the smaller
one. For example, if free_page_pfn=0x10000, split_pfn_offset=0xc000,
free_page_order should first be 0x8000 then 0x4000, instead of 0x4000 then
0x8000, which the original algorithm did.
[akpm@linux-foundation.org: suppress min() warning]
Link: https://lkml.kernel.org/r/20220524194756.1698351-1-zi.yan@sent.com
Fixes: b2c9e2fbba3253 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Ren <renzhengeek@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
I have been focusing on mm for the past two years. e.g. developing,
fixing bugs, reviewing related to HugeTLB system. I would like to help
Mike and other people working on HugeTLB by reviewing their work.
When I first introduced the vmemmmap reduction, I forgot to update
MAINTAINERS file. Let's update it as well. And rename "HUGETLB
FILESYSTEM" to "HUGETLB SUBSYSTEM" since some files are not only related
to filesystem but also memory management (the name of FILESYSTEM cannot
cover this area).
Link: https://lkml.kernel.org/r/20220521074103.79468-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
ZSMALLOC depends on MMU so ZRAM should also depend on MMU since 'select'
does not follow any dependency chains.
Fixes this Kconfig warning:
WARNING: unmet direct dependencies detected for ZSMALLOC
Depends on [n]: MMU [=n]
Selected by [y]:
- ZRAM [=y] && BLK_DEV [=y] && BLOCK [=y] && SYSFS [=y] && (CRYPTO_LZO [=y] || CRYPTO_ZSTD [=m] || CRYPTO_LZ4 [=m] || CRYPTO_LZ4HC [=n] || CRYPTO_842 [=n])
Link: https://lkml.kernel.org/r/20220522204027.22964-1-rdunlap@infradead.org
Fixes: b3fbd58fcbb10 ("mm: Kconfig: simplify zswap configuration")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Shmem swapoff makes no progress: the index to indices is not incremented.
But "ret" is no longer a return value, so use folio_batch_count() instead.
Link: https://lkml.kernel.org/r/c32bee8a-f0aa-245-f94e-24dd271924fa@google.com
Fixes: da08e9b79323 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If the first goto is taken, 'fd' is not opened yet (and is un-initialized).
So a direct return is safer.
Link: https://lkml.kernel.org/r/628312312eb40e0e39463a2c06415fde5295c716.1653229120.git.christophe.jaillet@wanadoo.fr
Fixes: c1a31a2f7a9c ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Offload writing printk() messages on consoles to per-console
kthreads.
It prevents soft-lockups when an extensive amount of messages is
printed. It was observed, for example, during boot of large systems
with a lot of peripherals like disks or network interfaces.
It prevents live-lockups that were observed, for example, when
messages about allocation failures were reported and a CPU handled
consoles instead of reclaiming the memory. It was hard to solve even
with rate limiting because it would need to take into account the
amount of messages and the speed of all consoles.
It is a must to have for real time. Otherwise, any printk() might
break latency guarantees.
The per-console kthreads allow to handle each console on its own
speed. Slow consoles do not longer slow down faster ones. And
printk() does not longer unpredictably slows down various code paths.
There are situations when the kthreads are either not available or
not reliable, for example, early boot, suspend, or panic. In these
situations, printk() uses the legacy mode and tries to handle
consoles immediately.
- Add documentation for the printk index.
* tag 'printk-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk, tracing: fix console tracepoint
printk: remove @console_locked
printk: extend console_lock for per-console locking
printk: add kthread console printers
printk: add functions to prefer direct printing
printk: add pr_flush()
printk: move buffer definitions into console_emit_next_record() caller
printk: refactor and rework printing logic
printk: add con_printk() macro for console details
printk: call boot_delay_msec() in printk_delay()
printk: get caller_id/timestamp after migration disable
printk: wake waiters for safe and NMI contexts
printk: wake up all waiters
printk: add missing memory barrier to wake_up_klogd()
printk: cpu sync always disable interrupts
printk: rename cpulock functions
printk/index: Printk index feature documentation
MAINTAINERS: Add printk indexing maintainers on mention of printk_index
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Conversion of slub_debug stack traces to stackdepot, allowing more
useful debugfs-based inspection for e.g. memory leak debugging.
Allocation and free debugfs info now includes full traces and is
sorted by the unique trace frequency.
The stackdepot conversion was already attempted last year but
reverted by ae14c63a9f20. The memory overhead (while not actually
enabled on boot) has been meanwhile solved by making the large
stackdepot allocation dynamic. The xfstest issues haven't been
reproduced on current kernel locally nor in -next, so the slab cache
layout changes that originally made that bug manifest were probably
not the root cause.
- Refactoring of dma-kmalloc caches creation.
- Trivial cleanups such as removal of unused parameters, fixes and
clarifications of comments.
- Hyeonggon Yoo joins as a reviewer.
* tag 'slab-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
MAINTAINERS: add myself as reviewer for slab
mm/slub: remove unused kmem_cache_order_objects max
mm: slab: fix comment for __assume_kmalloc_alignment
mm: slab: fix comment for ARCH_KMALLOC_MINALIGN
mm/slub: remove unneeded return value of slab_pad_check
mm/slab_common: move dma-kmalloc caches creation into new_kmalloc_cache()
mm/slub: remove meaningless node check in ___slab_alloc()
mm/slub: remove duplicate flag in allocate_slab()
mm/slub: remove unused parameter in setup_object*()
mm/slab.c: fix comments
slab, documentation: add description of debugfs files for SLUB caches
mm/slub: sort debugfs output by frequency of stack traces
mm/slub: distinguish and print stack traces in debugfs files
mm/slub: use stackdepot to save stack trace in objects
mm/slub: move struct track init out of set_track()
lib/stackdepot: allow requesting early initialization dynamically
mm/slub, kunit: Make slub_kunit unaffected by user specified flags
mm/slab: remove some unused functions
|
|
Commit c724c866bb70 ("linux/types.h: remove unnecessary __bitwise__")
was right that there are no users of __bitwise__ in the kernel, but it
turns out there are user space users of it that do expect it.
It is, after all, in the uapi directory, so user space usage is to be
expected.
Instead of reverting the commit completely, let's just clarify the
situation so that it doesn't happen again, and have some in-code
explanations for why that "__bitwise__" still exists.
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Link: https://lore.kernel.org/all/b5c0a68d-8387-4909-beea-f70ab9e6e3d5@kernel.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit b2a90f4fcb14 ("media: lirc: remove unused lirc features") removed
feature flags which were never implemented, but they are still used by
the lirc daemon went built from source.
Reinstate these symbols in order not to break the lirc build.
Fixes: b2a90f4fcb14 ("media: lirc: remove unused lirc features")
Link: https://lore.kernel.org/all/a0470450-ecfd-2918-e04a-7b57c1fd7694@kernel.org/
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The IXP4xx Kconfig we ended up with for mach-ixp4xx creates
as kismet warning:
WARNING: unmet direct dependencies detected for GPIO_IXP4XX
Depends on [n]: GPIOLIB [=y] && HAS_IOMEM [=y] && ARCH_IXP4XX [=y] && OF [=n]
Selected by [y]:
- ARCH_IXP4XX [=y] && <choice>
This is because it is possible to select ARCH_IXP4XX witout
OF while that selects the GPIO driver that now depends on
OF.
Fix this by creating a single ARCH_IXP4XX kconfig that selects
USE_OF.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Imre Kaloz <kaloz@openwrt.org>
Cc: Krzysztof Halasa <khalasa@piap.pl>
Link: https://lore.kernel.org/r/20220522072356.34062-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Update documentation for using the tool to support performance level
change via OOB (Out of Band) interface.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add Meteor Lake PCI ID for processor thermal device.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add Meteor Lake ACPI IDs for DPTF devices.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull OPP (Operating Performance Points) updates for 5.19-rc1 from Viresh
Kumar:
- Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
Oudjana).
- Use list iterator only inside the list_for_each_entry loop (Xiaomeng
Tong, and Jakob Koschel).
- New APIs related to finding OPP based on interconnect bandwidth
(Krzysztof Kozlowski).
- Fix the missing of_node_put() in _bandwidth_supported() (Dan
Carpenter).
- Cleanups (Krzysztof Kozlowski, and Viresh Kumar).
* tag 'opp-updates-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Reorder definition of ceil/floor helpers
opp: Add apis to retrieve opps with interconnect bandwidth
dt-bindings: opp: opp-v2-kryo-cpu: Remove SMEM
opp: use list iterator only inside the loop
opp: replace usage of found with dedicated list iterator variable
PM: opp: simplify with dev_err_probe()
OPP: call of_node_put() on error path in _bandwidth_supported()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull ARM cpufreq updates for 5.19-rc1 from Viresh Kumar:
- Tegra234 cpufreq support (Sumit Gupta).
- Mediatek cleanups and enhancements (Wan Jiabing, Rex-BC Chen, and
Jia-Wei Chang).
* tag 'cpufreq-arm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (21 commits)
cpufreq: mediatek: Add support for MT8186
cpufreq: mediatek: Link CCI device to CPU
dt-bindings: cpufreq: mediatek: Add MediaTek CCI property
cpufreq: mediatek: Fix potential deadlock problem in mtk_cpufreq_set_target
cpufreq: mediatek: Add opp notification support
cpufreq: mediatek: Refine mtk_cpufreq_voltage_tracking()
cpufreq: mediatek: Move voltage limits to platform data
cpufreq: mediatek: Unregister platform device on exit
cpufreq: mediatek: Fix NULL pointer dereference in mediatek-cpufreq
cpufreq: mediatek: Make sram regulator optional
cpufreq: mediatek: Record previous target vproc value
cpufreq: mediatek: Replace old_* with pre_*
cpufreq: mediatek: Use device print to show logs
cpufreq: mediatek: Enable clocks and regulators
cpufreq: mediatek: Remove unused headers
cpufreq: mediatek: Cleanup variables and error handling in mtk_cpu_dvfs_info_init()
cpufreq: mediatek: Use module_init and add module_exit
arm64: tegra: add node for tegra234 cpufreq
cpufreq: tegra194: Add support for Tegra234
cpufreq: tegra194: add soc data to support multiple soc
...
|
|
We're unconditionally registering sys-off handler for the legacy
pm_power_off() callback, this causes problem for platforms that don't
use power-off handlers at all and should be halted. Now reboot syscall
assumes that there is a power-off handler installed and tries to power
off system instead of halting it.
To fix the trouble, move the handler's registration to the reboot syscall
and check the pm_power_off() presence.
Fixes: 0e2110d2e910 ("kernel/reboot: Add kernel_can_power_off()")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Some older servers seem to require the workstation name during ntlmssp
to be at most 15 chars (RFC1001 name length), so truncate it before
sending when using insecure dialects.
Link: https://lore.kernel.org/r/e6837098-15d9-acb6-7e34-1923cf8c6fe1@winds.org
Reported-by: Byron Stanoszek <gandalf@winds.org>
Tested-by: Byron Stanoszek <gandalf@winds.org>
Fixes: 49bd49f983b5 ("cifs: send workstation name during ntlmssp session setup")
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
On m68k with CONFIG_VIRT=y (e.g. virt_defconfig or allmodconfig):
arch/m68k/virt/config.c: In function ‘config_virt’:
arch/m68k/virt/config.c:129:2: error: ‘mach_power_off’ undeclared (first use in this function); did you mean ‘pm_power_off’?
129 | mach_power_off = virt_halt;
| ^~~~~~~~~~~~~~
| pm_power_off
Commit 05d51e42df06f021 ("m68k: Introduce a virtual m68k machine")
introduced a new user of mach_power_off.
Convert it to the new sys-off handler API, too.
Reported-by: noreply@ellerman.id.au
Fixes: f0f7e5265b3b37b0 ("m68k: Switch to new sys-off handler API")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
cppcheck reports
[drivers/video/fbdev/xen-fbfront.c:226]: (style) Assignment of function parameter has no effect outside the function.
The value parameter 'transp' is not used, so setting it can be removed.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
When kernel handles the vm-exit caused by external interrupts and NMI,
it always sets kvm_intr_type to tell if it's dealing an IRQ or NMI. For
the PMI scenario, it could be IRQ or NMI.
However, intel_pt PMIs are only generated for HARDWARE perf events, and
HARDWARE events are always configured to generate NMIs. Use
kvm_handling_nmi_from_guest() to precisely identify if the intel_pt PMI
came from the guest; this avoids false positives if an intel_pt PMI/NMI
arrives while the host is handling an unrelated IRQ VM-Exit.
Fixes: db215756ae59 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Message-Id: <20220523140821.1345605-1-yanfei.xu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fixing side effect of the so-called opportunistic change in the commit.
Fixes: dc8a9febbab0 ("KVM: selftests: x86: Fix test failure on arch lbr capable platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220518170118.66263-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Commit ddd7ed842627 ("x86/kvm: Alloc dummy async #PF token outside of
raw spinlock") leads to the following Smatch static checker warning:
arch/x86/kernel/kvm.c:212 kvm_async_pf_task_wake()
warn: sleeping in atomic context
arch/x86/kernel/kvm.c
202 raw_spin_lock(&b->lock);
203 n = _find_apf_task(b, token);
204 if (!n) {
205 /*
206 * Async #PF not yet handled, add a dummy entry for the token.
207 * Allocating the token must be down outside of the raw lock
208 * as the allocator is preemptible on PREEMPT_RT kernels.
209 */
210 if (!dummy) {
211 raw_spin_unlock(&b->lock);
--> 212 dummy = kzalloc(sizeof(*dummy), GFP_KERNEL);
^^^^^^^^^^
Smatch thinks the caller has preempt disabled. The `smdb.py preempt
kvm_async_pf_task_wake` output call tree is:
sysvec_kvm_asyncpf_interrupt() <- disables preempt
-> __sysvec_kvm_asyncpf_interrupt()
-> kvm_async_pf_task_wake()
The caller is this:
arch/x86/kernel/kvm.c
290 DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
291 {
292 struct pt_regs *old_regs = set_irq_regs(regs);
293 u32 token;
294
295 ack_APIC_irq();
296
297 inc_irq_stat(irq_hv_callback_count);
298
299 if (__this_cpu_read(apf_reason.enabled)) {
300 token = __this_cpu_read(apf_reason.token);
301 kvm_async_pf_task_wake(token);
302 __this_cpu_write(apf_reason.token, 0);
303 wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);
304 }
305
306 set_irq_regs(old_regs);
307 }
The DEFINE_IDTENTRY_SYSVEC() is a wrapper that calls this function
from the call_on_irqstack_cond(). It's inside the call_on_irqstack_cond()
where preempt is disabled (unless it's already disabled). The
irq_enter/exit_rcu() functions disable/enable preempt.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The timer is disarmed when switching between TSC deadline and other modes;
however, the pending timer is still in-flight, so let's accurately remove
any traces of the previous mode.
Fixes: 4427593258 ("KVM: x86: thoroughly disarm LAPIC timer around TSC deadline switch")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the raw spinlock in kvm_async_pf_task_wake() before allocating the
the dummy async #PF token, the allocator is preemptible on PREEMPT_RT
kernels and must not be called from truly atomic contexts.
Opportunistically document why it's ok to loop on allocation failure,
i.e. why the function won't get stuck in an infinite loop.
Reported-by: Yajun Deng <yajun.deng@linux.dev>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Whenever x86_decode_emulated_instruction() detects a breakpoint, it
returns the value that kvm_vcpu_check_breakpoint() writes into its
pass-by-reference second argument. Unfortunately this is completely
bogus because the expected outcome of x86_decode_emulated_instruction
is an EMULATION_* value.
Then, if kvm_vcpu_check_breakpoint() does "*r = 0" (corresponding to
a KVM_EXIT_DEBUG userspace exit), it is misunderstood as EMULATION_OK
and x86_emulate_instruction() is called without having decoded the
instruction. This causes various havoc from running with a stale
emulation context.
The fix is to move the call to kvm_vcpu_check_breakpoint() where it was
before commit 4aa2691dcbd3 ("KVM: x86: Factor out x86 instruction
emulation with decoding") introduced x86_decode_emulated_instruction().
The other caller of the function does not need breakpoint checks,
because it is invoked as part of a vmexit and the processor has already
checked those before executing the instruction that #GP'd.
This fixes CVE-2022-1852.
Reported-by: Qiuhao Li <qiuhao@sysec.org>
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Fixes: 4aa2691dcbd3 ("KVM: x86: Factor out x86 instruction emulation with decoding")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311032801.3467418-2-seanjc@google.com>
[Rewrote commit message according to Qiuhao's report, since a patch
already existed to fix the bug. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
For some sev ioctl interfaces, the length parameter that is passed maybe
less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data
that PSP firmware returns. In this case, kmalloc will allocate memory
that is the size of the input rather than the size of the data.
Since PSP firmware doesn't fully overwrite the allocated buffer, these
sev ioctl interface may return uninitialized kernel slab memory.
Reported-by: Andy Nguyen <theflow@google.com>
Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Peter Gonda <pgonda@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: eaf78265a4ab3 ("KVM: SVM: Move SEV code to separate file")
Fixes: 2c07ded06427d ("KVM: SVM: add support for SEV attestation command")
Fixes: 4cfdd47d6d95a ("KVM: SVM: Add KVM_SEV SEND_START command")
Fixes: d3d1af85e2c75 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
Fixes: eba04b20e4861 ("KVM: x86: Account a variety of miscellaneous allocations")
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Peter Gonda <pgonda@google.com>
Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Set the starting uABI size of KVM's guest FPU to 'struct kvm_xsave',
i.e. to KVM's historical uABI size. When saving FPU state for usersapce,
KVM (well, now the FPU) sets the FP+SSE bits in the XSAVE header even if
the host doesn't support XSAVE. Setting the XSAVE header allows the VM
to be migrated to a host that does support XSAVE without the new host
having to handle FPU state that may or may not be compatible with XSAVE.
Setting the uABI size to the host's default size results in out-of-bounds
writes (setting the FP+SSE bits) and data corruption (that is thankfully
caught by KASAN) when running on hosts without XSAVE, e.g. on Core2 CPUs.
WARN if the default size is larger than KVM's historical uABI size; all
features that can push the FPU size beyond the historical size must be
opt-in.
==================================================================
BUG: KASAN: slab-out-of-bounds in fpu_copy_uabi_to_guest_fpstate+0x86/0x130
Read of size 8 at addr ffff888011e33a00 by task qemu-build/681
CPU: 1 PID: 681 Comm: qemu-build Not tainted 5.18.0-rc5-KASAN-amd64 #1
Hardware name: /DG35EC, BIOS ECG3510M.86A.0118.2010.0113.1426 01/13/2010
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x45
print_report.cold+0x45/0x575
kasan_report+0x9b/0xd0
fpu_copy_uabi_to_guest_fpstate+0x86/0x130
kvm_arch_vcpu_ioctl+0x72a/0x1c50 [kvm]
kvm_vcpu_ioctl+0x47f/0x7b0 [kvm]
__x64_sys_ioctl+0x5de/0xc90
do_syscall_64+0x31/0x50
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
Allocated by task 0:
(stack is not available)
The buggy address belongs to the object at ffff888011e33800
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 0 bytes to the right of
512-byte region [ffff888011e33800, ffff888011e33a00)
The buggy address belongs to the physical page:
page:0000000089cd4adb refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11e30
head:0000000089cd4adb order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 dead000000000100 dead000000000122 ffff888001041c80
raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888011e33900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff888011e33980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888011e33a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff888011e33a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888011e33b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
Fixes: be50b2065dfa ("kvm: x86: Add support for getting/setting expanded xstate buffer")
Fixes: c60427dd50ba ("x86/fpu: Add uabi_size to guest_fpu")
Reported-by: Zdenek Kaspar <zkaspar82@gmail.com>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Zdenek Kaspar <zkaspar82@gmail.com>
Message-Id: <20220504001219.983513-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fix and feature for 5.19
- ultravisor communication device driver
- fix TEID on terminating storage key ops
|
|
KVM/riscv changes for 5.19
- Added Sv57x4 support for G-stage page table
- Added range based local HFENCE functions
- Added remote HFENCE functions based on VCPU requests
- Added ISA extension registers in ONE_REG interface
- Updated KVM RISC-V maintainers entry to cover selftests support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 5.19
- Add support for the ARMv8.6 WFxT extension
- Guard pages for the EL2 stacks
- Trap and emulate AArch32 ID registers to hide unsupported features
- Ability to select and save/restore the set of hypercalls exposed
to the guest
- Support for PSCI-initiated suspend in collaboration with userspace
- GICv3 register-based LPI invalidation support
- Move host PMU event merging into the vcpu data structure
- GICv3 ITS save/restore fixes
- The usual set of small-scale cleanups and fixes
[Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated
from 4 to 6. - Paolo]
|
|
On Arch LBR capable platforms, LBR_FMT in perf capability msr is 0x3f,
so the last format test will fail. Use a true invalid format(0x30) for
the test if it's running on these platforms. Opportunistically change
the file name to reflect the tests actually carried out.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220512084046.105479-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In commit ec0671d5684a ("KVM: LAPIC: Delay trace_kvm_wait_lapic_expire
tracepoint to after vmexit", 2019-06-04), trace_kvm_wait_lapic_expire
was moved after guest_exit_irqoff() because invoking tracepoints within
kvm_guest_enter/kvm_guest_exit caused a lockdep splat.
These days this is not necessary, because commit 87fa7f3e98a1 ("x86/kvm:
Move context tracking where it belongs", 2020-07-09) restricted
the RCU extended quiescent state to be closer to vmentry/vmexit.
Moving the tracepoint back to __kvm_wait_lapic_expire is more accurate,
because it will be reported even if vcpu_enter_guest causes multiple
vmentries via the IPI/Timer fast paths, and it allows the removal of
advance_expire_delta.
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1650961551-38390-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|