Age | Commit message (Collapse) | Author |
|
The reuslt:
$ tools/testing/selftests/bpf/test_progs --name=fexit_noreturns
#99/1 fexit_noreturns/noreturns:OK
#99 fexit_noreturns:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20250318114447.75484-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When s2[i] = '\0', if s1[i] != '\0', it will be judged by ret,
and if s1[i] = '\0', it will be judegd by !s1[i].
So in reality, s2 [i] will never make a judgment
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250314031013.94480-1-yangfeng59949@163.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The pyrf_event__new() method copies the event obtained from the perf
ring buffer to a structure that will then be turned into a python object
for further consumption, so it copies perf_event.header.size bytes to
its 'event' member:
$ pahole -C pyrf_event /tmp/build/perf-tools-next/python/perf.cpython-312-x86_64-linux-gnu.so
struct pyrf_event {
PyObject ob_base; /* 0 16 */
struct evsel * evsel; /* 16 8 */
struct perf_sample sample; /* 24 312 */
/* XXX last struct has 7 bytes of padding, 2 holes */
/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
union perf_event event; /* 336 4168 */
/* size: 4504, cachelines: 71, members: 4 */
/* member types with holes: 1, total: 2 */
/* paddings: 1, sum paddings: 7 */
/* last cacheline: 24 bytes */
};
$
It was doing so without checking if the event just obtained has more
than that space, fix it.
This isn't a proper, final solution, as we need to support larger
events, but for the time being we at least bounds check and document it.
Fixes: 877108e42b1b9ba6 ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-7-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When processing tracepoints the perf python binding was parsing the
event before calling perf_mmap__consume(&md->core) in
pyrf_evlist__read_on_cpu().
But part of this event parsing was to set the perf_sample->raw_data
pointer to the payload of the event, which then could be overwritten by
other event before tracepoint fields were asked for via event.prev_comm
in a python program, for instance.
This also happened with other fields, but strings were were problems
were surfacing, as there is UTF-8 validation for the potentially garbled
data.
This ended up showing up as (with some added debugging messages):
( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 ) ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 ) ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 ) ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
( XXX '��' len=16, raw_size=68) ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last):
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
main()
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
event.next_comm,
^^^^^^^^^^^^^^^
AttributeError: 'perf.sample_event' object has no attribute 'next_comm'
When event.next_comm was asked for, the PyUnicode_FromString() python
API would fail and that tracepoint field wouldn't be available, stopping
the tools/perf/python/tracepoint.py test tool.
But, since we already do a copy of the whole event in pyrf_event__new,
just use it and while at it remove what was done in in e8968e654191390a
("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we
don't really need to wait for parsing the sample before declaring the
event as consumed.
This copy is questionable as is now, as it limits the maximum event +
sample_type and tracepoint payload to sizeof(union perf_event), this all
has been "working" because 'struct perf_event_mmap2', the largest entry
in 'union perf_event' is:
$ pahole -C perf_event ~/bin/perf | grep mmap2
struct perf_record_mmap2 mmap2; /* 0 4168 */
$
Fixes: bae57e3825a3dded ("perf python: Add support to resolve tracepoint fields")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-6-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
To avoid a leak if we have the python object but then something happens
and we need to return the operation, decrement the offset of the newly
created object.
Fixes: 377f698db12150a1 ("perf python: Add struct evsel into struct pyrf_event")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Otherwise when debugging we see just "python" in perf, top, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
When python2 support was removed in e7e9943c87d857da ("perf python:
Remove python 2 scripting support"), all use of the
_PyUnicode_FromString(arg), _PyUnicode_FromFormat(...), and
_PyLong_FromLong(arg) macros was removed as well, so remove it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Some old cut'n'paste error, its "ip", so the description should be
"event ip", not "event type".
Fixes: 877108e42b1b9ba6 ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Trigger vEVENTs by feeding an idev ID and validating the returned output
virt_ids whether they equal to the value that was set to the vDEVICE.
Link: https://patch.msgid.link/r/e829532ec0a3927d61161b7674b20e731ecd495b.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When attaching a device to a vIOMMU-based nested domain, vdev_id must be
present. Add a piece of code hard-requesting it, preparing for a vEVENTQ
support in the following patch. Then, update the TEST_F.
A HWPT-based nested domain will return a NULL new_viommu, thus no such a
vDEVICE requirement.
Link: https://patch.msgid.link/r/4051ca8a819e51cb30de6b4fe9e4d94d956afe3d.1741719725.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add test for Zicbom and its block size into CBO tests, when
Zicbom is present, test that cbo.clean/flush may be issued and works.
As the software can't verify the clean/flush functions, we just judged
that cbo.clean/flush isn't executed illegally.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-4-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"15 hotfixes. 7 are cc:stable and the remainder address post-6.13
issues or aren't considered necessary for -stable kernels.
13 are for MM and the other two are for squashfs and procfs.
All are singletons. Please see the individual changelogs for details"
* tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/page_alloc: fix memory accept before watermarks gets initialized
mm: decline to manipulate the refcount on a slab page
memcg: drain obj stock on cpu hotplug teardown
mm/huge_memory: drop beyond-EOF folios with the right number of refs
selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation
mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT
mm: memcontrol: fix swap counter leak from offline cgroup
mm/vma: do not register private-anon mappings with khugepaged during mmap
squashfs: fix invalid pointer dereference in squashfs_cache_delete
mm/migrate: fix shmem xarray update during migration
mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
mm/damon/core: initialize damos->walk_completed in damon_new_scheme()
mm/damon: respect core layer filters' allowance decision on ops layer
filemap: move prefaulting out of hot write path
proc: fix UAF in proc_get_inode()
|
|
The DSO data read test opens a file but as dsos__exit is used the test
file isn't closed. This causes the subsequent subtests in don't fork
(-F) mode to fail as one more than expected file descriptor is open.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
dso__list_del with address sanitizer and/or reference count checking
will call dso__put that can call dso__data_close reentrantly trying to
lock the dso__data_open_lock and deadlocking. Switch from pthread
mutexes to perf's mutex so that lock checking is performed in debug
builds. Add lock annotations that diagnosed the problem. Release the
dso__data_open_lock around the dso__put to avoid the deadlock.
Change the declaration of dso__data_get_fd to return a boolean,
indicating the fd is valid and the lock is held, to make it compatible
with the thread safety annotations as a try lock.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Used to annotate when locks shouldn't be held for a function or if a
function returns a lock that's used by later mutex lock unlock
operations.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Error handling doesn't check the correct return value. This patch will
fix it.
Link: https://lkml.kernel.org/r/20250312043840.71799-1-cyan.yang@sifive.com
Fixes: f4b5fd6946e2 ("selftests/vm: anon_cow: THP tests")
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
It splits page cache folios to orders from 0 to 8 at different in-folio
offset.
Link: https://lkml.kernel.org/r/20250307174001.242794-9-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Kairui Song <kasong@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Buddy allocator like (or non-uniform) folio split", v10.
This patchset adds a new buddy allocator like (or non-uniform) large folio
split from a order-n folio to order-m with m < n. It reduces
1. the total number of after-split folios from 2^(n-m) to n-m+1;
2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to
n/6-m/6, assuming XA_CHUNK_SHIFT=6;
3. keep more large folios after a split from all order-m folios to
order-(n-1) to order-m folios.
For example, to split an order-9 to order-0, folio split generates 10 (or
11 for anonymous memory) folios instead of 512, allocates 1 xa_node
instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2
order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0
folios.
Instead of duplicating existing split_huge_page*() code, __folio_split()
is introduced as the shared backend code for both
split_huge_page_to_list_to_order() and folio_split(). __folio_split() can
support both uniform split and buddy allocator like (or non-uniform)
split. All existing split_huge_page*() users can be gradually converted
to use folio_split() if possible. In this patchset, I converted
truncate_inode_partial_folio() to use folio_split().
xfstests quick group passed for both tmpfs and xfs. I also
semi-replicated Hugh's test[12] and ran it without any issue for almost 24
hours.
This patch (of 8):
A preparation patch for non-uniform folio split, which always split a
folio into half iteratively, and minimal xarray entry split.
Currently, xas_split_alloc() and xas_split() always split all slots from a
multi-index entry. They cost the same number of xa_node as the
to-be-split slots. For example, to split an order-9 entry, which takes
2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL), 8
xa_node are needed. Instead xas_try_split() is intended to be used
iteratively to split the order-9 entry into 2 order-8 entries, then split
one order-8 entry, based on the given index, to 2 order-7 entries, ...,
and split one order-1 entry to 2 order-0 entries. When splitting the
order-6 entry and a new xa_node is needed, xas_try_split() will try to
allocate one if possible. As a result, xas_try_split() would only need 1
xa_node instead of 8.
When a new xa_node is needed during the split, xas_try_split() can try to
allocate one but no more. -ENOMEM will be return if a node cannot be
allocated. -EINVAL will be return if a sibling node is split or cascade
split happens, where two or more new nodes are needed, and these are not
supported by xas_try_split().
xas_split_alloc() and xas_split() split an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
| | | |
------- --- --- -------
| | ... | |
V V V V
----------- ----------- ----------- -----------
| xa_node | | xa_node | ... | xa_node | | xa_node |
----------- ----------- ----------- -----------
xas_try_split() splits an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
|
|
V
-----------
| xa_node |
-----------
Link: https://lkml.kernel.org/r/20250307174001.242794-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250307174001.242794-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Kairui Song <kasong@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fixed some formatting specifiers errors, such as using %d for int and %u
for unsigned int, as well as other byte-length types.
Perform type cast using the type derived from the data type itself, for
example, if it's originally an int, it will be cast to unsigned int if
forced to unsigned.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250311112809.81901-3-jiayuan.chen@linux.dev
|
|
This commit adds the -Wformat-signedness compiler flag to detect and
prevent printf format errors, where signed or unsigned types are
mismatched with format specifiers. This helps to catch potential issues at
compile-time, ensuring that our code is more robust and reliable. With
this flag, the compiler will now warn about incorrect format strings, such
as using %d with unsigned types or %u with signed types.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250311112809.81901-2-jiayuan.chen@linux.dev
|
|
Add selftests to verify that it is possible to load freplace program
from user namespace if BPF token is initialized by bpf_object__prepare
before calling bpf_program__set_attach_target.
Negative test is added as well.
Modified type of the priv_prog to xdp, as kprobe did not work on aarch64
and s390x.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250317174039.161275-5-mykyta.yatsenko5@gmail.com
|
|
Pass BPF token from bpf_program__set_attach_target to
BPF_BTF_GET_FD_BY_ID bpf command.
When freplace program attaches to target program, it needs to look up
for BTF of the target, this may require BPF token, if, for example,
running from user namespace.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250317174039.161275-4-mykyta.yatsenko5@gmail.com
|
|
Currently BPF_BTF_GET_FD_BY_ID requires CAP_SYS_ADMIN, which does not
allow running it from user namespace. This creates a problem when
freplace program running from user namespace needs to query target
program BTF.
This patch relaxes capable check from CAP_SYS_ADMIN to CAP_BPF and adds
support for BPF token that can be passed in attributes to syscall.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250317174039.161275-2-mykyta.yatsenko5@gmail.com
|
|
Verify interval_tree_span_iter_xxx() helpers works as expected.
Link: https://lkml.kernel.org/r/20250310074938.26756-6-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Verify interval_tree_iter_xxx() helpers could find intersection ranges
as expected.
[sfr@canb.auug.org.au: some of tools/ uses -Wno-unused-parameter]
Link: https://lkml.kernel.org/r/20250312113612.31ac808e@canb.auug.org.au
Link: https://lkml.kernel.org/r/20250310074938.26756-5-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Current test use pseudo rand function with fixed seed, which means the
test data is the same pattern each time.
Add random seed parameter to randomize the test.
Link: https://lkml.kernel.org/r/20250310074938.26756-4-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "lib/interval_tree: add some test cases and cleanup", v2.
Since rbtree/augmented tree/interval tree share similar data structure,
besides new cases for interval tree, this patch set also does cleanup for
others.
This patch (of 7):
Currently we have some tests for rbtree related data structure, e.g.
rbtree, augmented rbtree, interval tree, in lib/ as kernel module.
To facilitate the test and debug for those fundamental data structure,
this patch enable those tests in userland.
Link: https://lkml.kernel.org/r/20250310074938.26756-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20250310074938.26756-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add emulation to support Set Feature mailbox command to cxl_test.
The only feature supported is the device patrol scrub feature. The
set feature allows activation of patrol scrub for the cxl_test
emulated device. The command does not support partial data transfer
even though the spec allows it. This restriction is to reduce complexity
of the emulation given the patrol scrub feature is very minimal.
Link: https://patch.msgid.link/r/20250307205648.1021626-8-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add emulation of Get Feature command to cxl_test. The feature for
device patrol scrub is returned by the emulation code. This is
the only feature currently supported by cxl_test. It returns
the information for the device patrol scrub feature.
Link: https://patch.msgid.link/r/20250307205648.1021626-7-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add fwctl support code to allow sending of CXL feature commands from
userspace through as ioctls via FWCTL. Provide initial setup bits. The
CXL PCI probe function will call devm_cxl_setup_fwctl() after the
cxl_memdev has been enumerated in order to setup FWCTL char device under
the cxl_memdev like the existing memdev char device for issuing CXL raw
mailbox commands from userspace via ioctls.
Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add CXL Features support. Setup code for enabling in kernel usage of CXL
Features. Expecting EDAC/RAS to utilize CXL Features in kernel for
things such as memory sparing. Also prepartion for enabling of CXL FWCTL
support to issue allowed Features from user space.
|
|
Recent Ubuntu enforces 3-argument open() with O_CREAT:
CC /home/mingo/tip/tools/objtool/builtin-check.o
In file included from /usr/include/fcntl.h:341,
from builtin-check.c:9:
In function ‘open’,
inlined from ‘copy_file’ at builtin-check.c:201:11:
/usr/include/x86_64-linux-gnu/bits/fcntl2.h:52:11: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
52 | __open_missing_mode ();
| ^~~~~~~~~~~~~~~~~~~~~~
Use 0400 as the most restrictive mode for the new file.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
|
|
Recreating objtool errors can be a manual process. Kbuild removes the
object, so it has to be compiled or linked again before running objtool.
Then the objtool args need to be reversed engineered.
Make that all easier by automatically making a backup of the object file
on error, and print a modified version of the args which can be used to
recreate.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/7571e30636359b3e173ce6e122419452bb31882f.1741975349.git.jpoimboe@kernel.org
|
|
This is similar to GCC's behavior and makes it more obvious why the
build failed.
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/56f0565b15b4b4caa9a08953fa9c679dfa973514.1741975349.git.jpoimboe@kernel.org
|
|
Any objtool warning has the potential of reflecting (or triggering) a
major bug in the kernel or compiler which could result in crashing the
kernel or breaking the livepatch consistency model.
In preparation for failing the build on objtool errors/warnings, add a
new --Werror option.
[ jpoimboe: commit log, comments, error out on fatal errors too ]
Co-developed-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/e423ea4ec297f510a108aa6c78b52b9fe30fa8c1.1741975349.git.jpoimboe@kernel.org
|
|
Add option to allow writing the changed binary to a separate file rather
than changing it in place.
Libelf makes this suprisingly hard, so take the easy way out and just
copy the file before editing it.
Also steal the -o short option from --orc. Nobody will notice ;-)
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/0da308d42d82b3bbed16a31a72d6bde52afcd6bd.1741975349.git.jpoimboe@kernel.org
|
|
Force the user to fix their cmdline if they forget the '--link' option.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/8380bbf3a0fa86e03fd63f60568ae06a48146bc1.1741975349.git.jpoimboe@kernel.org
|
|
The option validations are a bit scattered, consolidate them.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/8f886502fda1d15f39d7351b70d4ebe5903da627.1741975349.git.jpoimboe@kernel.org
|
|
With unret validation enabled and IBT/LTO disabled, objtool runs on TUs
with --rethunk and on vmlinux.o with --unret. So this dependency isn't
valid as they don't always run on the same object.
This error never triggered before because --unret is always coupled with
--noinstr, so the first conditional in opts_valid() returns early due to
opts.noinstr being true.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/c6f5635784a28ed4b10ac4307b1858e015e6eff0.1741975349.git.jpoimboe@kernel.org
|
|
Increase the per-function WARN_FUNC() rate limit from 1 to 2. If the
number of warnings for a given function goes beyond 2, print "skipping
duplicate warning(s)". This helps root out additional warnings in a
function that might be hiding behind the first one.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/aec318d66c037a51c9f376d6fb0e8ff32812a037.1741975349.git.jpoimboe@kernel.org
|
|
Fix some outdated information in the objtool doc.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/2552ee8b48631127bf269359647a7389edf5f002.1741975349.git.jpoimboe@kernel.org
|
|
Clarify what needs to be done to resolve the missing __noreturn warning.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ab835a35d00bacf8aff0b56257df93f14fdd8224.1741975349.git.jpoimboe@kernel.org
|
|
Make sure all fatal errors are funneled through the 'out' label with a
negative ret.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/0f49d6a27a080b4012e84e6df1e23097f44cc082.1741975349.git.jpoimboe@kernel.org
|
|
The CONFIG_X86_ESPFIX64 version of exc_double_fault() can return to its
caller, but the !CONFIG_X86_ESPFIX64 version never does. In the latter
case the compiler and/or objtool may consider it to be implicitly
noreturn.
However, due to the currently inflexible way objtool detects noreturns,
a function's noreturn status needs to be consistent across configs.
The current workaround for this issue is to suppress unreachable
warnings for exc_double_fault()'s callers. Unfortunately that can
result in ORC coverage gaps and potentially worse issues like inert
static calls and silently disabled CPU mitigations.
Instead, prevent exc_double_fault() from ever being implicitly marked
noreturn by forcing a return behind a never-taken conditional.
Until a more integrated noreturn detection method exists, this is likely
the least objectionable workaround.
Fixes: 55eeab2a8a11 ("objtool: Ignore exc_double_fault() __noreturn warnings")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/d1f4026f8dc35d0de6cc61f2684e0cb6484009d1.1741975349.git.jpoimboe@kernel.org
|
|
Add a test to the guard region self tests to assert that the
/proc/$pid/pagemap information now made availabile to the user correctly
identifies and reports guard regions.
As a part of this change, update vm_util.h to add the new bit (note there
is no header file in the kernel where this is exposed, the user is
expected to provide their own mask) and utilise the helper functions there
for pagemap functionality.
[lorenzo.stoakes@oracle.com: fixup define name]
Link: https://lkml.kernel.org/r/32e83941-e6f5-42ee-9292-a44c16463cf1@lucifer.local
Link: https://lkml.kernel.org/r/164feb0a43ae72650e6b20c3910213f469566311.1740139449.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
It's not really possible to start diagnosing this without knowing the
actual error.
Also update the mlock2 helper to behave like libc would by setting errno
and returning -1.
Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-12-dec210a658f5@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If running from a directory that can't be read by unprivileged users,
executing on-fault-test via the nobody user will fail.
The kselftest build does give the file the correct permissions, but after
being installed it might be in a directory without global execute
permissions.
Since the script can't safely fix that, just skip if it happens. Note
that the stderr of the `ls` command is unfiltered meaning the user sees a
"permission denied" error that can help inform them why the test was
skipped.
Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-11-dec210a658f5@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This test allocates a page of every available size and doesn't have any
SKIP logic if the allocation fails. So, ensure it's available and skip
the test if we can't do so.
Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-10-dec210a658f5@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This script must be run as root anyway (see all the writing to privileged
files in /proc etc).
Remove the unnecessary use of sudo to avoid breaking on single-user
systems that don't have sudo. This also avoids confusing readers.
Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-9-dec210a658f5@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Some filesystems don't support ftruncate()ing unlinked files. They return
ENOENT. In that case, skip the test.
Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-8-dec210a658f5@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|