Age | Commit message (Collapse) | Author |
|
Bring in the changes to the file infrastructure for this cycle. Mostly
cleanups and some performance tweaks.
* file: remove __receive_fd()
* file: stop exposing receive_fd_user()
* fs: replace f_rcuhead with f_task_work
* file: remove pointless wrapper
* file: s/close_fd_get_file()/file_close_fd()/g
* Improve __fget_files_rcu() code generation (and thus __fget_light())
* file: massage cleanup of files that failed to open
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
/proc/pid/maps shows device and inode numbers of vma->vm_file-s. Here is
an issue. If a mapped file is on a stackable file system (e.g.,
overlayfs), vma->vm_file is a backing file whose f_inode is on the
underlying filesystem. To show correct numbers, we need to get a user
file and shows its numbers. The same trick is used to show file paths in
/proc/pid/maps.
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Andrei Vagin <avagin@google.com>
Link: https://lore.kernel.org/r/20231214064439.1023011-1-avagin@google.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The buffer-dma code was using two queues, incoming and outgoing, to
manage the state of the blocks in use.
While this totally works, it adds some complexity to the code,
especially since the code only manages 2 blocks. It is much easier to
just check each block's state manually, and keep a counter for the next
block to dequeue.
Since the new DMABUF based API wouldn't use the outgoing queue anyway,
getting rid of it now makes the upcoming changes simpler.
With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
be removed.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Link: https://lore.kernel.org/r/20231219175009.65482-2-paul@crapouillou.net
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-12-21
Hi David, hi Jakub, hi Paolo, hi Eric,
The following pull-request contains BPF updates for your *net* tree.
We've added 3 non-merge commits during the last 5 day(s) which contain
a total of 4 files changed, 45 insertions(+).
The main changes are:
1) Fix a syzkaller splat which triggered an oob issue in bpf_link_show_fdinfo(),
from Jiri Olsa.
2) Fix another syzkaller-found issue which triggered a NULL pointer dereference
in BPF sockmap for unconnected unix sockets, from John Fastabend.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add missing BPF_LINK_TYPE invocations
bpf: sockmap, test for unconnected af_unix sock
bpf: syzkaller found null ptr deref in unix_bpf proto add
====================
Link: https://lore.kernel.org/r/20231221104844.1374-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-w1 into char-misc-next
Krzysztof writes:
1-Wire bus drivers for v6.8
1. Add new AMD AXI 1-wire host driver for AMD programmable logic IP
core.
2. Add support for Analog Devices DS28EC20 EEPROM to existing DS2433
driver.
3. Few cleanups in W1 GPIO driver.
* tag 'w1-drv-6.8' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-w1:
w1: ds2433: add support for ds28ec20 eeprom
w1: ds2433: use the kernel bitmap implementation
w1: ds2433: introduce a configuration structure
w1: ds2433: remove unused definitions
w1: ds2490: support block sizes larger than 128 bytes in ds_read_block
w1: amd_axi_w1: Explicitly include correct DT includes
w1: gpio: rename pointer to driver data from pdata to ddata
w1: gpio: Drop unused enable_external_pullup from driver data
w1: gpio: Don't use platform data for driver data
w1: Add AXI 1-wire host driver for AMD programmable logic IP core
dt-bindings: w1: Add AMD AXI w1 host and MAINTAINERS entry
|
|
This reverts commit 3dec89b14d37ee635e772636dad3f09f78f1ab87.
The commit has some race conditions given how expires is managed on a
fib6_info in relation to gc start, adding the entry to the gc list and
setting the timer value leading to UAF. Revert the commit and try again
in a later release.
Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes")
Cc: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231219030243.25687-1-dsahern@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add management PF modules, which introduce support for the structures
needed to create the resources for the MGMT PF to work.
Also, add the necessary calls and functions to establish this
functionality.
Signed-off-by: Armen Ratner <armeng@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
|
|
Have an actual mlx5_sd instance in the core device, and fix the getter
accordingly. This allows SD stuff to flow, the feature becomes supported
only here.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The sd_group field moved in the HW spec from the MPIR register
to the vport context.
Align the query accordingly.
Fixes: f5e956329960 ("net/mlx5: Expose Management PCIe Index Register (MPIR)")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit moved the code of mlx5e_create_tises() and changed the
loop to create TISes over MLX5_MAX_PORTS constant value, instead of
getting the correct lag ports supported by the device, which can cause
FW errors on devices with less than MLX5_MAX_PORTS ports.
Change that back to mlx5e_get_num_lag_ports(mdev).
Also IPoIB interfaces create there own TISes, they don't use the eth
TISes, pass a flag to indicate that.
Fixes: b25bd37c859f ("net/mlx5: Move TISes from priv to mdev HW resources")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
We don't actually use any timekeeping types, no need to pull in
time64.h.
Also, sched.h uses restart_block; add it as a direct dependency.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
held_lock is embedded in task_struct, and we don't want sched.h pulling
in all of lockdep.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: Waiman Long <longman@redhat.com>
|
|
More sched.h dependency pruning.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More sched.h dependency pruning.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Christian Brauner <brauner@kernel.org>
|
|
More pruning of sched.h dependencies.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More trimming of sched.h dependencies.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We should't be depending on time.h; we should only be pulling in other
uapi headers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
thread_info.h pulls in a lot of junk that sched.h that we don't need; in
particular, this helps to kill the printk.h dependency.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
more sched.h header dependency trimming
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're working on only pulling in type definitions to sched.h whenever
possible.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
pruning sched.h dependencies, headers shouldn't pull in more than they
need.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
list_head is in types.h, not list.h., and the uapi header wasn't needed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More sched.h dependency culling - this lets us kill a rhashtable-types.h
dependency on workqueue.h.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Cutting down on sched.h dependencies: this is going to be used in
workqueue_types.h in the next patch, so we can kill the sched.h
dependency on workqueue.h.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
list_head is in types.h, not list.h.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trimming down sched.h dependencies: timerqueue_types can include just
rbtree_types.h instead of pulling in rbtree.h.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
list_head is defined in types.h, not list.h - this kills a sched.h
dependency.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's really no overlap between uapi/linux/wait.h and linux/wait.h.
There are two files which rely on the uapi file being implcitly included,
so explicitly include it there and remove it from the main header file.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Christian Brauner <brauner@kernel.org>
|
|
Trimming down sched.h dependencies: we don't want to include more than
the base types.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This is needed for killing the sched.h dependency on rcupdate.h, and
pid.h is a better place for this code anyways.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trimming down sched.h dependencies: we dont't want to include more than
the base types.
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trimming down sched.h dependencies: we don't want to include more than
the base types.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trimming down sched.h dependencies: we don't want to include more than
the base types.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trimming down sched.h dependencies: we don't want to include more than
the base types.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We need to reduce the scope of what's included in sched.h: task_struct
includes a hrtimer, so split out the core types into their own header.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
ktime.h pulls in quite a few headers recursively (including printk.h) -
this is going to help with trimming sched.h dependencies.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This lets us kill the dependency on spinlock.h.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Moving these stub functions to a .c file means we can kill a sched.h
dependency on printk.h.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Slimming down recursive header includes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
prandom.h doesn't use percpu.h - this fixes some circular header issues.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
|
|
sched.h, which defines task_struct, needs nodemask_t - but sched.h is a
frequently used header and ideally shouldn't be pulling in any more code
that it needs to.
This splits out nodemask_types.h which has the definition sched.h needs,
which will avoid a circular header dependency in the alloc tagging patch
series, and as a bonus should speed up kernel build times.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
more header dependency pruning/fixing
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Patch series "kexec_file: print out debugging message if required", v4.
Currently, specifying '-d' on kexec command will print a lot of debugging
informationabout kexec/kdump loading with kexec_load interface.
However, kexec_file_load prints nothing even though '-d' is specified.
It's very inconvenient to debug or analyze the kexec/kdump loading when
something wrong happened with kexec/kdump itself or develper want to check
the kexec/kdump loading.
In this patchset, a kexec_file flag is KEXEC_FILE_DEBUG added and checked
in code. If it's passed in, debugging message of kexec_file code will be
printed out and can be seen from console and dmesg. Otherwise, the
debugging message is printed like beofre when pr_debug() is taken.
Note:
****
=====
1) The code in kexec-tools utility also need be changed to support
passing KEXEC_FILE_DEBUG to kernel when 'kexec -s -d' is specified.
The patch link is here:
=========
[PATCH] kexec_file: add kexec_file flag to support debug printing
http://lists.infradead.org/pipermail/kexec/2023-November/028505.html
2) s390 also has kexec_file code, while I am not sure what debugging
information is necessary. So leave it to s390 developer.
Test:
****
====
Testing was done in v1 on x86_64 and arm64. For v4, tested on x86_64
again. And on x86_64, the printed messages look like below:
--------------------------------------------------------------
kexec measurement buffer for the loaded kernel at 0x207fffe000.
Loaded purgatory at 0x207fff9000
Loaded boot_param, command line and misc at 0x207fff3000 bufsz=0x1180 memsz=0x1180
Loaded 64bit kernel at 0x207c000000 bufsz=0xc88200 memsz=0x3c4a000
Loaded initrd at 0x2079e79000 bufsz=0x2186280 memsz=0x2186280
Final command line is: root=/dev/mapper/fedora_intel--knightslanding--lb--02-root ro
rd.lvm.lv=fedora_intel-knightslanding-lb-02/root console=ttyS0,115200N81 crashkernel=256M
E820 memmap:
0000000000000000-000000000009a3ff (1)
000000000009a400-000000000009ffff (2)
00000000000e0000-00000000000fffff (2)
0000000000100000-000000006ff83fff (1)
000000006ff84000-000000007ac50fff (2)
......
000000207fff6150-000000207fff615f (128)
000000207fff6160-000000207fff714f (1)
000000207fff7150-000000207fff715f (128)
000000207fff7160-000000207fff814f (1)
000000207fff8150-000000207fff815f (128)
000000207fff8160-000000207fffffff (1)
nr_segments = 5
segment[0]: buf=0x000000004e5ece74 bufsz=0x211 mem=0x207fffe000 memsz=0x1000
segment[1]: buf=0x000000009e871498 bufsz=0x4000 mem=0x207fff9000 memsz=0x5000
segment[2]: buf=0x00000000d879f1fe bufsz=0x1180 mem=0x207fff3000 memsz=0x2000
segment[3]: buf=0x000000001101cd86 bufsz=0xc88200 mem=0x207c000000 memsz=0x3c4a000
segment[4]: buf=0x00000000c6e38ac7 bufsz=0x2186280 mem=0x2079e79000 memsz=0x2187000
kexec_file_load: type:0, start:0x207fff91a0 head:0x109e004002 flags:0x8
---------------------------------------------------------------------------
This patch (of 7):
When specifying 'kexec -c -d', kexec_load interface will print loading
information, e.g the regions where kernel/initrd/purgatory/cmdline are
put, the memmap passed to 2nd kernel taken as system RAM ranges, and
printing all contents of struct kexec_segment, etc. These are very
helpful for analyzing or positioning what's happening when kexec/kdump
itself failed. The debugging printing for kexec_load interface is made in
user space utility kexec-tools.
Whereas, with kexec_file_load interface, 'kexec -s -d' print nothing.
Because kexec_file code is mostly implemented in kernel space, and the
debugging printing functionality is missed. It's not convenient when
debugging kexec/kdump loading and jumping with kexec_file_load interface.
Now add KEXEC_FILE_DEBUG to kexec_file flag to control the debugging
message printing. And add global variable kexec_file_dbg_print and macro
kexec_dprintk() to facilitate the printing.
This is a preparation, later kexec_dprintk() will be used to replace the
existing pr_debug(). Once 'kexec -s -d' is specified, it will print out
kexec/kdump loading information. If '-d' is not specified, it regresses
to pr_debug().
Link: https://lkml.kernel.org/r/20231213055747.61826-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20231213055747.61826-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Correct typos/spellos and punctutation.
Link: https://lkml.kernel.org/r/20231213043316.10128-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm/damon: misc updates for 6.8".
Update comments, tests, and documents for DAMON.
This patch (of 6):
SeongJae is using his kernel.org account for DAMON development. Update
the old email addresses on the comments of DAMON source files.
Link: https://lkml.kernel.org/r/20231213190338.54146-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20231213190338.54146-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Introduce the logic to allow THP to be configured (through the new sysfs
interface we just added) to allocate large folios to back anonymous
memory, which are larger than the base page size but smaller than
PMD-size. We call this new THP extension "multi-size THP" (mTHP).
mTHP continues to be PTE-mapped, but in many cases can still provide
similar benefits to traditional PMD-sized THP: Page faults are
significantly reduced (by a factor of e.g. 4, 8, 16, etc. depending on
the configured order), but latency spikes are much less prominent because
the size of each page isn't as huge as the PMD-sized variant and there is
less memory to clear in each page fault. The number of per-page
operations (e.g. ref counting, rmap management, lru list management) are
also significantly reduced since those ops now become per-folio.
Some architectures also employ TLB compression mechanisms to squeeze more
entries in when a set of PTEs are virtually and physically contiguous and
approporiately aligned. In this case, TLB misses will occur less often.
The new behaviour is disabled by default, but can be enabled at runtime by
writing to /sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled (see
documentation in previous commit). The long term aim is to change the
default to include suitable lower orders, but there are some risks around
internal fragmentation that need to be better understood first.
[ryan.roberts@arm.com: resolve some multi-size THP review nits]
Link: https://lkml.kernel.org/r/20231214160251.3574571-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20231207161211.2374093-5-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <v-songbaohua@oppo.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Itaru Kitayama <itaru.kitayama@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|