| Age | Commit message (Collapse) | Author |
|
A new hybrid poll is implemented on the io_uring layer. Once an IO is
issued, it will not poll immediately, but rather block first and re-run
before IO complete, then poll to reap IO. While this poll method could
be a suboptimal solution when running on a single thread, it offers
performance lower than regular polling but higher than IRQ, and CPU
utilization is also lower than polling.
To use hybrid polling, the ring must be setup with both the
IORING_SETUP_IOPOLL and IORING_SETUP_HYBRID)IOPOLL flags set. Hybrid
polling has the same restrictions as IOPOLL, in that commands must
explicitly support it.
Signed-off-by: hexue <xue01.he@samsung.com>
Link: https://lore.kernel.org/r/20241101091957.564220-2-xue01.he@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently cloning a buffer table will fail if the destination already has
a table. But it should be possible to use it to replace existing elements.
Add a IORING_REGISTER_DST_REPLACE cloning flag, which if set, will allow
the destination to already having a buffer table. If that is the case,
then entries designated by offset + nr buffers will be replaced if they
already exist.
Note that it's allowed to use IORING_REGISTER_DST_REPLACE and not have
an existing table, in which case it'll work just like not having the
flag set and an empty table - it'll just assign the newly created table
for that case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Right now buffer cloning is an all-or-nothing kind of thing - either the
whole table is cloned from a source to a destination ring, or nothing at
all.
However, it's not always desired to clone the whole thing. Allow for
the application to specify a source and destination offset, and a
number of buffers to clone. If the destination offset is non-zero, then
allocate sparse nodes upfront.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
F2FS should understand how the device aliasing file works and support
deleting the file after use. A device aliasing file can be created by
mkfs.f2fs tool and it can map the whole device with an extent, not
using node blocks. The file space should be pinned and normally used for
read-only usages.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.12-rc6).
Conflicts:
drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
cbe84e9ad5e2 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
188a1bf89432 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/
net/mac80211/cfg.c
c4382d5ca1af ("wifi: mac80211: update the right link for tx power")
8dd0498983ee ("wifi: mac80211: Fix setting txpower with emulate_chanctx")
drivers/net/ethernet/intel/ice/ice_ptp_hw.h
6e58c3310622 ("ice: fix crash on probe for DPLL enabled E810 LOM")
e4291b64e118 ("ice: Align E810T GPIO to other products")
ebb2693f8fbd ("ice: Read SDP section from NVM for pin definitions")
ac532f4f4251 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.13
The first -next "new features" pull request for v6.13. This is a big
one as we have not been able to send one earlier. We have also some
patches affecting other subsystems: in staging we deleted the rtl8192e
driver and in debugfs added a new interface to save struct
file_operations memory; both were acked by GregKH.
Because of the lib80211/libipw move there were quite a lot of
conflicts and to solve those we decided to merge net-next into
wireless-next.
Major changes:
cfg80211/mac80211
* stop exporting wext symbols
* new mac80211 op to indicate that a new interface is to be added
* support radio separation of multi-band devices
Wireless Extensions
* move wext spy implementation to libiw
* remove iw_public_data from struct net_device
brcmfmac
* optional LPO clock support
ipw2x00
* move remaining lib80211 code into libiw
wilc1000
* WILC3000 support
rtw89
* RTL8852BE and RTL8852BE-VT BT-coexistence improvements
* tag 'wireless-next-2024-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (126 commits)
mac80211: Remove NOP call to ieee80211_hw_config
wifi: iwlwifi: work around -Wenum-compare-conditional warning
wifi: mac80211: re-order assigning channel in activate links
wifi: mac80211: convert debugfs files to short fops
debugfs: add small file operations for most files
wifi: mac80211: remove misleading j_0 construction parts
wifi: mac80211_hwsim: use hrtimer_active()
wifi: mac80211: refactor BW limitation check for CSA parsing
wifi: mac80211: filter on monitor interfaces based on configured channel
wifi: mac80211: refactor ieee80211_rx_monitor
wifi: mac80211: add support for the monitor SKIP_TX flag
wifi: cfg80211: add monitor SKIP_TX flag
wifi: mac80211: add flag to opt out of virtual monitor support
wifi: cfg80211: pass net_device to .set_monitor_channel
wifi: mac80211: remove status->ampdu_delimiter_crc
wifi: cfg80211: report per wiphy radio antenna mask
wifi: mac80211: use vif radio mask to limit creating chanctx
wifi: mac80211: use vif radio mask to limit ibss scan frequencies
wifi: cfg80211: add option for vif allowed radios
wifi: iwlwifi: allow IWL_FW_CHECK() with just a string
...
====================
Link: https://patch.msgid.link/20241025170705.5F6B2C4CEC3@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Useful for testing performance/efficiency impact of registered files
and buffers, vs (particularly) non-registered files.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Generally applications have 1 or a few waits of waiting, yet they pass
in a struct io_uring_getevents_arg every time. This needs to get copied
and, in turn, the timeout value needs to get copied.
Rather than do this for every invocation, allow the application to
register a fixed set of wait regions that can simply be indexed when
asking the kernel to wait on events.
At ring setup time, the application can register a number of these wait
regions and initialize region/index 0 upfront:
struct io_uring_reg_wait *reg;
reg = io_uring_setup_reg_wait(ring, nr_regions, &ret);
/* set timeout and mark as set, sigmask/sigmask_sz as needed */
reg->ts.tv_sec = 0;
reg->ts.tv_nsec = 100000;
reg->flags = IORING_REG_WAIT_TS;
where nr_regions >= 1 && nr_regions <= PAGE_SIZE / sizeof(*reg). The
above initializes index 0, but 63 other regions can be initialized,
if needed. Now, instead of doing:
struct __kernel_timespec timeout = { .tv_nsec = 100000, };
io_uring_submit_and_wait_timeout(ring, &cqe, nr, &t, NULL);
to wait for events for each submit_and_wait, or just wait, operation, it
can just reference the above region at offset 0 and do:
io_uring_submit_and_wait_reg(ring, &cqe, nr, 0);
to achieve the same goal of waiting 100usec without needing to copy
both struct io_uring_getevents_arg (24b) and struct __kernel_timeout
(16b) for each invocation. Struct io_uring_reg_wait looks as follows:
struct io_uring_reg_wait {
struct __kernel_timespec ts;
__u32 min_wait_usec;
__u32 flags;
__u64 sigmask;
__u32 sigmask_sz;
__u32 pad[3];
__u64 pad2[2];
};
embedding the timeout itself in the region, rather than passing it as
a pointer as well. Note that the signal mask is still passed as a
pointer, both for compatability reasons, but also because there doesn't
seem to be a lot of high frequency waits scenarios that involve setting
and resetting the signal mask for each wait.
The application is free to modify any region before a wait call, or it
can use keep multiple regions with different settings to avoid needing to
modify the same one for wait calls. Up to a page size of regions is mapped
by default, allowing PAGE_SIZE / 64 available regions for use.
The registered region must fit within a page. On a 4kb page size system,
that allows for 64 wait regions if a full page is used, as the size of
struct io_uring_reg_wait is 64b. The region registered must be aligned
to io_uring_reg_wait in size. It's valid to register less than 64
entries.
In network performance testing with zero-copy, this reduced the time
spent waiting on the TX side from 3.12% to 0.3% and the RX side from 4.4%
to 0.3%.
Wait regions are fixed for the lifetime of the ring - once registered,
they are persistent until the ring is torn down. The regions support
minimum wait timeout as well as the regular waits.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Once a ring has been created, the size of the CQ and SQ rings are fixed.
Usually this isn't a problem on the SQ ring side, as it merely controls
the available number of requests that can be submitted in a single
system call, and there's rarely a need to change that.
For the CQ ring, it's a different story. For most efficient use of
io_uring, it's important that the CQ ring never overflows. This means
that applications must size it for the worst case scenario, which can
be wasteful.
Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize
the existing rings. It takes a struct io_uring_params argument, the same
one which is used to setup the ring initially, and resizes rings
according to the sizes given.
Certain properties are always inherited from the original ring setup,
like SQE128/CQE32 and other setup options. The implementation only
allows flag associated with how the CQ ring is sized and clamped.
Existing unconsumed SQE and CQE entries are copied as part of the
process. If either the SQ or CQ resized destination ring cannot hold the
entries already present in the source rings, then the operation is failed
with -EOVERFLOW. Any register op holds ->uring_lock, which prevents new
submissions, and the internal mapping holds the completion lock as well
across moving CQ ring state.
To prevent races between mmap and ring resizing, add a mutex that's
solely used to serialize ring resize and mmap. mmap_sem can't be used
here, as as fork'ed process may be doing mmaps on the ring as well.
The ctx->resize_lock is held across mmap operations, and the resize
will grab it before swapping out the already mapped new data.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Normally MSG_RING requires both a source and a destination ring. But
some users don't always have a ring avilable to send a message from, yet
they still need to notify a target ring.
Add support for using io_uring_register(2) without having a source ring,
using a file descriptor of -1 for that. Internally those are called
blind registration opcodes. Implement IORING_REGISTER_SEND_MSG_RING as a
blind opcode, which simply takes an sqe that the application can put on
the stack and use the normal liburing helpers to initialize it. Then the
app can call:
io_uring_register(-1, IORING_REGISTER_SEND_MSG_RING, &sqe, 1);
and get the same behavior in terms of the target, where a CQE is posted
with the details given in the sqe.
For now this takes a single sqe pointer argument, and hence arg must
be set to that, and nr_args must be 1. Could easily be extended to take
an array of sqes, but for now let's keep it simple.
Link: https://lore.kernel.org/r/20240924115932.116167-3-axboe@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently all flows for a certain SA must be processed by the same
cpu to avoid packet reordering and lock contention of the xfrm
state lock.
To get rid of this limitation, the IETF standardized per cpu SAs
in RFC 9611. This patch implements the xfrm part of it.
We add the cpu as a lookup key for xfrm states and a config option
to generate acquire messages for each cpu.
With that, we can have on each cpu a SA with identical traffic selector
so that flows can be processed in parallel on all cpus.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Tested-by: Antony Antony <antony.antony@secunet.com>
Tested-by: Tobias Brunner <tobias@strongswan.org>
|
|
Introduce new flag (IOMMU_HWPT_ALLOC_PASID) to domain_alloc_users() ops.
If IOMMU supports PASID it will allocate domain. Otherwise return error.
In error path check for -EOPNOTSUPP and try to allocate non-PASID
domain so that DMA-API mode work fine for drivers which does not support
PASID as well.
Also modify __iommu_group_alloc_default_domain() to call
iommu_paging_domain_alloc_flags() with appropriate flag when allocating
paging domain.
Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241028093810.5901-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.13-2024-10-25:
amdgpu:
- SDMA queue reset support
- SMU 13.0.6 updates
- Add debugfs interface to help limit jpeg queue scheduling for testing
- JPEG 4.0.3 updates
- Initial runtime repartitioning support
- GFX9 fixes
- Misc code cleanups
- Rework IP structures to better handle multiple instances of an IP
- DML updates
- DSC fixes
- HDR fixes
- Brightness control updates
- Runtime pm cleanup
- DMCUB fixes
- DCN 3.5 updates
- Struct drm_edid cleanup
- Fetch EDID from _DDC if available
- Ring noop optimizations
- MES logging fixes
- 3DLUT fixes
- DCN 4.x fixes
- SMU 13.x fixes
- Fixes for set_soft_freq_range()
- ACPI fixes
- SMU 14.x updates
- PSR-SU fixes
- fdinfo cleanup
- DCN documentation updates
amdkfd:
- Misc code cleanups
- Increase event FIFO size
- Copy wave state fixes for SDMA
radeon:
- Fix possible overflow in packet3 check
- Late init connector fix
- Always set GEM function pointer
Documentation:
- Update drm-memory documentation
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241025132336.2416913-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Flag KFD support for per-queue reset on GFX9 devices.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Define the IOMMU_IOAS_MAP_FILE ioctl interface, which allows a user to
register memory by passing a memfd plus offset and length. Implement it
using the memfd_pin_folios() kAPI.
Link: https://patch.msgid.link/r/1729861919-234514-8-git-send-email-steven.sistare@oracle.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts and no adjacent changes.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Cross-merge bpf fixes after downstream PR.
No conflicts.
Adjacent changes in:
include/linux/bpf.h
include/uapi/linux/bpf.h
kernel/bpf/btf.c
kernel/bpf/helpers.c
kernel/bpf/syscall.c
kernel/bpf/verifier.c
kernel/trace/bpf_trace.c
mm/slab_common.c
tools/include/uapi/linux/bpf.h
tools/testing/selftests/bpf/Makefile
Link: https://lore.kernel.org/all/20241024215724.60017-1-daniel@iogearbox.net/
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The v1.3 PSCI spec (https://developer.arm.com/documentation/den0022) adds
the SYSTEM_OFF2 function. Add definitions for it and its hibernation type
parameter.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20241019172459.2241939-2-dwmw2@infradead.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
This allows a tracer to control the ABI of the tracee, as on arm64.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
RISC-V supports pointer masking with a variable number of tag bits
(which is called "PMLEN" in the specification) and which is configured
at the next higher privilege level.
Wire up the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctls
so userspace can request a lower bound on the number of tag bits and
determine the actual number of tag bits. As with arm64's
PR_TAGGED_ADDR_ENABLE, the pointer masking configuration is
thread-scoped, inherited on clone() and fork() and cleared on execve().
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-5-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap
link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for
sockmap link
Also add comments for bpf_link_type to prevent missing updates in the
future.
Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com
|
|
The thresholds exist but there is no notification neither action code
related to them yet.
These changes implement the netlink for the notifications when the
thresholds are crossed, added, deleted or flushed as well as the
commands which allows to get the list of the thresholds, flush them,
add and delete.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20241022155147.463475-3-daniel.lezcano@linaro.org
[ rjw: Use the thermal_zone guard for locking, subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
A common pattern when using pid fds is having to get information
about the process, which currently requires /proc being mounted,
resolving the fd to a pid, and then do manual string parsing of
/proc/N/status and friends. This needs to be reimplemented over
and over in all userspace projects (e.g.: I have reimplemented
resolving in systemd, dbus, dbus-daemon, polkit so far), and
requires additional care in checking that the fd is still valid
after having parsed the data, to avoid races.
Having a programmatic API that can be used directly removes all
these requirements, including having /proc mounted.
As discussed at LPC24, add an ioctl with an extensible struct
so that more parameters can be added later if needed. Start with
returning pid/tgid/ppid and creds unconditionally, and cgroupid
optionally.
Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com>
Link: https://lore.kernel.org/r/20241010155401.2268522-1-luca.boccassi@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This can be used to indicate that the user is not interested in receiving
locally sent packets on the monitor interface.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/f0c20f832eadd36c71fba9a2a16ba57d78389b6c.1728462320.git-series.nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
With multi-radio devices, each radio typically gets a fixed set of antennas.
In order to be able to disable specific antennas for some radios, user space
needs to know which antenna mask bits are assigned to which radio.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/e0a26afa2c88eaa188ec96ec6d17ecac4e827641.1728462320.git-series.nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This allows users to prevent a vif from affecting radios other than the
configured ones. This can be useful in cases where e.g. an AP is running
on one radio, and triggering a scan on another radio should not disturb it.
Changing the allowed radios list for a vif is supported, but only while
it is down.
While it is possible to achieve the same by always explicitly specifying
a frequency list for scan requests and ensuring that the wrong channel/band
is never accidentally set on an unrelated interface, this change makes
multi-radio wiphy setups a lot easier to deal with for CLI users.
By itself, this patch only enforces the radio mask for scanning requests
and remain-on-channel. Follow-up changes build on this to limit configured
frequencies.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/eefcb218780f71a1549875d149f1196486762756.1728462320.git-series.nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This reverts commit a3ab2d45b9887ee609cd3bea39f668236935774c.
The userspace side for this code is not ready yet so revert
for now.
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Philip Yang <Philip.Yang@amd.com>
|
|
After a SED drive is provisioned, there is no way to change the SID
password via the ioctl() interface. A new ioctl IOC_OPAL_SET_SID_PW
will allow the password to be changed. The valid current password is
required.
Signed-off-by: Greg Joyce <gjoyce@linux.ibm.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Link: https://lore.kernel.org/r/20240829175639.6478-2-gjoyce@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk currently supports the following behaviors on ublk server exit:
A: outstanding I/Os get errors, subsequently issued I/Os get errors
B: outstanding I/Os get errors, subsequently issued I/Os queue
C: outstanding I/Os get reissued, subsequently issued I/Os queue
and the following behaviors for recovery of preexisting block devices by
a future incarnation of the ublk server:
1: ublk devices stopped on ublk server exit (no recovery possible)
2: ublk devices are recoverable using start/end_recovery commands
The userspace interface allows selection of combinations of these
behaviors using flags specified at device creation time, namely:
default behavior: A + 1
UBLK_F_USER_RECOVERY: B + 2
UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_REISSUE: C + 2
The behavior A + 2 is currently unsupported. Add support for this
behavior under the new flag combination
UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_FAIL_IO.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20241007182419.3263186-5-ushankar@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Deprecation period of reiserfs ends with the end of this year so it is
time to remove it from the kernel.
Acked-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need the iio fixes from 6.12-rc4 in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull bpf fixes from Daniel Borkmann:
- Fix BPF verifier to not affect subreg_def marks in its range
propagation (Eduard Zingerman)
- Fix a truncation bug in the BPF verifier's handling of
coerce_reg_to_size_sx (Dimitar Kanaliev)
- Fix the BPF verifier's delta propagation between linked registers
under 32-bit addition (Daniel Borkmann)
- Fix a NULL pointer dereference in BPF devmap due to missing rxq
information (Florian Kauer)
- Fix a memory leak in bpf_core_apply (Jiri Olsa)
- Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
arrays of nested structs (Hou Tao)
- Fix build ID fetching where memory areas backing the file were
created with memfd_secret (Andrii Nakryiko)
- Fix BPF task iterator tid filtering which was incorrectly using pid
instead of tid (Jordan Rome)
- Several fixes for BPF sockmap and BPF sockhash redirection in
combination with vsocks (Michal Luczaj)
- Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)
- Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
of an infinite BPF tailcall (Pu Lehui)
- Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
be resolved (Thomas Weißschuh)
- Fix a bug in kfunc BTF caching for modules where the wrong BTF object
was returned (Toke Høiland-Jørgensen)
- Fix a BPF selftest compilation error in cgroup-related tests with
musl libc (Tony Ambardar)
- Several fixes to BPF link info dumps to fill missing fields (Tyrone
Wu)
- Add BPF selftests for kfuncs from multiple modules, checking that the
correct kfuncs are called (Simon Sundberg)
- Ensure that internal and user-facing bpf_redirect flags don't overlap
(Toke Høiland-Jørgensen)
- Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
Riel)
- Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
under RT (Wander Lairson Costa)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
lib/buildid: Handle memfd_secret() files in build_id_parse()
selftests/bpf: Add test case for delta propagation
bpf: Fix print_reg_state's constant scalar dump
bpf: Fix incorrect delta propagation between linked registers
bpf: Properly test iter/task tid filtering
bpf: Fix iter/task tid filtering
riscv, bpf: Make BPF_CMPXCHG fully ordered
bpf, vsock: Drop static vsock_bpf_prot initialization
vsock: Update msg_count on read_skb()
vsock: Update rx_bytes on read_skb()
bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
selftests/bpf: Add asserts for netfilter link info
bpf: Fix link info netfilter flags to populate defrag flag
selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
bpf: Fix truncation bug in coerce_reg_to_size_sx()
selftests/bpf: Assert link info uprobe_multi count & path_size if unset
bpf: Fix unpopulated path_size when uprobe_multi fields unset
selftests/bpf: Fix cross-compiling urandom_read
selftests/bpf: Add test for kfunc module order
...
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Fix target passthrough identifier (Nilay)
- Fix tcp locking (Hannes)
- Replace list with sbitmap for tracking RDMA rsp tags (Guixen)
- Remove unnecessary fallthrough statements (Tokunori)
- Remove ready-without-media support (Greg)
- Fix multipath partition scan deadlock (Keith)
- Fix concurrent PCI reset and remove queue mapping (Maurizio)
- Fabrics shutdown fixes (Nilay)
- Fix for a kerneldoc warning (Keith)
- Fix a race with blk-rq-qos and wakeups (Omar)
- Cleanup of checking for always-set tag_set (SurajSonawane2415)
- Fix for a crash with CPU hotplug notifiers (Ming)
- Don't allow zero-copy ublk on unprivileged device (Ming)
- Use array_index_nospec() for CDROM (Josh)
- Remove dead code in drbd (David)
- Tweaks to elevator loading (Breno)
* tag 'block-6.12-20241018' of git://git.kernel.dk/linux:
cdrom: Avoid barrier_nospec() in cdrom_ioctl_media_changed()
nvme: use helper nvme_ctrl_state in nvme_keep_alive_finish function
nvme: make keep-alive synchronous operation
nvme-loop: flush off pending I/O while shutting down loop controller
nvme-pci: fix race condition between reset and nvme_dev_disable()
ublk: don't allow user copy for unprivileged device
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
nvme-multipath: defer partition scanning
blk-mq: setup queue ->tag_set before initializing hctx
elevator: Remove argument from elevator_find_get
elevator: do not request_module if elevator exists
drbd: Remove unused conn_lowest_minor
nvme: disable CC.CRIME (NVME_CC_CRIME)
nvme: delete unnecessary fallthru comment
nvmet-rdma: use sbitmap to replace rsp free list
block: Fix elevator_get_default() checking for NULL q->tag_set
nvme: tcp: avoid race between queue_lock lock and destroy
nvmet-passthru: clear EUID/NGUID/UUID while using loop target
block: fix blk_rq_map_integrity_sg kernel-doc
|
|
When working in "fd mode", fanotify_read() needs to open an fd
from a dentry to report event->fd to userspace.
Opening an fd from dentry can fail for several reasons.
For example, when tasks are gone and we try to open their
/proc files or we try to open a WRONLY file like in sysfs
or when trying to open a file that was deleted on the
remote network server.
Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
For a group with FAN_REPORT_FD_ERROR, we will send the
event with the error instead of the open fd, otherwise
userspace may not get the error at all.
For an overflow event, we report -EBADF to avoid confusing FAN_NOFD
with -EPERM. Similarly for pidfd open errors we report either -ESRCH
or the open error instead of FAN_NOPIDFD and FAN_EPIDFD.
In any case, userspace will not know which file failed to
open, so add a debug print for further investigation.
Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
Link: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20241003142922.111539-1-amir73il@gmail.com
|
|
UBLK_F_USER_COPY requires userspace to call write() on ublk char
device for filling request buffer, and unprivileged device can't
be trusted.
So don't allow user copy for unprivileged device.
Cc: stable@vger.kernel.org
Fixes: 1172d5b8beca ("ublk: support user copy")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20241016134847.2911721-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add support for frame-based frame format, which can be used to support
multiple formats like H264 or H265, in addition to MJPEG and YUV frames.
The frame-based format is set to H264 by default, but it can be updated
to other formats by modifying the GUID through the guid configfs
attribute. Different structures are used for all three formats, as
H264 has a different structure compared to MJPEG and uncompressed
formats. These structures will be passed to the frame make function
based on the active format, using a common frame structure with
additional parameters needed only for frame-based formats. These
parameters are handled at runtime in the UVC driver.
Signed-off-by: Akash Kumar <quic_akakum@quicinc.com>
Link: https://lore.kernel.org/r/20240927152138.31416-1-quic_akakum@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for Raspberry Pi CFE. The CFE is a hardware block that
contains:
- MIPI D-PHY
- MIPI CSI-2 receiver
- Front End ISP (FE)
The driver has been upported from the Raspberry Pi kernel commit
88a681df9623 ("ARM: dts: bcm2712-rpi: Add i2c<n>_pins labels").
Co-developed-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Add two meta formats for PiSP FE: V4L2_META_FMT_RPI_FE_CFG and
V4L2_META_FMT_RPI_FE_STATS. The former is used to provide configuration
for the FE and the latter is used to read the statistics from the FE.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- Add flex array to struct batadv_tvlv_tt_data, by Erick Archer
- Use string choice helper to print booleans, by Sven Eckelmann
- replace call_rcu by kfree_rcu for simple kmem_cache_free callback,
by Julia Lawall
* tag 'batadv-next-pullrequest-20241015' of git://git.open-mesh.org/linux-merge:
batman-adv: replace call_rcu by kfree_rcu for simple kmem_cache_free callback
batman-adv: Use string choice helper to print booleans
batman-adv: Add flex array to struct batadv_tvlv_tt_data
batman-adv: Start new development cycle
====================
Link: https://patch.msgid.link/20241015073946.46613-1-sw@simonwunderlich.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-10-14
The following pull-request contains BPF updates for your *net-next* tree.
We've added 21 non-merge commits during the last 18 day(s) which contain
a total of 21 files changed, 1185 insertions(+), 127 deletions(-).
The main changes are:
1) Put xsk sockets on a struct diet and add various cleanups. Overall, this helps
to bump performance by 12% for some workloads, from Maciej Fijalkowski.
2) Extend BPF selftests to increase coverage of XDP features in combination
with BPF cpumap, from Alexis Lothoré (eBPF Foundation).
3) Extend netkit with an option to delegate skb->{mark,priority} scrubbing to
its BPF program, from Daniel Borkmann.
4) Make the bpf_get_netns_cookie() helper available also to tc(x) BPF programs,
from Mahe Tardy.
5) Extend BPF selftests covering a BPF program setting socket options per MPTCP
subflow, from Geliang Tang and Nicolas Rybowski.
bpf-next-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (21 commits)
xsk: Use xsk_buff_pool directly for cq functions
xsk: Wrap duplicated code to function
xsk: Carry a copy of xdp_zc_max_segs within xsk_buff_pool
xsk: Get rid of xdp_buff_xsk::orig_addr
xsk: s/free_list_node/list_node/
xsk: Get rid of xdp_buff_xsk::xskb_list_node
selftests/bpf: check program redirect in xdp_cpumap_attach
selftests/bpf: make xdp_cpumap_attach keep redirect prog attached
selftests/bpf: fix bpf_map_redirect call for cpu map test
selftests/bpf: add tcx netns cookie tests
bpf: add get_netns_cookie helper to tc programs
selftests/bpf: add missing header include for htons
selftests/bpf: Extend netkit tests to validate skb meta data
tools: Sync if_link.h uapi tooling header
netkit: Add add netkit scrub support to rt_link.yaml
netkit: Simplify netkit mode over to use NLA_POLICY_MAX
netkit: Add option for scrubbing skb meta data
bpf: Remove unused macro
selftests/bpf: Add mptcp subflow subtest
selftests/bpf: Add getsockopt to inspect mptcp subflow
...
====================
Link: https://patch.msgid.link/20241014211110.16562-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to set per-NAPI defer_hard_irqs and gro_flush_timeout.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241011184527.16393-7-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support dumping gro_flush_timeout for a NAPI ID.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241011184527.16393-5-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support dumping defer_hard_irqs for a NAPI ID.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241011184527.16393-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the index is ORed with V4L2_FMTDESC_FLAG_ENUM_ALL the
driver clears the flag and enumerate all the possible formats,
ignoring any limitations from the current configuration.
Drivers which do not support this flag yet always return an EINVAL.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil: improved doc when the new flag is not supported by the driver]
|
|
Linux 6.12-rc2
Resolved movement of asm/unaligned.h to linux/unaligned.h
|
|
The user thresholds mechanism is a way to have the userspace to tell
the thermal framework to send a notification when a temperature limit
is crossed. There is no id, no hysteresis, just the temperature and
the direction of the limit crossing. That means we can be notified
when a threshold is crossed the way up only, or the way down only or
both ways. That allows to create hysteresis values if it is needed.
A threshold can be added, deleted or flushed. The latter means all
thresholds belonging to a thermal zone will be deleted.
When a threshold is added:
- if the same threshold (temperature and direction) exists, an error
is returned
- if a threshold is specified with the same temperature but a
different direction, the specified direction is added
- if there is no threshold with the same temperature then it is
created
When a threshold is deleted:
- if the same threshold (temperature and direction) exists, it is
deleted
- if a threshold is specified with the same temperature but a
different direction, the specified direction is removed
- if there is no threshold with the same temperature, then an error
is returned
When the threshold are flushed:
- All thresholds related to a thermal zone are deleted
When a threshold is crossed:
- the userspace does not need to know which threshold(s) have been
crossed, it will be notified with the current temperature and the
previous temperature
- if multiple thresholds have been crossed between two updates only
one notification will be send to the userspace, it is pointless to
send a notification per thresholds crossed as the userspace can
handle that easily when it has the temperature delta information
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20240923100005.2532430-2-daniel.lezcano@linaro.org
[ rjw: Subject edit, use BIT(0) and BIT(1) in symbol definitions ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only
on x86. This was presumably true at the time of writing, but it's now
supported on many other architectures too. Drop this statement, since
it's not correct anymore and it fits better in other documentation
anyway.
Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The `index` argument to bpf_loop() is threaded as an u64.
This lead in a subtle verifier denial where clang cloned the argument
in another register[1].
[1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895
Signed-off-by: Matteo Croce <teknoraver@meta.com>
Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Allow the user-space to fine-grain query the shaping features
supported by the NIC on each domain.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/3ddd10e450e3fe7d4b944c0d0b886d4483529ee6.1728460186.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|