Age | Commit message (Collapse) | Author |
|
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Runtime PM updates related to autosuspend for 6.17
Make several autosuspend functions mark last busy stamp and update
the documentation accordingly (Sakari Ailus).
|
|
inet6_sk(sk)->ipv6_ac_list is protected by lock_sock().
In ipv6_sock_ac_join(), only __dev_get_by_index(), __dev_get_by_flags(),
and __in6_dev_get() require RTNL.
__dev_get_by_flags() is only used by ipv6_sock_ac_join() and can be
converted to RCU version.
Let's replace RCU version helper and drop RTNL from IPV6_JOIN_ANYCAST.
setsockopt_needs_rtnl() will be removed in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250702230210.3115355-15-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:
====================
mlx5-next updates 2025-07-08
The following pull-request contains common mlx5 updates
for your *net-next* tree.
v2: https://lore.kernel.org/1751574385-24672-1-git-send-email-tariqt@nvidia.com
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Check device memory pointer before usage
net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow
net/mlx5: Add IFC bits for PCIe Congestion Event object
net/mlx5: Small refactor for general object capabilities
net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain
====================
Link: https://patch.msgid.link/1752002102-11316-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Propagate find_random_bit() to cpumask API.
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
|
|
Generalize node_random() and make it available to general bitmaps and
cpumasks users.
Notice, find_first_bit() is generally faster than find_nth_bit(), and we
employ it when there's a single set bit in the bitmap.
See commit 3e061d924fe9c7b4 ("lib/nodemask: optimize node_random for
nodemask with single NUMA node").
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
|
|
Populate the gic_kvm_info struct based on support for
FEAT_GCIE_LEGACY. The struct is used by KVM to probe for a compatible
GIC.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20250627100847.1022515-3-sascha.bischoff@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
All drivers are now converted to dedicated _rxfh_context ops.
Remove the use of >set_rxfh() to manage additional contexts.
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250707184115.2285277-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The GICv5 architecture implements the Interrupt Wire Bridge (IWB) in
order to support wired interrupts that cannot be connected directly
to an IRS and instead uses the ITS to translate a wire event into
an IRQ signal.
Add the wired-to-MSI IWB driver to manage IWB wired interrupts.
An IWB is connected to an ITS and it has its own deviceID for all
interrupt wires that it manages; the IWB input wire number must be
exposed to the ITS as an eventID with a 1:1 mapping.
This eventID is not programmable and therefore requires a new
msi_alloc_info_t flag to make sure the ITS driver does not allocate
an eventid for the wire but rather it uses the msi_alloc_info_t.hwirq
number to gather the ITS eventID.
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-29-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The GICv5 architecture implements Interrupt Translation Service
(ITS) components in order to translate events coming from peripherals
into interrupt events delivered to the connected IRSes.
Events (ie MSI memory writes to ITS translate frame), are translated
by the ITS using tables kept in memory.
ITS translation tables for peripherals is kept in memory storage
(device table [DT] and Interrupt Translation Table [ITT]) that
is allocated by the driver on boot.
Both tables can be 1- or 2-level; the structure is chosen by the
driver after probing the ITS HW parameters and checking the
allowed table splits and supported {device/event}_IDbits.
DT table entries are allocated on demand (ie when a device is
probed); the DT table is sized using the number of supported
deviceID bits in that that's a system design decision (ie the
number of deviceID bits implemented should reflect the number
of devices expected in a system) therefore it makes sense to
allocate a DT table that can cater for the maximum number of
devices.
DT and ITT tables are allocated using the kmalloc interface;
the allocation size may be smaller than a page or larger,
and must provide contiguous memory pages.
LPIs INTIDs backing the device events are allocated one-by-one
and only upon Linux IRQ allocation; this to avoid preallocating
a large number of LPIs to cover the HW device MSI vector
size whereas few MSI entries are actually enabled by a device.
ITS cacheability/shareability attributes are programmed
according to the provided firmware ITS description.
The GICv5 partially reuses the GICv3 ITS MSI parent infrastructure
and adds functions required to retrieve the ITS translate frame
addresses out of msi-map and msi-parent properties to implement
the GICv5 ITS MSI parent callbacks.
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-28-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In some irqchip implementations the fwnode representing the IRQdomain
and the MSI controller fwnode do not match; in particular the IRQdomain
fwnode is the MSI controller fwnode parent.
To support selecting such IRQ domains, add a flag in core IRQ domain
code that explicitly tells the MSI lib to use the parent fwnode while
carrying out IRQ domain selection.
Update the msi-lib select callback with the resulting logic.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-27-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
IRQchip drivers need a PCI/MSI function to map a RID to a MSI
controller deviceID namespace and at the same time retrieve the
struct device_node pointer of the MSI controller the RID is mapped
to.
Add pci_msi_map_rid_ctlr_node() to achieve this purpose.
Cc Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-25-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add an of_msi_xlate() helper that maps a device ID and returns
the device node of the MSI controller the device ID is mapped to.
Required by core functions that need an MSI controller device node
pointer at the same time as a mapped device ID, of_msi_map_id() is not
sufficient for that purpose.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-24-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
An IRS supports Logical Peripheral Interrupts (LPIs) and implement
Linux IPIs on top of it.
LPIs are used for interrupt signals that are translated by a
GICv5 ITS (Interrupt Translation Service) but also for software
generated IRQs - namely interrupts that are not driven by a HW
signal, ie IPIs.
LPIs rely on memory storage for interrupt routing and state.
LPIs state and routing information is kept in the Interrupt
State Table (IST).
IRSes provide support for 1- or 2-level IST tables configured
to support a maximum number of interrupts that depend on the
OS configuration and the HW capabilities.
On systems that provide 2-level IST support, always allow
the maximum number of LPIs; On systems with only 1-level
support, limit the number of LPIs to 2^12 to prevent
wasting memory (presumably a system that supports a 1-level
only IST is not expecting a large number of interrupts).
On a 2-level IST system, L2 entries are allocated on
demand.
The IST table memory is allocated using the kmalloc() interface;
the allocation required may be smaller than a page and must be
made up of contiguous physical pages if larger than a page.
On systems where the IRS is not cache-coherent with the CPUs,
cache mainteinance operations are executed to clean and
invalidate the allocated memory to the point of coherency
making it visible to the IRS components.
On GICv5 systems, IPIs are implemented using LPIs.
Add an LPI IRQ domain and implement an IPI-specific IRQ domain created
as a child/subdomain of the LPI domain to allocate the required number
of LPIs needed to implement the IPIs.
IPIs are backed by LPIs, add LPIs allocation/de-allocation
functions.
The LPI INTID namespace is managed using an IDA to alloc/free LPI INTIDs.
Associate an IPI irqchip with IPI IRQ descriptors to provide
core code with the irqchip.ipi_send_single() method required
to raise an IPI.
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-22-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The GICv5 Interrupt Routing Service (IRS) component implements
interrupt management and routing in the GICv5 architecture.
A GICv5 system comprises one or more IRSes, that together
handle the interrupt routing and state for the system.
An IRS supports Shared Peripheral Interrupts (SPIs), that are
interrupt sources directly connected to the IRS; they do not
rely on memory for storage. The number of supported SPIs is
fixed for a given implementation and can be probed through IRS
IDR registers.
SPI interrupt state and routing are managed through GICv5
instructions.
Each core (PE in GICv5 terms) in a GICv5 system is identified with
an Interrupt AFFinity ID (IAFFID).
An IRS manages a set of cores that are connected to it.
Firmware provides a topology description that the driver uses
to detect to which IRS a CPU (ie an IAFFID) is associated with.
Use probeable information and firmware description to initialize
the IRSes and implement GICv5 IRS SPIs support through an
SPI-specific IRQ domain.
The GICv5 IRS driver:
- Probes IRSes in the system to detect SPI ranges
- Associates an IRS with a set of cores connected to it
- Adds an IRQchip structure for SPI handling
SPIs priority is set to a value corresponding to the lowest
permissible priority in the system (taking into account the
implemented priority bits of the IRS and CPU interface).
Since all IRQs are set to the same priority value, the value
itself does not matter as long as it is a valid one.
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-21-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The GICv5 CPU interface implements support for PE-Private Peripheral
Interrupts (PPI), that are handled (enabled/prioritized/delivered)
entirely within the CPU interface hardware.
To enable PPI interrupts, implement the baseline GICv5 host kernel
driver infrastructure required to handle interrupts on a GICv5 system.
Add the exception handling code path and definitions for GICv5
instructions.
Add GICv5 PPI handling code as a specific IRQ domain to:
- Set-up PPI priority
- Manage PPI configuration and state
- Manage IRQ flow handler
- IRQs allocation/free
- Hook-up a PPI specific IRQchip to provide the relevant methods
PPI IRQ priority is chosen as the minimum allowed priority by the
system design (after probing the number of priority bits implemented
by the CPU interface).
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-20-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
syzbot reports that defer/local task_work adding via msg_ring can hit
a request that has been freed:
CPU: 1 UID: 0 PID: 19356 Comm: iou-wrk-19354 Not tainted 6.16.0-rc4-syzkaller-00108-g17bbde2e1716 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0xd2/0x2b0 mm/kasan/report.c:521
kasan_report+0x118/0x150 mm/kasan/report.c:634
io_req_local_work_add io_uring/io_uring.c:1184 [inline]
__io_req_task_work_add+0x589/0x950 io_uring/io_uring.c:1252
io_msg_remote_post io_uring/msg_ring.c:103 [inline]
io_msg_data_remote io_uring/msg_ring.c:133 [inline]
__io_msg_ring_data+0x820/0xaa0 io_uring/msg_ring.c:151
io_msg_ring_data io_uring/msg_ring.c:173 [inline]
io_msg_ring+0x134/0xa00 io_uring/msg_ring.c:314
__io_issue_sqe+0x17e/0x4b0 io_uring/io_uring.c:1739
io_issue_sqe+0x165/0xfd0 io_uring/io_uring.c:1762
io_wq_submit_work+0x6e9/0xb90 io_uring/io_uring.c:1874
io_worker_handle_work+0x7cd/0x1180 io_uring/io-wq.c:642
io_wq_worker+0x42f/0xeb0 io_uring/io-wq.c:696
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
which is supposed to be safe with how requests are allocated. But msg
ring requests alloc and free on their own, and hence must defer freeing
to a sane time.
Add an rcu_head and use kfree_rcu() in both spots where requests are
freed. Only the one in io_msg_tw_complete() is strictly required as it
has been visible on the other ring, but use it consistently in the other
spot as well.
This should not cause any other issues outside of KASAN rightfully
complaining about it.
Link: https://lore.kernel.org/io-uring/686cd2ea.a00a0220.338033.0007.GAE@google.com/
Reported-by: syzbot+54cbbfb4db9145d26fc2@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Fixes: 0617bb500bfa ("io_uring/msg_ring: improve handling of target CQE posting")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The ARMv9.2 architecture introduces the optional Branch Record Buffer
Extension (BRBE), which records information about branches as they are
executed into set of branch record registers. BRBE is similar to x86's
Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
(BHRB).
BRBE supports filtering by exception level and can filter just the
source or target address if excluded to avoid leaking privileged
addresses. The h/w filter would be sufficient except when there are
multiple events with disjoint filtering requirements. In this case, BRBE
is configured with a union of all the events' desired branches, and then
the recorded branches are filtered based on each event's filter. For
example, with one event capturing kernel events and another event
capturing user events, BRBE will be configured to capture both kernel
and user branches. When handling event overflow, the branch records have
to be filtered by software to only include kernel or user branch
addresses for that event. In contrast, x86 simply configures LBR using
the last installed event which seems broken.
It is possible on x86 to configure branch filter such that no branches
are ever recorded (e.g. -j save_type). For BRBE, events with a
configuration that will result in no samples are rejected.
Recording branches in KVM guests is not supported like x86. However,
perf on x86 allows requesting branch recording in guests. The guest
events are recorded, but the resulting branches are all from the host.
For BRBE, events with branch recording and "exclude_host" set are
rejected. Requiring "exclude_guest" to be set did not work. The default
for the perf tool does set "exclude_guest" if no exception level
options are specified. However, specifying kernel or user events
defaults to including both host and guest. In this case, only host
branches are recorded.
BRBE can support some additional exception branch types compared to
x86. On x86, all exceptions other than syscalls are recorded as IRQ.
With BRBE, it is possible to better categorize these exceptions. One
limitation relative to x86 is we cannot distinguish a syscall return
from other exception returns. So all exception returns are recorded as
ERET type. The FIQ branch type is omitted as the only FIQ user is Apple
platforms which don't support BRBE. The debug branch types are omitted
as there is no clear need for them.
BRBE records are invalidated whenever events are reconfigured, a new
task is scheduled in, or after recording is paused (and the records
have been recorded for the event). The architecture allows branch
records to be invalidated by the PE under implementation defined
conditions. It is expected that these conditions are rare.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
tested-by: Adam Young <admiyo@os.amperecomputing.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org
[will: Fix sparse warnings about mixed declarations and code.
Fix C99 comment syntax.]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The virtio specification are introducing support for GSO over UDP
tunnel.
This patch brings in the needed defines and the additional virtio hdr
parsing/building helpers.
The UDP tunnel support uses additional fields in the virtio hdr, and such
fields location can change depending on other negotiated features -
specifically VIRTIO_NET_F_HASH_REPORT.
Try to be as conservative as possible with the new field validation.
Existing implementation for plain GSO offloads allow for invalid/
self-contradictory values of such fields. With GSO over UDP tunnel we can
be more strict, with no need to deal with legacy implementation.
Since the checksum-related field validation is asymmetric in the driver
and in the device, introduce a separate helper to implement the new checks
(to be used only on the driver side).
Note that while the feature space exceeds the 64-bit boundaries, the
guest offload space is fixed by the specification of the
VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command to a 64-bit size.
Prior to the UDP tunnel GSO support, each guest offload bit corresponded
to the feature bit with the same value and vice versa.
Due to the limited 'guest offload' space, relevant features in the high
64 bits are 'mapped' to free bits in the lower range. That is simpler
than defining a new command (and associated features) to exchange an
extended guest offloads set.
As a consequence, the uAPIs also specify the mapped guest offload value
corresponding to the UDP tunnel GSO features.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
--
v4 -> v5:
- avoid lines above 80 chars
v3 -> v4:
- fixed offset for UDP GSO tunnel, update accordingly the helpers
- tried to clarified vlan_hlen semantic
- virtio_net_chk_data_valid() -> virtio_net_handle_csum_offload()
v2 -> v3:
- add definitions for possible vnet hdr layouts with tunnel support
v1 -> v2:
- 'relay' -> 'rely' typo
- less unclear comment WRT enforced inner GSO checks
- inner header fields are allowed only with 'modern' virtio,
thus are always le
- clarified in the commit message the need for 'mapped features'
defines
- assume little_endian is true when UDP GSO is enabled.
- fix inner proto type value
|
|
The virtio specifications allows for up to 128 bits for the
device features. Soon we are going to use some of the 'extended'
bits features (above 64) for the virtio_net driver.
Extend the virtio pci modern driver to support configuring the full
virtio features range, replacing the unrolled loops reading and
writing the features space with explicit one bounded to the actual
features space size in word and implementing the get_extended_features
callback.
Note that in vp_finalize_features() we only need to cache the lower
64 features bits, to process the transport features.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The virtio specifications allows for up to 128 bits for the
device features. Soon we are going to use some of the 'extended'
bits features (above 64) for the virtio_net driver.
Introduce extended features as a fixed size array of u64. To minimize
the diffstat allows legacy driver to access the low 64 bits via a
transparent union.
Introduce an extended get_extended_features configuration callback
that devices supporting the extended features range must implement in
place of the traditional one.
Note that legacy and transport features don't need any change, as
they are always in the low 64 bit range.
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
is enabled
Now that we have an own config symbol for the PHY package module,
we can use it to reduce size of these structs if it isn't enabled.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/f0daefa4-406a-4a06-a4f0-7e31309f82bc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since its introduction in commit 2e910b95329c ("net: Add a function to
splice pages into an skbuff for MSG_SPLICE_PAGES"), skb_splice_from_iter()
never used the @gfp argument. Remove it and adapt callers.
No functional change intended.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250702-splice-drop-unused-v3-2-55f68b60d2b7@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-next
Updates for v6.17
CI:
- uprev mesa and ci-templates
- use shallow clone to speed up build jobs
- remove sdm845/cheza jobs. These runners are no more (RIP
dear chezas)
- fix runner tag for i915 cml runners
- uprev igt to pull in msm test fixes
Core:
- VM_BIND support!
- single source of truth for UBWC configuration. Adds a global soc
driver for UBWC config which is used from display and GPU. (And
later vidc/camera/etc)
- Decouple ties between GPU and KMS, adding a `separate_gpu_kms`
modparam to allow the GPU and KMS to bind to separate DRM devices.
This should better deal with more exotic SoC configurations where
the number of GPUs is different from number of DPUs. The default
behavior is to still come up as a single unified DRM device to
avoid surprising userspace.
DP:
- major rework of the I/O accessors
DPU:
- use version checks instead of feature bits
- SM8750 support
- set min_prefill_lines for SC8180X
DSI:
- SM8750 support
GPU:
- speedbin support for X1-85
- X1-45 support
MDSS:
- SM8750 support
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Robin Clark <robin.clark@oss.qualcomm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACSVV0217R+kpoWQJeuYGHf6q_4aFyEJuKa=dZZKOnLQzFwppg@mail.gmail.com
|
|
The combination of spinlock_t lock and seqcount_spinlock_t seq
in struct fs_struct is an open-coded seqlock_t (see linux/seqlock_types.h).
Combine and switch to equivalent seqlock_t primitives. AFAICS,
that does end up with the same sequence of underlying operations in all
cases.
While we are at it, get_fs_pwd() is open-coded verbatim in
get_path_from_fd(); rather than applying conversion to it, replace with
the call of get_fs_pwd() there. Not worth splitting the commit for that,
IMO...
A bit of historical background - conversion of seqlock_t to
use of seqcount_spinlock_t happened several months after the same
had been done to struct fs_struct; switching fs_struct to seqlock_t
could've been done immediately after that, but it looks like nobody
had gotten around to that until now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/20250702053437.GC1880847@ZenIV
Acked-by: Ahmed S. Darwish <darwi@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU speculation fixes from Borislav Petkov:
"Add the mitigation logic for Transient Scheduler Attacks (TSA)
TSA are new aspeculative side channel attacks related to the execution
timing of instructions under specific microarchitectural conditions.
In some cases, an attacker may be able to use this timing information
to infer data from other contexts, resulting in information leakage.
Add the usual controls of the mitigation and integrate it into the
existing speculation bugs infrastructure in the kernel"
* tag 'tsa_x86_bugs_for_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/process: Move the buffer clearing before MONITOR
x86/microcode/AMD: Add TSA microcode SHAs
KVM: SVM: Advertise TSA CPUID bits to guests
x86/bugs: Add a Transient Scheduler Attacks mitigation
x86/bugs: Rename MDS machinery to something more generic
|
|
Merge series from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>:
This is prepare to hiding snd_soc_dapm_context inside soc-dapm.c
|
|
The dev_pm_domain_attach() function is typically used in bus code
alongside dev_pm_domain_detach(), often following patterns like:
static int bus_probe(struct device *_dev)
{
struct bus_driver *drv = to_bus_driver(dev->driver);
struct bus_device *dev = to_bus_device(_dev);
int ret;
// ...
ret = dev_pm_domain_attach(_dev, true);
if (ret)
return ret;
if (drv->probe)
ret = drv->probe(dev);
// ...
}
static void bus_remove(struct device *_dev)
{
struct bus_driver *drv = to_bus_driver(dev->driver);
struct bus_device *dev = to_bus_device(_dev);
if (drv->remove)
drv->remove(dev);
dev_pm_domain_detach(_dev);
}
When the driver's probe function uses devres-managed resources that
depend on the power domain state, those resources are released later
during device_unbind_cleanup().
Releasing devres-managed resources that depend on the power domain state
after detaching the device from its PM domain can cause failures.
For example, if the driver uses devm_pm_runtime_enable() in its probe
function, and the device's clocks are managed by the PM domain, then
during removal the runtime PM is disabled in device_unbind_cleanup()
after the clocks have been removed from the PM domain. It may happen
that the devm_pm_runtime_enable() action causes the device to be runtime-
resumed. If the driver specific runtime PM APIs access registers directly,
this will lead to accessing device registers without clocks being enabled.
Similar issues may occur with other devres actions that access device
registers.
Add detach_power_off member to struct dev_pm_info, to be used
later in device_unbind_cleanup() as the power_off argument for
dev_pm_domain_detach(). This is a preparatory step toward removing
dev_pm_domain_detach() calls from bus remove functions. Since the current
PM domain detach functions (genpd_dev_pm_detach() and acpi_dev_pm_detach())
already set dev->pm_domain = NULL, there should be no issues with bus
drivers that still call dev_pm_domain_detach() in their remove functions.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20250703112708.1621607-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Calling dev_pm_domain_attach()/dev_pm_domain_detach() in bus driver
probe/remove functions can affect system behavior when the drivers
attached to the bus use devres-managed resources. Since devres actions
may need to access device registers, calling dev_pm_domain_detach() too
early, i.e., before these actions complete, can cause failures on some
systems. One such example is Renesas RZ/G3S SoC-based platforms.
If the device clocks are managed via PM domains, invoking
dev_pm_domain_detach() in the bus driver's remove function removes the
device's clocks from the PM domain, preventing any subsequent
pm_runtime_resume*() calls from enabling those clocks.
The second argument of dev_pm_domain_attach() specifies whether the PM
domain should be powered on during attachment. Likewise, the second
argument of dev_pm_domain_detach() indicates whether the domain should be
powered off during detachment.
Upcoming changes address the issue described above (initially for the
platform bus only) by deferring the call to dev_pm_domain_detach() until
after devres_release_all() in device_unbind_cleanup(). The detach_power_off
field in struct dev_pm_info stores the detach power off info from the
second argument of dev_pm_domain_attach().
Because there are cases where the device's PM domain power-on/off behavior
must be conditional (e.g., in i2c_device_probe()), the patch introduces
PD_FLAG_ATTACH_POWER_ON and PD_FLAG_DETACH_POWER_OFF flags to be passed
to dev_pm_domain_attach().
Finally, dev_pm_domain_attach() and its users are updated to use the newly
introduced PD_FLAG_ATTACH_POWER_ON and PD_FLAG_DETACH_POWER_OFF macros.
This change is preparatory.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> # I2C
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20250703112708.1621607-2-claudiu.beznea.uj@bp.renesas.com
[ rjw: Changelog adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If THP is disabled and when a block device with logical block size >
page size is present, the following null ptr deref panic happens during
boot:
[ [13.2 mK AOSAN: null-ptr-deref in range [0x0000000000000000-0x0000000000K0 0 0[07]
[ 13.017749] RIP: 0010:create_empty_buffers+0x3b/0x380
<snip>
[ 13.025448] Call Trace:
[ 13.025692] <TASK>
[ 13.025895] block_read_full_folio+0x610/0x780
[ 13.026379] ? __pfx_blkdev_get_block+0x10/0x10
[ 13.027008] ? __folio_batch_add_and_move+0x1fa/0x2b0
[ 13.027548] ? __pfx_blkdev_read_folio+0x10/0x10
[ 13.028080] filemap_read_folio+0x9b/0x200
[ 13.028526] ? __pfx_filemap_read_folio+0x10/0x10
[ 13.029030] ? __filemap_get_folio+0x43/0x620
[ 13.029497] do_read_cache_folio+0x155/0x3b0
[ 13.029962] ? __pfx_blkdev_read_folio+0x10/0x10
[ 13.030381] read_part_sector+0xb7/0x2a0
[ 13.030805] read_lba+0x174/0x2c0
<snip>
[ 13.045348] nvme_scan_ns+0x684/0x850 [nvme_core]
[ 13.045858] ? __pfx_nvme_scan_ns+0x10/0x10 [nvme_core]
[ 13.046414] ? _raw_spin_unlock+0x15/0x40
[ 13.046843] ? __switch_to+0x523/0x10a0
[ 13.047253] ? kvm_clock_get_cycles+0x14/0x30
[ 13.047742] ? __pfx_nvme_scan_ns_async+0x10/0x10 [nvme_core]
[ 13.048353] async_run_entry_fn+0x96/0x4f0
[ 13.048787] process_one_work+0x667/0x10a0
[ 13.049219] worker_thread+0x63c/0xf60
As large folio support depends on THP, only allow bs > ps block devices
if THP is enabled.
Fixes: 47dd67532303 ("block/bdev: lift block size restrictions to 64k")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20250704092134.289491-1-p.raghav@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Allow specifying __arg_untrusted for void */char */int */long *
parameters. Treat such parameters as
PTR_TO_MEM|MEM_RDONLY|PTR_UNTRUSTED of size zero.
Intended usage is as follows:
int memcmp(char *a __arg_untrusted, char *b __arg_untrusted, size_t n) {
bpf_for(i, 0, n) {
if (a[i] - b[i]) // load at any offset is allowed
return a[i] - b[i];
}
return 0;
}
Allocate register id for ARG_PTR_TO_MEM parameters only when
PTR_MAYBE_NULL is set. Register id for PTR_TO_MEM is used only to
propagate non-null status after conditionals.
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250704230354.1323244-8-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Merge series from Sakari Ailus <sakari.ailus@linux.intel.com>:
Late last year I posted a set to switch to __pm_runtime_mark_last_busy()
and gradually get rid of explicit pm_runtime_mark_last_busy() calls in
drivers, embedding them in the appropriate pm_runtime_*autosuspend*()
calls. The overall feedback I got at the time was that this is an
unnecessary intermediate step, and removing the
pm_runtime_mark_last_busy() calls can be done after adding them to the
relevant Runtime PM autosuspend related functions.
|
|
This driver has long outlived it's utility, and it's broken and unloved.
The main use case for this was direct mount with UDF of cd-rw drives
that required 32kb packets. It would collect writes into that size and
write them out in multiples of that. That's not a common use case
anymore, the world has moved on from those kinds of media. To make
matters worse, it's actively breaking setups where it's not even
required or useful.
Link: https://lore.kernel.org/linux-block/fxg6dksau4jsk3u5xldlyo2m7qgiux6vtdrz5rywseotsouqdv@urcrwz6qtd3r/
Link: https://lore.kernel.org/linux-block/dcc4836e-6da9-4208-ad27-bbd44b3a2063@kernel.dk/
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Power supply extensions might want to interact with the underlying
power supply to retrieve data like serial numbers, charging status
and more. However doing so causes psy->extensions_sem to be locked
twice, possibly causing a deadlock.
Provide special variants of power_supply_get/set_property() that
ignore any power supply extensions and thus do not touch the
associated psy->extensions_sem lock.
Suggested-by: Hans de Goede <hansg@kernel.org>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Hans de Goede <hansg@kernel.org>
Link: https://lore.kernel.org/r/20250627205124.250433-1-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
* mlx5-next:
net/mlx5: Check device memory pointer before usage
net/mlx5: fs, fix RDMA TRANSPORT init cleanup flow
net/mlx5: Add IFC bits for PCIe Congestion Event object
net/mlx5: Small refactor for general object capabilities
|
|
Add a simple auto cleanup method for struct cred.
Link: https://lore.kernel.org/20250612-work-coredump-massage-v1-19-315c0c34ba94@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
mac80211 identifies a short beacon by the presence of the next
TBTT field, however the standard actually doesn't explicitly state that
the next TBTT can't be in a long beacon or even that it is required in
a short beacon - and as a result this validation does not work for all
vendor implementations.
The standard explicitly states that an S1G long beacon shall contain
the S1G beacon compatibility element as the first element in a beacon
transmitted at a TBTT that is not a TSBTT (Target Short Beacon
Transmission Time) as per IEEE80211-2024 11.1.3.10.1. This is validated
by 9.3.4.3 Table 9-76 which states that the S1G beacon compatibility
element is only allowed in the full set and is not allowed in the
minimum set of elements permitted for use within short beacons.
Correctly identify short beacons by the lack of an S1G beacon
compatibility element as the first element in an S1G beacon frame.
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results")
Signed-off-by: Simon Wadsworth <simon@morsemicro.com>
Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20250701075541.162619-1-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into gpio/for-next
Runtime PM updates related to autosuspend for 6.17
Make several autosuspend functions mark last busy stamp and update
the documentation accordingly (Sakari Ailus).
|
|
As the first step in removing the fields specific to the gpio-mmio
module from struct gpio_chip, we introduce a new set of generic GPIO
chip interfaces that are meant to replace the existing bgpio_ ones.
The new initialization function - gpio_generic_chip_init() - takes a
configuration structure as argument instead of 9 separate parameters.
This will allow easy extension if needed in the future. We hide the
locking details behind a set of helpers in order to be able to move the
raw spinlock out of struct gpio_chip without the users noticing.
For now, the new APIs just wrap the existing ones. Once all users have
been converted to the new interfaces, we'll pull them into gpio-mmio and
implement them in a backward-compatible way while also moving all fields
specific to the generic GPIO chip into struct gpio_generic_chip.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250702-gpio-mmio-rework-v2-1-6b77aab684d8@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Immutable branch between GPIO, MFD and ARM-SoC for v6.17-rc1
Remove struct bgpio_pdata after converting its users to generic device
properties.
|
|
With no more users, we can now remove struct bgpio_pdata. Move the
relevant bits from bgpio_parse_fw() into bgpio_pdev_probe() while
maintaining the logical ordering (get flags before calling
bgpio_init()).
Link: https://lore.kernel.org/r/20250701-gpio-mmio-pdata-v2-6-ebf34d273497@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
The WFHWSIZE constant defines the maximum size for the hardware-specific
waveform representation buffer. It is currently local to
drivers/pwm/core.c, which makes it inaccessible to external tools like
bindgen.
Move the constant to include/linux/pwm.h to make it part of the public
API. As part of this change, rename it to PWM_WFHWSIZE to follow
standard kernel conventions for namespacing macros in public headers.
This allows bindgen to automatically generate a corresponding constant
for the Rust PWM abstractions, ensuring the value remains synchronized
between the C core and Rust code and preventing future maintenance
issues.
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Link: https://lore.kernel.org/r/20250702-rust-next-pwm-working-fan-for-sending-v7-1-67ef39ff1d29@samsung.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
With this change each pwmchip defining the new-style waveform callbacks
can be accessed from userspace via a character device. Compared to the
sysfs-API this is faster and allows to pass the whole configuration in a
single ioctl allowing atomic application and thus reducing glitches.
On an STM32MP13 I see:
root@DistroKit:~ time pwmtestperf
real 0m 1.27s
user 0m 0.02s
sys 0m 1.21s
root@DistroKit:~ rm /dev/pwmchip0
root@DistroKit:~ time pwmtestperf
real 0m 3.61s
user 0m 0.27s
sys 0m 3.26s
pwmtestperf does essentially:
for i in 0 .. 50000:
pwm_set_waveform(duty_length_ns=i, period_length_ns=50000, duty_offset_ns=0)
and in the presence of /dev/pwmchip0 is uses the ioctls introduced here,
without that device it uses /sys/class/pwm/pwmchip0.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/ad4a4e49ae3f8ea81e23cac1ac12b338c3bf5c5b.1746010245.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
Runtime PM updates related to autosuspend for 6.17
Make several autosuspend functions mark last busy stamp and update
the documentation accordingly (Sakari Ailus).
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
We need the USB fixes in here as well to build on top of for other
changes that depend on them.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make the values a bit more meaningful.
This commit is intentionally cross-subsystem to ease review, as the
patchset is intended to be merged together, with a maintainer
consensus.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/660981/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
|
|
This bit is set iff the UBWC version is 1.0. That notably does not
include QCM2290's "no UBWC".
This commit is intentionally cross-subsystem to ease review, as the
patchset is intended to be merged together, with a maintainer
consensus.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/660971/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
|
|
As discussed a lot in the past, the UBWC config must be coherent across
a number of IP blocks (currently display and GPU, but it also may/will
concern camera/video as the drivers evolve).
So far, we've been trying to keep the values reasonable in each of the
two drivers separately, but it really make sense to do so centrally,
especially given certain fields (e.g. HBB) may need to be gathered
dynamically.
To reduce room for error, move to fetching the config from a central
source, so that the data programmed into the hardware is consistent
across all multimedia blocks that request it.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/660963/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
|
|
Add a file that will serve as a single source of truth for UBWC
configuration data for various multimedia blocks.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/660959/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address system suspend failures under memory pressure in some
configurations, fix up RAPL handling on platforms where PL1 cannot be
disabled, and fix a documentation typo:
- Prevent the Intel RAPL power capping driver from allowing PL1 to be
exceeded by mistake on systems when PL1 cannot be disabled (Zhang
Rui)
- Fix a typo in the ABI documentation (Sumanth Gavini)
- Allow swap to be used a bit longer during system suspend and
hibernation to avoid suspend failures under memory pressure (Mario
Limonciello)"
* tag 'pm-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: docs: Replace "diasble" with "disable"
powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed
PM: Restrict swap use to later in the suspend sequence
|
|
Merge series from Sakari Ailus <sakari.ailus@linux.intel.com>:
Late last year I posted a set to switch to __pm_runtime_mark_last_busy()
and gradually get rid of explicit pm_runtime_mark_last_busy() calls in
drivers, embedding them in the appropriate pm_runtime_*autosuspend*()
calls. The overall feedback I got at the time was that this is an
unnecessary intermediate step, and removing the
pm_runtime_mark_last_busy() calls can be done after adding them to the
relevant Runtime PM autosuspend related functions.
|