Age | Commit message (Collapse) | Author |
|
In order to synchronize changes that can affect the thread callback,
introduce an interface to force a flush of the inject workqueue. The
irqfd pointer is only valid under spinlock, but the workqueue cannot
be flushed under spinlock. Therefore the flush work for the irqfd is
queued under spinlock. The vfio_irqfd_cleanup_wq workqueue is re-used
for queuing this work such that flushing the workqueue is also ordered
relative to shutdown.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240308230557.805580-4-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Mask operations through config space changes to DisINTx may race INTx
configuration changes via ioctl. Create wrappers that add locking for
paths outside of the core interrupt code.
In particular, irq_type is updated holding igate, therefore testing
is_intx() requires holding igate. For example clearing DisINTx from
config space can otherwise race changes of the interrupt configuration.
This aligns interfaces which may trigger the INTx eventfd into two
camps, one side serialized by igate and the other only enabled while
INTx is configured. A subsequent patch introduces synchronization for
the latter flows.
Cc: <stable@vger.kernel.org>
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240308230557.805580-3-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Currently for devices requiring masking at the irqchip for INTx, ie.
devices without DisINTx support, the IRQ is enabled in request_irq()
and subsequently disabled as necessary to align with the masked status
flag. This presents a window where the interrupt could fire between
these events, resulting in the IRQ incrementing the disable depth twice.
This would be unrecoverable for a user since the masked flag prevents
nested enables through vfio.
Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
is never auto-enabled, then unmask as required.
Cc: <stable@vger.kernel.org>
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240308230557.805580-2-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The current logic for handling resets is more complicated than it needs
to be. The deferred_reset flag is used to indicate a reset is needed
and the deferred_reset_state is the requested, post-reset, state.
Also, the deferred_reset logic was added to vfio migration drivers to
prevent a circular locking dependency with respect to mm_lock and state
mutex. This is mainly because of the copy_to/from_user() functions(which
takes mm_lock) invoked under state mutex.
Remove all of the deferred reset logic and just pass the requested
next state to pds_vfio_reset() so it can be used for VMM and DSC
initiated resets.
This removes the need for pds_vfio_state_mutex_lock(), so remove that
and replace its use with a simple mutex_unlock().
Also, remove the reset_mutex as it's no longer needed since the
state_mutex can be the driver's primary protector.
Suggested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Link: https://lore.kernel.org/r/20240308182149.22036-3-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
It's possible the migration file is accessed after reset when it has
been cleaned up, especially when it's initiated by the device. This is
because the driver doesn't rip out the filep when cleaning up it only
frees the related page structures and sets its local struct
pds_vfio_lm_file pointer to NULL. This can cause a NULL pointer
dereference, which is shown in the example below during a restore after
a device initiated reset:
BUG: kernel NULL pointer dereference, address: 000000000000000c
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:pds_vfio_get_file_page+0x5d/0xf0 [pds_vfio_pci]
[...]
Call Trace:
<TASK>
pds_vfio_restore_write+0xf6/0x160 [pds_vfio_pci]
vfs_write+0xc9/0x3f0
? __fget_light+0xc9/0x110
ksys_write+0xb5/0xf0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
[...]
Add a disabled flag to the driver's struct pds_vfio_lm_file that gets
set during cleanup. Then make sure to check the flag when the migration
file is accessed via its file_operations. By default this flag will be
false as the memory for struct pds_vfio_lm_file is kzalloc'd, which means
the struct pds_vfio_lm_file is enabled and accessible. Also, since the
file_operations and driver's migration file cleanup happen under the
protection of the same pds_vfio_lm_file.lock, using this flag is thread
safe.
Fixes: 8512ed256334 ("vfio/pds: Always clear the save/restore FDs on reset")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Link: https://lore.kernel.org/r/20240308182149.22036-2-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/79d3df42fe5b359a05b8061631e72e5ed249b234.1709886922.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Enable live migration only once the firmware supports PRE_COPY.
PRE_COPY has been supported by the firmware for a long time already [1]
and is required to achieve a low downtime upon live migration.
This lets us clean up some old code that is not applicable those days
while PRE_COPY is fully supported by the firmware.
[1] The minimum firmware version that supports PRE_COPY is 28.36.1010,
it was released in January 2023.
No firmware without PRE_COPY support ever available to users.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20240306105624.114830-1-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the mbochs_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240301-class_cleanup-vfio-v1-2-9236d69083f5@marliere.net
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the mdpy_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240301-class_cleanup-vfio-v1-1-9236d69083f5@marliere.net
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The deferred_reset logic was added to vfio migration drivers to prevent
a circular locking dependency with respect to mm_lock and state mutex.
This is mainly because of the copy_to/from_user() functions(which takes
mm_lock) invoked under state mutex. But for HiSilicon driver, the only
place where we now hold the state mutex for copy_to_user is during the
PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
updated the data and perform copy_to_user without state mutex. By this,
we can get rid of the deferred_reset logic.
Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240229091152.56664-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
This reverts commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4.
This was a heinous workaround and it turns out it's been fixed in mm
twice since it was introduced. Most recently, commit c8070b787519
("mm: Don't pin ZERO_PAGE in pin_user_pages()") would have prevented
running up the zeropage refcount, but even before that commit
84209e87c696 ("mm/gup: reliable R/O long-term pinning in COW mappings")
avoids the vfio use case from pinning the zeropage at all, instead
replacing it with exclusive anonymous pages.
Remove this now useless overhead.
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240229223544.257207-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The NVIDIA Grace Hopper GPUs have device memory that is supposed to be
used as a regular RAM. It is accessible through CPU-GPU chip-to-chip
cache coherent interconnect and is present in the system physical
address space. The device memory is split into two regions - termed
as usemem and resmem - in the system physical address space,
with each region mapped and exposed to the VM as a separate fake
device BAR [1].
Owing to a hardware defect for Multi-Instance GPU (MIG) feature [2],
there is a requirement - as a workaround - for the resmem BAR to
display uncached memory characteristics. Based on [3], on system with
FWB enabled such as Grace Hopper, the requisite properties
(uncached, unaligned access) can be achieved through a VM mapping (S1)
of NORMAL_NC and host mapping (S2) of MT_S2_FWB_NORMAL_NC.
KVM currently maps the MMIO region in S2 as MT_S2_FWB_DEVICE_nGnRE by
default. The fake device BARs thus displays DEVICE_nGnRE behavior in the
VM.
The following table summarizes the behavior for the various S1 and S2
mapping combinations for systems with FWB enabled [3].
S1 | S2 | Result
NORMAL_NC | NORMAL_NC | NORMAL_NC
NORMAL_NC | DEVICE_nGnRE | DEVICE_nGnRE
Recently a change was added that modifies this default behavior and
make KVM map MMIO as MT_S2_FWB_NORMAL_NC when a VMA flag
VM_ALLOW_ANY_UNCACHED is set [4]. Setting S2 as MT_S2_FWB_NORMAL_NC
provides the desired behavior (uncached, unaligned access) for resmem.
To use VM_ALLOW_ANY_UNCACHED flag, the platform must guarantee that
no action taken on the MMIO mapping can trigger an uncontained
failure. The Grace Hopper satisfies this requirement. So set
the VM_ALLOW_ANY_UNCACHED flag in the VMA.
Applied over next-20240227.
base-commit: 22ba90670a51
Link: https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/ [1]
Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu/ [2]
Link: https://developer.arm.com/documentation/ddi0487/latest/ section D8.5.5 [3]
Link: https://lore.kernel.org/all/20240224150546.368-1-ankita@nvidia.com/ [4]
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Vikram Sethi <vsethi@nvidia.com>
Cc: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240229193934.2417-1-ankita@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux into v6.9/vfio/next
|
|
Obviously drivers/vfio/platform/vfio_amba.c started its life as a
simplified copy of drivers/dma/pl330.c, but not all variable names were
updated.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1d1b873b59b208547439225aee1f24d6f2512a1f.1708945194.git.geert+renesas@glider.be
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
After reset the VFIO device state will always be put in
VFIO_DEVICE_STATE_RUNNING, but the save/restore files will only be
cleared if the previous state was VFIO_DEVICE_STATE_ERROR. This
can/will cause the restore/save files to be leaked if/when the
migration state machine transitions through the states that
re-allocates these files. Fix this by always clearing the
restore/save files for resets.
Fixes: 7dabb1bcd177 ("vfio/pds: Add support for firmware recovery")
Cc: stable@vger.kernel.org
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240228003205.47311-2-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64,
allowing KVM stage 2 device mapping attributes to use Normal-NC
rather than DEVICE_nGnRE, which allows guest mappings supporting
write-combining attributes (WC). ARM does not architecturally
guarantee this is safe, and indeed some MMIO regions like the GICv2
VCPU interface can trigger uncontained faults if Normal-NC is used.
To safely use VFIO in KVM the platform must guarantee full safety
in the guest where no action taken against a MMIO mapping can
trigger an uncontained failure. The expectation is that most VFIO PCI
platforms support this for both mapping types, at least in common
flows, based on some expectations of how PCI IP is integrated. So
make vfio-pci set the VM_ALLOW_ANY_UNCACHED flag.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240224150546.368-5-ankita@nvidia.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
To provide VM with the ability to get device IO memory with NormalNC
property, map device MMIO in KVM for ARM64 at stage2 as NormalNC.
Having NormalNC S2 default puts guests in control (based on [1],
"Combining stage 1 and stage 2 memory type attributes") of device
MMIO regions memory mappings. The rules are summarized below:
([(S1) - stage1], [(S2) - stage 2])
S1 | S2 | Result
NORMAL-WB | NORMAL-NC | NORMAL-NC
NORMAL-WT | NORMAL-NC | NORMAL-NC
NORMAL-NC | NORMAL-NC | NORMAL-NC
DEVICE<attr> | NORMAL-NC | DEVICE<attr>
Still this cannot be generalized to non PCI devices such as GICv2.
There is insufficient information and uncertainity in the behavior
of non PCI driver. A driver must indicate support using the
new flag VM_ALLOW_ANY_UNCACHED.
Adapt KVM to make use of the flag VM_ALLOW_ANY_UNCACHED as indicator to
activate the S2 setting to NormalNc.
[1] section D8.5.5 of DDI0487J_a_a-profile_architecture_reference_manual.pdf
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240224150546.368-4-ankita@nvidia.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64, allowing KVM
stage 2 device mapping attributes to use NormalNC rather than
DEVICE_nGnRE, which allows guest mappings supporting write-combining
attributes (WC). ARM does not architecturally guarantee this is safe,
and indeed some MMIO regions like the GICv2 VCPU interface can trigger
uncontained faults if NormalNC is used.
Even worse, the expectation is that there are platforms where even
DEVICE_nGnRE can allow uncontained faults in corner cases. Unfortunately
existing ARM IP requires platform integration to take responsibility to
prevent this.
To safely use VFIO in KVM the platform must guarantee full safety in the
guest where no action taken against a MMIO mapping can trigger an
uncontained failure. The assumption is that most VFIO PCI platforms
support this for both mapping types, at least in common flows, based
on some expectations of how PCI IP is integrated. This can be enabled
more broadly, for instance into vfio-platform drivers, but only after
the platform vendor completes auditing for safety.
The VMA flag VM_ALLOW_ANY_UNCACHED was found to be the simplest and
cleanest way to communicate the information from VFIO to KVM that
mapping the region in S2 as NormalNC is safe. KVM consumes it to
activate the code that does the S2 mapping as NormalNC.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240224150546.368-3-ankita@nvidia.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Currently, KVM for ARM64 maps at stage 2 memory that is considered device
(i.e. it is not RAM) with DEVICE_nGnRE memory attributes; this setting
overrides (as per the ARM architecture [1]) any device MMIO mapping
present at stage 1, resulting in a set-up whereby a guest operating
system cannot determine device MMIO mapping memory attributes on its
own but it is always overridden by the KVM stage 2 default.
This set-up does not allow guest operating systems to select device
memory attributes independently from KVM stage-2 mappings
(refer to [1], "Combining stage 1 and stage 2 memory type attributes"),
which turns out to be an issue in that guest operating systems
(e.g. Linux) may request to map devices MMIO regions with memory
attributes that guarantee better performance (e.g. gathering
attribute - that for some devices can generate larger PCIe memory
writes TLPs) and specific operations (e.g. unaligned transactions)
such as the NormalNC memory type.
The default device stage 2 mapping was chosen in KVM for ARM64 since
it was considered safer (i.e. it would not allow guests to trigger
uncontained failures ultimately crashing the machine) but this
turned out to be asynchronous (SError) defeating the purpose.
Failures containability is a property of the platform and is independent
from the memory type used for MMIO device memory mappings.
Actually, DEVICE_nGnRE memory type is even more problematic than
Normal-NC memory type in terms of faults containability in that e.g.
aborts triggered on DEVICE_nGnRE loads cannot be made, architecturally,
synchronous (i.e. that would imply that the processor should issue at
most 1 load transaction at a time - it cannot pipeline them - otherwise
the synchronous abort semantics would break the no-speculation attribute
attached to DEVICE_XXX memory).
This means that regardless of the combined stage1+stage2 mappings a
platform is safe if and only if device transactions cannot trigger
uncontained failures and that in turn relies on platform capabilities
and the device type being assigned (i.e. PCIe AER/DPC error containment
and RAS architecture[3]); therefore the default KVM device stage 2
memory attributes play no role in making device assignment safer
for a given platform (if the platform design adheres to design
guidelines outlined in [3]) and therefore can be relaxed.
For all these reasons, relax the KVM stage 2 device memory attributes
from DEVICE_nGnRE to Normal-NC.
The NormalNC was chosen over a different Normal memory type default
at stage-2 (e.g. Normal Write-through) to avoid cache allocation/snooping.
Relaxing S2 KVM device MMIO mappings to Normal-NC is not expected to
trigger any issue on guest device reclaim use cases either (i.e. device
MMIO unmap followed by a device reset) at least for PCIe devices, in that
in PCIe a device reset is architected and carried out through PCI config
space transactions that are naturally ordered with respect to MMIO
transactions according to the PCI ordering rules.
Having Normal-NC S2 default puts guests in control (thanks to
stage1+stage2 combined memory attributes rules [1]) of device MMIO
regions memory mappings, according to the rules described in [1]
and summarized here ([(S1) - stage1], [(S2) - stage 2]):
S1 | S2 | Result
NORMAL-WB | NORMAL-NC | NORMAL-NC
NORMAL-WT | NORMAL-NC | NORMAL-NC
NORMAL-NC | NORMAL-NC | NORMAL-NC
DEVICE<attr> | NORMAL-NC | DEVICE<attr>
It is worth noting that currently, to map devices MMIO space to user
space in a device pass-through use case the VFIO framework applies memory
attributes derived from pgprot_noncached() settings applied to VMAs, which
result in device-nGnRnE memory attributes for the stage-1 VMM mappings.
This means that a userspace mapping for device MMIO space carried
out with the current VFIO framework and a guest OS mapping for the same
MMIO space may result in a mismatched alias as described in [2].
Defaulting KVM device stage-2 mappings to Normal-NC attributes does not
change anything in this respect, in that the mismatched aliases would
only affect (refer to [2] for a detailed explanation) ordering between
the userspace and GuestOS mappings resulting stream of transactions
(i.e. it does not cause loss of property for either stream of
transactions on its own), which is harmless given that the userspace
and GuestOS access to the device is carried out through independent
transactions streams.
A Normal-NC flag is not present today. So add a new kvm_pgtable_prot
(KVM_PGTABLE_PROT_NORMAL_NC) flag for it, along with its
corresponding PTE value 0x5 (0b101) determined from [1].
Lastly, adapt the stage2 PTE property setter function
(stage2_set_prot_attr) to handle the NormalNC attribute.
The entire discussion leading to this patch series may be followed through
the following links.
Link: https://lore.kernel.org/all/20230907181459.18145-3-ankita@nvidia.com
Link: https://lore.kernel.org/r/20231205033015.10044-1-ankita@nvidia.com
[1] section D8.5.5 - DDI0487J_a_a-profile_architecture_reference_manual.pdf
[2] section B2.8 - DDI0487J_a_a-profile_architecture_reference_manual.pdf
[3] sections 1.7.7.3/1.8.5.2/appendix C - DEN0029H_SBSA_7.1.pdf
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240224150546.368-2-ankita@nvidia.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
for the on-chip GPU that is the logical OS representation of the
internal proprietary chip-to-chip cache coherent interconnect.
The device is peculiar compared to a real PCI device in that whilst
there is a real 64b PCI BAR1 (comprising region 2 & region 3) on the
device, it is not used to access device memory once the faster
chip-to-chip interconnect is initialized (occurs at the time of host
system boot). The device memory is accessed instead using the chip-to-chip
interconnect that is exposed as a contiguous physically addressable
region on the host. This device memory aperture can be obtained from host
ACPI table using device_property_read_u64(), according to the FW
specification. Since the device memory is cache coherent with the CPU,
it can be mmap into the user VMA with a cacheable mapping using
remap_pfn_range() and used like a regular RAM. The device memory
is not added to the host kernel, but mapped directly as this reduces
memory wastage due to struct pages.
There is also a requirement of a minimum reserved 1G uncached region
(termed as resmem) to support the Multi-Instance GPU (MIG) feature [1].
This is to work around a HW defect. Based on [2], the requisite properties
(uncached, unaligned access) can be achieved through a VM mapping (S1)
of NORMAL_NC and host (S2) mapping with MemAttr[2:0]=0b101. To provide
a different non-cached property to the reserved 1G region, it needs to
be carved out from the device memory and mapped as a separate region
in Qemu VMA with pgprot_writecombine(). pgprot_writecombine() sets the
Qemu VMA page properties (pgprot) as NORMAL_NC.
Provide a VFIO PCI variant driver that adapts the unique device memory
representation into a more standard PCI representation facing userspace.
The variant driver exposes these two regions - the non-cached reserved
(resmem) and the cached rest of the device memory (termed as usemem) as
separate VFIO 64b BAR regions. This is divergent from the baremetal
approach, where the device memory is exposed as a device memory region.
The decision for a different approach was taken in view of the fact that
it would necessiate additional code in Qemu to discover and insert those
regions in the VM IPA, along with the additional VM ACPI DSDT changes to
communicate the device memory region IPA to the VM workloads. Moreover,
this behavior would have to be added to a variety of emulators (beyond
top of tree Qemu) out there desiring grace hopper support.
Since the device implements 64-bit BAR0, the VFIO PCI variant driver
maps the uncached carved out region to the next available PCI BAR (i.e.
comprising of region 2 and 3). The cached device memory aperture is
assigned BAR region 4 and 5. Qemu will then naturally generate a PCI
device in the VM with the uncached aperture reported as BAR2 region,
the cacheable as BAR4. The variant driver provides emulation for these
fake BARs' PCI config space offset registers.
The hardware ensures that the system does not crash when the memory
is accessed with the memory enable turned off. It synthesis ~0 reads
and dropped writes on such access. So there is no need to support the
disablement/enablement of BAR through PCI_COMMAND config space register.
The memory layout on the host looks like the following:
devmem (memlength)
|--------------------------------------------------|
|-------------cached------------------------|--NC--|
| |
usemem.memphys resmem.memphys
PCI BARs need to be aligned to the power-of-2, but the actual memory on the
device may not. A read or write access to the physical address from the
last device PFN up to the next power-of-2 aligned physical address
results in reading ~0 and dropped writes. Note that the GPU device
driver [6] is capable of knowing the exact device memory size through
separate means. The device memory size is primarily kept in the system
ACPI tables for use by the VFIO PCI variant module.
Note that the usemem memory is added by the VM Nvidia device driver [5]
to the VM kernel as memblocks. Hence make the usable memory size memblock
(MEMBLK_SIZE) aligned. This is a hardwired ABI value between the GPU FW and
VFIO driver. The VM device driver make use of the same value for its
calculation to determine USEMEM size.
Currently there is no provision in KVM for a S2 mapping with
MemAttr[2:0]=0b101, but there is an ongoing effort to provide the same [3].
As previously mentioned, resmem is mapped pgprot_writecombine(), that
sets the Qemu VMA page properties (pgprot) as NORMAL_NC. Using the
proposed changes in [3] and [4], KVM marks the region with
MemAttr[2:0]=0b101 in S2.
If the device memory properties are not present, the driver registers the
vfio-pci-core function pointers. Since there are no ACPI memory properties
generated for the VM, the variant driver inside the VM will only use
the vfio-pci-core ops and hence try to map the BARs as non cached. This
is not a problem as the CPUs have FWB enabled which blocks the VM
mapping's ability to override the cacheability set by the host mapping.
This goes along with a qemu series [6] to provides the necessary
implementation of the Grace Hopper Superchip firmware specification so
that the guest operating system can see the correct ACPI modeling for
the coherent GPU device. Verified with the CUDA workload in the VM.
[1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu/
[2] section D8.5.5 of https://developer.arm.com/documentation/ddi0487/latest/
[3] https://lore.kernel.org/all/20240211174705.31992-1-ankita@nvidia.com/
[4] https://lore.kernel.org/all/20230907181459.18145-2-ankita@nvidia.com/
[5] https://github.com/NVIDIA/open-gpu-kernel-modules
[6] https://lore.kernel.org/all/20231203060245.31593-1-ankita@nvidia.com/
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Zhi Wang <zhi.wang.linux@gmail.com>
Signed-off-by: Aniket Agashe <aniketa@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240220115055.23546-4-ankita@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
range_intersect_range determines an overlap between two ranges. If an
overlap, the helper function returns the overlapping offset and size.
The VFIO PCI variant driver emulates the PCI config space BAR offset
registers. These offset may be accessed for read/write with a variety
of lengths including sub-word sizes from sub-word offsets. The driver
makes use of this helper function to read/write the targeted part of
the emulated register.
Make this a vfio_pci_core function, rename and export as GPL. Also
update references in virtio driver.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240220115055.23546-3-ankita@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
do_io_rw() is used to read/write to the device MMIO. The grace hopper
VFIO PCI variant driver require this functionality to read/write to
its memory.
Rename this as vfio_pci_core functions and export as GPL.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20240220115055.23546-2-ankita@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the mdev_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Link: https://lore.kernel.org/r/20240208-bus_cleanup-vfio-v1-1-ed5da3019949@marliere.net
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The vfio-pci virtio variant entry slipped in out of order.
Link: https://lore.kernel.org/r/20240205235427.2103714-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Let firmware knows upon leaving PRE_COPY back to RUNNING as of some
error in the target/migration cancellation.
This will let firmware cleaning its internal resources that were turned
on upon PRE_COPY.
The flow is based on the device specification in this area.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://lore.kernel.org/r/20240205124828.232701-6-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Block incremental query which is state-dependent once the migration file
was previously marked with state error.
This may prevent redundant calls to firmware upon PRE_COPY which will
end-up with a failure and a syndrome printed in dmesg.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://lore.kernel.org/r/20240205124828.232701-5-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
The SAVE command uses the async command interface over the PF.
Upon a failure in the firmware -EREMOTEIO is returned.
In that case call mlx5_cmd_out_err() to let it print the command failure
details including the firmware syndrome.
Note:
The other commands in the driver use the sync command interface in a way
that a firmware syndrome is printed upon an error inside mlx5_core.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://lore.kernel.org/r/20240205124828.232701-4-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Add support for tracker object change event by referring to its
MLX5_EVENT_TYPE_OBJECT_CHANGE event when occurs.
This lets the driver recognize whether the firmware moved the tracker
object to an error state.
In that case, the driver will skip/block any usage of that object
including an early exit in case the object was previously marked with an
error.
This functionality also covers the case when no CQE is delivered as of
the error state.
The driver was adapted to the device specification to handle the above.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://lore.kernel.org/r/20240205124828.232701-3-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Add the IFC related bits for query tracker.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://lore.kernel.org/r/20240205124828.232701-2-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.
This is a blocking notifier callback, so errno isn't a proper return
value. Use WARN_ON to small allocation failures.
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Link: https://lore.kernel.org/r/20240115063434.20278-1-chentao@kylinos.cn
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Reformat nested if-conditionals in Makefiles with 4 spaces
- Fix CONFIG_DEBUG_INFO_BTF builds for big endian
- Fix modpost for module srcversion
- Fix an escape sequence warning in gen_compile_commands.py
- Fix kallsyms to ignore ARMv4 thunk symbols
* tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kallsyms: ignore ARMv4 thunks along with others
modpost: trim leading spaces when processing source files list
gen_compile_commands: fix invalid escape sequence warning
kbuild: Fix changing ELF file type for output of gen_btf for big endian
docs: kconfig: Fix grammar and formatting
kbuild: use 4-space indentation when followed by conditionals
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Use a GB page for identity mapping only when memory of this size is
requested so that mapping of reserved regions is prevented which
would otherwise lead to system crashes on UV machines
* tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Fix GICv4.1 affinity update
- Restore a quirk for ACPI-based GICv4 systems
- Handle non-coherent GICv4 redistributors properly
- Prevent spurious interrupts on Broadcom devices using GIC v3
architecture
- Other minor fixes
* tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems
irqchip/gic-v3-its: Handle non-coherent GICv4 redistributors
irqchip/qcom-mpm: Fix IS_ERR() vs NULL check in qcom_mpm_init()
irqchip/loongson-eiointc: Use correct struct type in eiointc_domain_alloc()
irqchip/irq-brcmstb-l2: Add write memory barrier before exit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two fixes for i801 and qcom-geni devices. Meanwhile, a fix from Arnd
addresses a compilation error encountered during compile test on
powerpc"
* tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i801: Fix block process call transactions
i2c: pasemi: split driver into two separate modules
i2c: qcom-geni: Correct I2C TRE sequence
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"This is a bit of a big batch for rc4, but just due to holiday hangover
and because I didn't send any fixes last week due to a late revert
request. I think next week should be back to normal.
- Fix ftrace bug on boot caused by exit text sections with
'-fpatchable-function-entry'
- Fix accuracy of stolen time on pseries since the switch to
VIRT_CPU_ACCOUNTING_GEN
- Fix a crash in the IOMMU code when doing DLPAR remove
- Set pt_regs->link on scv entry to fix BPF stack unwinding
- Add missing PPC_FEATURE_BOOKE on 64-bit e5500/e6500, which broke
gdb
- Fix boot on some 6xx platforms with STRICT_KERNEL_RWX enabled
- Fix build failures with KASAN enabled and 32KB stack size
- Some other minor fixes
Thanks to Arnd Bergmann, Benjamin Gray, Christophe Leroy, David
Engraf, Gaurav Batra, Jason Gunthorpe, Jiangfeng Xiao, Matthias
Schiffer, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A,
R Nageswara Sastry, Shivaprasad G Bhat, Shrikanth Hegde, Spoorthy,
Srikar Dronamraju, and Venkat Rao Bagalkote"
* tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach
powerpc/pseries: fix accuracy of stolen time
powerpc/ftrace: Ignore ftrace locations in exit text sections
powerpc/cputable: Add missing PPC_FEATURE_BOOKE on PPC64 Book-E
powerpc/kasan: Limit KASAN thread size increase to 32KB
Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"
powerpc: 85xx: mark local functions static
powerpc: udbg_memcons: mark functions static
powerpc/kasan: Fix addr error caused by page alignment
powerpc/6xx: set High BAT Enable flag on G2_LE cores
selftests/powerpc/papr_vpd: Check devfd before get_system_loc_code()
powerpc/64: Set task pt_regs->link to the LR value on scv entry
powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add
powerpc/pseries/papr-sysparm: use u8 arrays for payloads
|
|
Pull bcachefs fixes from Kent Overstreet:
"Mostly pretty trivial, the user visible ones are:
- don't barf when replicas_required > replicas
- fix check_version_upgrade() so it doesn't do something nonsensical
when we're downgrading"
* tag 'bcachefs-2024-02-17' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix missing va_end()
bcachefs: Fix check_version_upgrade()
bcachefs: Clamp replicas_required to replicas
bcachefs: fix missing endiannes conversion in sb_members
bcachefs: fix kmemleak in __bch2_read_super error handling path
bcachefs: Fix missing bch2_err_class() calls
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some driver core fixes, a kobject fix, and a documentation
update for 6.8-rc5. In detail these changes are:
- devlink fixes for reported issues with 6.8-rc1
- topology scheduling regression fix that has been reported by many
- kobject loosening of checks change in -rc1 is now reverted as some
codepaths seemed to need the checks
- documentation update for the CVE process. Has been reviewed by
many, the last minute change to the document was to bring the .rst
format back into the the new style rules, the contents did not
change.
All of these, except for the documentation update, have been in
linux-next for over a week. The documentation update has been reviewed
for weeks by a group of developers, and in public for a week and the
wording has stabilized for now. If future changes are needed, we can
do so before 6.8-final is out (or anytime after that)"
* tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation: Document the Linux Kernel CVE process
Revert "kobject: Remove redundant checks for whether ktype is NULL"
driver core: fw_devlink: Improve logs for cycle detection
driver core: fw_devlink: Improve detection of overlapping cycles
driver core: Fix device_link_flag_is_sync_state_only()
topology: Set capacity_freq_ref in all cases
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / miscdriver fixes from Greg KH:
"Here is a small set of char/misc and IIO driver fixes for 6.8-rc5.
Included in here are:
- lots of iio driver fixes for reported issues
- nvmem device naming fixup for reported problem
- interconnect driver fixes for reported issues
All of these have been in linux-next for a while with no reported the
issues (the nvmem patch was included in a different branch in
linux-next before sent to me for inclusion here)"
* tag 'char-misc-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
nvmem: include bit index in cell sysfs file name
iio: adc: ad4130: only set GPIO_CTRL if pin is unused
iio: adc: ad4130: zero-initialize clock init data
interconnect: qcom: x1e80100: Add missing ACV enable_mask
interconnect: qcom: sm8650: Use correct ACV enable_mask
iio: accel: bma400: Fix a compilation problem
iio: commom: st_sensors: ensure proper DMA alignment
iio: hid-sensor-als: Return 0 for HID_USAGE_SENSOR_TIME_TIMESTAMP
iio: move LIGHT_UVA and LIGHT_UVB to the end of iio_modifier
staging: iio: ad5933: fix type mismatch regression
iio: humidity: hdc3020: fix temperature offset
iio: adc: ad7091r8: Fix error code in ad7091r8_gpio_setup()
iio: adc: ad_sigma_delta: ensure proper DMA alignment
iio: imu: adis: ensure proper DMA alignment
iio: humidity: hdc3020: Add Makefile, Kconfig and MAINTAINERS entry
iio: imu: bno055: serdev requires REGMAP
iio: magnetometer: rm3100: add boundary check for the value read from RM3100_REG_TMRC
iio: pressure: bmp280: Add missing bmp085 to SPI id table
iio: core: fix memleak in iio_device_register_sysfs
interconnect: qcom: sm8550: Enable sync_state
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are three small tty and serial driver fixes for 6.8-rc5:
- revert a 8250_pci1xxxx off-by-one change that was incorrect
- two changes to fix the transmit path of the mxs-auart driver,
fixing a regression in the 6.2 release
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: mxs-auart: fix tx
serial: core: introduce uart_port_tx_flags()
serial: 8250_pci1xxxx: partially revert off by one patch
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt fixes from Greg KH:
"Here are two small fixes for 6.8-rc5:
- thunderbolt to fix a reported issue on many platforms
- dwc3 driver revert of a commit that caused problems in -rc1
Both of these changes have been in linux-next for over a week with no
reported issues"
* tag 'usb-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: dwc3: Support EBC feature of DWC_usb31"
thunderbolt: Fix setting the CNS bit in ROUTER_CS_5
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- regression fix for rkisp1 shared IRQ logic
- fix atomisp breakage due to a kAPI change
- permission fix for remote controller BPF support
- memleak fix in ir_toy driver
- Kconfig dependency fix for pwm-ir-rx
* tag 'media/v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: pwm-ir-tx: Depend on CONFIG_HIGH_RES_TIMERS
media: ir_toy: fix a memleak in irtoy_tx
media: rc: bpf attach/detach requires write permission
media: atomisp: Adjust for v4l2_subdev_state handling changes in 6.8
media: rkisp1: Fix IRQ handling due to shared interrupts
media: Revert "media: rkisp1: Drop IRQF_SHARED"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Keep bridges in D0 if we need to poll downstream devices for PME to
resolve a v6.6 regression where we failed to enumerate devices below
bridges put in D3hot by runtime PM, e.g., NVMe drives connected via
Thunderbolt or USB4 docks (Alex Williamson)
- Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer
* tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
MAINTAINERS: Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer
PCI: Fix active state requirement in PME polling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- tracing/probes: Fix BTF structure member finder to find the members
which are placed after any anonymous union member correctly.
* tag 'probes-fixes-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Fix to search structure fields correctly
|
|
Pull smb client fixes from Steve French:
"Five smb3 client fixes, most also for stable:
- Two multichannel fixes (one to fix potential handle leak on retry)
- Work around possible serious data corruption (due to change in
folios in 6.3, for cases when non standard maximum write size
negotiated)
- Symlink creation fix
- Multiuser automount fix"
* tag '6.8-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: Fix regression in writes when non-standard maximum write size negotiated
smb: client: handle path separator of created SMB symlinks
smb: client: set correct id, uid and cruid for multiuser automounts
cifs: update the same create_guid on replay
cifs: fix underflow in parse_server_interfaces()
|
|
The Linux kernel project now has the ability to assign CVEs to fixed
issues, so document the process and how individual developers can get a
CVE if one is not automatically assigned for their fixes.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/2024021731-essence-sadness-28fd@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix to search a field from the structure which has anonymous union
correctly.
Since the reference `type` pointer was updated in the loop, the search
loop suddenly aborted where it hits an anonymous union. Thus it can not
find the field after the anonymous union. This avoids updating the
cursor `type` pointer in the loop.
Link: https://lore.kernel.org/all/170791694361.389532.10047514554799419688.stgit@devnote2/
Fixes: 302db0f5b3d8 ("tracing/probes: Add a function to search a member of a struct/union")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
Three fixes are included here. Two are strictly hardware-related
for the i801 and qcom-geni devices. Meanwhile, a fix from Arnd
addresses a compilation error encountered during compile test on
powerpc.
|
|
Since I have been contributing to the driver for a while and wish to help
with the review process, add myself as a reviewer.
Link: https://lore.kernel.org/r/20240216065926.473805-1-s-vadapalli@ti.com
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes: the two fnic ones are a revert and a refix, which is why
the diffstat is a bit big. The target one also extracts a function to
add a check for configuration and so looks bigger than it is"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fnic: Move fnic_fnic_flush_tx() to a work queue
scsi: Revert "scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock"
scsi: target: Fix unmap setup during configuration
|