Age | Commit message (Collapse) | Author |
|
We unconditionally enable FEAT_MOPS, which is obviously wrong.
So let's only do that when it is advertised to the guest.
Which means we need to rely on a per-vcpu HCRX_EL2 shadow register.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-25-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
No AMU? No AMU! IF we see an AMU-related trap, let's turn it into
an UNDEF!
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-24-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
As part of the ongoing effort to honor the guest configuration,
add the necessary checks to make PIR_EL1 and co UNDEF if not
advertised to the guest, and avoid context switching them.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-23-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Outer Shareable and Range TLBI instructions shouldn't be made available
to the guest if they are not advertised. Use FGU to disable those,
and set HCR_EL2.TLBIOS in the case the host doesn't have FGT. Note
that in that later case, we cannot efficiently disable TLBI Range
instructions, as this would require to trap all TLBIs.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-22-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The way we save/restore HFG[RW]TR_EL2 can now be simplified, and
the Ampere erratum hack is the only thing that still stands out.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-21-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We already trap a bunch of existing features for the purpose of
disabling them (MAIR2, POR, ACCDATA, SME...).
Let's move them over to our brand new FGU infrastructure.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-20-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to correctly honor our FGU bits, they must be converted
into a set of FGT bits. They get merged as part of the existing
FGT setting.
Similarly, the UNDEF injection phase takes place when handling
the trap.
This results in a bit of rework in the FGT macros in order to
help with the code generation, as burying per-CPU accesses in
macros results in a lot of expansion, not to mention the vcpu->kvm
access on nvhe (kern_hyp_va() is not optimisation-friendly).
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-19-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to efficiently handle system register access being disabled,
and this resulting in an UNDEF exception being injected, we introduce
the (slightly dubious) concept of Fine-Grained UNDEF, modeled after
the architectural Fine-Grained Traps.
For each FGT group, we keep a 64 bit word that has the exact same
bit assignment as the corresponding FGT register, where a 1 indicates
that trapping this register should result in an UNDEF exception being
reinjected.
So far, nothing populates this information, nor sets the corresponding
trap bits.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-18-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
__check_nv_sr_forward() is not specific to NV anymore, and does
a lot more. Rename it to triage_sysreg_trap(), making it plain
that its role is to handle where an exception is to be handled.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-17-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Since we always start sysreg/sysinsn handling by searching the
xarray, use it as the source of the index in the correct sys_reg_desc
array.
This allows some cleanup, such as moving the handling of unknown
sysregs in a single location.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to reduce the number of lookups that we have to perform
when handling a sysreg, register each AArch64 sysreg descriptor
with the global xarray. The index of the descriptor is stored
as a 10 bit field in the data word.
Subsequent patches will retrieve and use the stored index.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-15-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
As we are going to rely more and more on the global xarray that
contains the trap configuration, always populate it, even in the
non-NV case.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
As NV results in a bunch of system instructions being trapped, it makes
sense to pull the system instructions into their own little array, where
they will eventually be joined by AT, TLBI and a bunch of other CMOs.
Based on an initial patch by Jintack Lim.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-13-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Now that we don't use xa_store_range() anymore, drop the added
complexity of XARRAY_MULTI for KVM. It is likely still pulled
in by other bits of the kernel though.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to be able to store different values for member of an
encoding range, replace xa_store_range() calls with discrete
xa_store() calls and an encoding iterator.
We end-up using a bit more memory, but we gain some flexibility
that we will make use of shortly.
Take this opportunity to tidy up the error handling path.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Negative trap bits are a massive pain. They are, on the surface,
indistinguishable from RES0 bits. Do you trap? or do you ignore?
Thankfully, we now have the right infrastructure to check for RES0
bits as long as the register is backed by VNCR, which is the case
for the FGT registers.
Use that information as a discriminant when handling a trap that
is potentially caused by a FGT.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
There is no reason to have separate FGT group identifiers for
the debug fine grain trapping. The sole requirement is to provide
the *names* so that the SR_FGF() macro can do its magic of picking
the correct bit definition.
So let's alias HDFGWTR_GROUP and HDFGRTR_GROUP.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Now that we have the infrastructure to enforce a sanitised register
value depending on the VM configuration, drop the helper that only
used the architectural RES0 value.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Just like its little friends, HCRX_EL2 gets the feature set treatment
when backed by VNCR.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Fine Grained Traps are controlled by a whole bunch of features.
Each one of them must be checked and the corresponding masks
computed so that we don't let the guest apply traps it shouldn't
be using.
This takes care of HFG[IRW]TR_EL2, HDFG[RW]TR_EL2, and HAFGRTR_EL2.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We can now start making use of our sanitising masks by setting them
to values that depend on the guest's configuration.
First up are VTTBR_EL2, VTCR_EL2, VMPIDR_EL2 and HCR_EL2.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
VNCR-backed "registers" are actually only memory. Which means that
there is zero control over what the guest can write, and that it
is the hypervisor's job to actually sanitise the content of the
backing store. Yeah, this is fun.
In order to preserve some form of sanity, add a repainting mechanism
that makes use of a per-VM set of RES0/RES1 masks, one pair per VNCR
register. These masks get applied on access to the backing store via
__vcpu_sys_reg(), ensuring that the state that is consumed by KVM is
correct.
So far, nothing populates these masks, but stay tuned.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to make it easier to check whether a particular feature
is exposed to a guest, add a new set of helpers, with kvm_has_feat()
being the most useful.
Let's start making use of them in the PMU code (courtesy of Oliver).
Follow-up changes will introduce additional use patterns.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed--by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Despite having the control bits for FEAT_SPECRES and FEAT_PACM,
the ID registers fields are either incomplete or missing.
Fix it.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
On my EliteBook 840 G8 Notebook PC (ProdId 5S7R6EC#ABD; built 2022 for
german market) the Mute LED is always on. The mute button itself works
as expected. alsa-info.sh shows a different subsystem-id 0x8ab9 for
Realtek ALC285 Codec, thus the existing quirks for HP 840 G8 don't work.
Therefore, add a new quirk for this type of EliteBook.
Signed-off-by: Hans Peter <flurry123@gmx.ch>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240219164518.4099-1-flurry123@gmx.ch
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The driver must write 0 to HALO_STATE before sending the SYSTEM_RESET
command to the firmware.
HALO_STATE is in DSP memory, which is preserved across a soft reset.
The SYSTEM_RESET command does not change the value of HALO_STATE.
There is period of time while the CS35L56 is resetting, before the
firmware has started to boot, where a read of HALO_STATE will return
the value it had before the SYSTEM_RESET. If the driver does not
clear HALO_STATE, this would return BOOT_DONE status even though the
firmware has not booted.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file")
Link: https://msgid.link/r/20240216140535.1434933-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
While calculating the hardware interrupt number for a MSI interrupt, the
higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI
domain number gets truncated because of the shifted value casting to return
type of pci_domain_nr() which is 'int'. This for example is resulting in
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.
To address this cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.
Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems i.e. the 32-bit systems continue to
have the issue. Since the issue surfaces only if there are too many PCIe
controllers in the system which usually is the case in modern server
systems and they don't tend to run 32-bit kernels.
Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com
|
|
RISC-V PLIC cannot "end-of-interrupt" (EOI) disabled interrupts, as
explained in the description of Interrupt Completion in the PLIC spec:
"The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that *is currently enabled* for the target, the
completion is silently ignored."
Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
ensured that EOI is successful by enabling interrupt first, before EOI.
Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
operations") removed the interrupt enabling code from the previous
commit, because it assumes that interrupt should already be enabled at the
point of EOI.
However, this is incorrect: there is a window after a hart claiming an
interrupt and before irq_desc->lock getting acquired, interrupt can be
disabled during this window. Thus, EOI can be invoked while the interrupt
is disabled, effectively nullify this EOI. This results in the interrupt
never gets asserted again, and the device who uses this interrupt appears
frozen.
Make sure that interrupt is really enabled before EOI.
Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Samuel Holland <samuel@sholland.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: linux-riscv@lists.infradead.org
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240131081933.144512-1-namcao@linutronix.de
|
|
The debugfs `update_policy` file is created before
amd_pmf_start_policy_engine() has completed, and thus there could be
a possible (albeit unlikely) race between sideloading a policy and the
BIOS policy getting setup.
Move the debugfs file creation after all BIOS policy is setup.
Fixes: 10817f28e533 ("platform/x86/amd/pmf: Add capability to sideload of policy binary")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Closes: https://lore.kernel.org/platform-driver-x86/15df7d02-b0aa-457a-954a-9d280a592843@redhat.com/T/#m2c445f135e5ef9b53184be7fc9df84e15f89d4d9
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240217015642.113806-1-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
amd_pmf_init_smart_pc() calls out to amd_pmf_get_bios_buffer() but
the error handling flow doesn't clean everything up allocated
memory.
As amd_pmf_get_bios_buffer() is only called by amd_pmf_init_smart_pc(),
fold it into the function and add labels to clean up any step that
can fail along the way. Explicitly set everything allocated to NULL as
there are other features that may access some of the same variables.
Fixes: 7c45534afa44 ("platform/x86/amd/pmf: Add support for PMF Policy Binary")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240217014107.113749-3-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
If a machine advertises Smart PC support but is missing policy data
show a debugging message to help clarify why Smart PC wasn't enabled.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240217014107.113749-2-mario.limonciello@amd.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
The buffer is cleared in the suspend handler but used in
the delayed work for amd_pmf_get_metrics().
Stop clearing it to fix the hang.
Reported-by: Trolli Schmittlauch <t.schmittlauch@orlives.de>
Closes: https://lore.kernel.org/regressions/ed2226ff-257b-4cfd-afd6-bf3be9785474@localhost/
Closes: https://community.frame.work/t/kernel-6-8-rc-system-freezes-after-resuming-from-suspend-reproducers-wanted/45381
Fixes: 2b3a7f06caaf ("platform/x86/amd/pmf: Change return type of amd_pmf_set_dram_addr()")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240217005216.113408-1-mario.limonciello@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
TEE enact command failures are seen after each suspend/resume cycle;
fix this by cancelling the policy builder workqueue before going into
suspend and reschedule the workqueue after resume.
[ 629.516792] ccp 0000:c2:00.2: tee: command 0x5 timed out, disabling PSP
[ 629.516835] amd-pmf AMDI0102:00: TEE enact cmd failed. err: ffff000e, ret:0
[ 630.550464] amd-pmf AMDI0102:00: AMD_PMF_REGISTER_RESPONSE:1
[ 630.550511] amd-pmf AMDI0102:00: AMD_PMF_REGISTER_ARGUMENT:7
[ 630.550548] amd-pmf AMDI0102:00: AMD_PMF_REGISTER_MESSAGE:16
Fixes: ae82cef7d9c5 ("platform/x86/amd/pmf: Add support for PMF-TA interaction")
Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240216064112.962582-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Improve code readability by removing smart_pc_status enum, as the same
can be done with a simple true/false check; Update the code checks
accordingly.
Also add a missing return on amd_pmf_init_smart_pc() success,
to skip trying to setup the auto / slider modes which should
not be used in this case.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240216064112.962582-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Now that prefix matches for ACPI names are supported, the ts_dmi_data
structs for "GDIX1001:00" and "GDIX1001:01" can be consolidated into
a single match matching on "GDIX1001".
For consistency also change gdix1002_00_upside_down_data to match.
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240212120608.30469-2-hdegoede@redhat.com
|
|
On some devices the ACPI name of the touchscreen is e.g. either
MSSL1680:00 or MSSL1680:01 depending on the BIOS version.
This happens for example on the "Chuwi Hi8 Air" tablet where the initial
commit's ts_data uses "MSSL1680:00" but the tablets from the github issue
and linux-hardware.org probe linked below both use "MSSL1680:01".
Replace the strcmp() match on ts_data->acpi_name with a strstarts()
check to allow using a partial match on just the ACPI HID of "MSSL1680"
and change the ts_data->acpi_name for the "Chuwi Hi8 Air" accordingly
to fix the touchscreen not working on models where it is "MSSL1680:01".
Note this drops the length check for I2C_NAME_SIZE. This never was
necessary since the ACPI names used are never more then 11 chars and
I2C_NAME_SIZE is 20 so the replaced strncmp() would always stop long
before reaching I2C_NAME_SIZE.
Link: https://linux-hardware.org/?computer=AC4301C0542A
Fixes: bbb97d728f77 ("platform/x86: touchscreen_dmi: Add info for the Chuwi Hi8 Air tablet")
Closes: https://github.com/onitake/gsl-firmware/issues/91
Cc: stable@vger.kernel.org
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240212120608.30469-1-hdegoede@redhat.com
|
|
Since commit 7a36b901a6eb ("ACPI: OSL: Use a threaded interrupt handler
for SCI") the ACPI OSL code passes IRQF_ONESHOT when requesting the SCI.
Since the INT0002 GPIO is typically shared with the ACPI SCI the INT0002
driver must pass the same flags.
This fixes the INT0002 driver failing to probe due to following error +
as well as removing the backtrace that follows this error:
"genirq: Flags mismatch irq 9. 00000084 (INT0002) vs. 00002080 (acpi)"
Fixes: 7a36b901a6eb ("ACPI: OSL: Use a threaded interrupt handler for SCI")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240210110149.12803-1-hdegoede@redhat.com
|
|
The Lenovo workstations require the password opcode to be run before
the attribute value is changed (if Admin password is enabled).
Tested on some Thinkpads to confirm they are OK with this order too.
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Fixes: 640a5fa50a42 ("platform/x86: think-lmi: Opcode support")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240209152359.528919-1-mpearson-lenovo@squebb.ca
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Incorporate a test case to assess the handling of invalid flags or
task__nullable parameters passed to bpf_iter_task_new(). Prior to the
preceding commit, this scenario could potentially trigger a kernel panic.
However, with the previous commit, this test case is expected to function
correctly.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240217114152.1623-3-laoar.shao@gmail.com
|
|
Failure to initialize it->pos, coupled with the presence of an invalid
value in the flags variable, can lead to it->pos referencing an invalid
task, potentially resulting in a kernel panic. To mitigate this risk, it's
crucial to ensure proper initialization of it->pos to NULL.
Fixes: ac8148d957f5 ("bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/bpf/20240217114152.1623-2-laoar.shao@gmail.com
|
|
bpf_timer_cancel
This selftest is based on a Alexei's test adopted from an internal
user to troubleshoot another bug. During this exercise, a separate
racing bug was discovered between bpf_timer_cancel_and_free
and bpf_timer_cancel. The details can be found in the previous
patch.
This patch is to add a selftest that can trigger the bug.
I can trigger the UAF everytime in my qemu setup with KASAN. The idea
is to have multiple user space threads running in a tight loop to exercise
both bpf_map_update_elem (which calls into bpf_timer_cancel_and_free)
and bpf_timer_cancel.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20240215211218.990808-2-martin.lau@linux.dev
|
|
The following race is possible between bpf_timer_cancel_and_free
and bpf_timer_cancel. It will lead a UAF on the timer->timer.
bpf_timer_cancel();
spin_lock();
t = timer->time;
spin_unlock();
bpf_timer_cancel_and_free();
spin_lock();
t = timer->timer;
timer->timer = NULL;
spin_unlock();
hrtimer_cancel(&t->timer);
kfree(t);
/* UAF on t */
hrtimer_cancel(&t->timer);
In bpf_timer_cancel_and_free, this patch frees the timer->timer
after a rcu grace period. This requires a rcu_head addition
to the "struct bpf_hrtimer". Another kfree(t) happens in bpf_timer_init,
this does not need a kfree_rcu because it is still under the
spin_lock and timer->timer has not been visible by others yet.
In bpf_timer_cancel, rcu_read_lock() is added because this helper
can be used in a non rcu critical section context (e.g. from
a sleepable bpf prog). Other timer->timer usages in helpers.c
have been audited, bpf_timer_cancel() is the only place where
timer->timer is used outside of the spin_lock.
Another solution considered is to mark a t->flag in bpf_timer_cancel
and clear it after hrtimer_cancel() is done. In bpf_timer_cancel_and_free,
it busy waits for the flag to be cleared before kfree(t). This patch
goes with a straight forward solution and frees timer->timer after
a rcu grace period.
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20240215211218.990808-1-martin.lau@linux.dev
|
|
FORTIFY_SOURCE has been ignoring 0-sized destinations while the kernel
code base has been converted to flexible arrays. In order to enforce
the 0-sized destinations (e.g. with __counted_by), the remaining 0-sized
destinations need to be handled. Unfortunately, struct vic_provinfo
resists full conversion, as it contains a flexible array of flexible
arrays, which is only possible with the 0-sized fake flexible array.
Use unsafe_memcpy() to avoid future false positives under
CONFIG_FORTIFY_SOURCE.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Dell Inspiron 16 Plus 7630, similar to its predecessors (7620 models),
experiences an issue with unconnected top speakers. Since the controller
remains unchanged, this commit addresses the problem by correctly
connecting the speakers on NID 0X17 to the DAC on NIC 0x03.
Signed-off-by: Jay Ajit Mate <jay.mate15@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240219100404.9573-1-jay.mate15@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Since there is a utility available for this, use
the API rather than open code.
Fixes: 13943d6c8273 ("ionic: prevent pci disable of already disabled device")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While working on the patchset to remove extent locking I got a lockdep
splat with fiemap and pagefaulting with my new extent lock replacement
lock.
This deadlock exists with our normal code, we just don't have lockdep
annotations with the extent locking so we've never noticed it.
Since we're copying the fiemap extent to user space on every iteration
we have the chance of pagefaulting. Because we hold the extent lock for
the entire range we could mkwrite into a range in the file that we have
mmap'ed. This would deadlock with the following stack trace
[<0>] lock_extent+0x28d/0x2f0
[<0>] btrfs_page_mkwrite+0x273/0x8a0
[<0>] do_page_mkwrite+0x50/0xb0
[<0>] do_fault+0xc1/0x7b0
[<0>] __handle_mm_fault+0x2fa/0x460
[<0>] handle_mm_fault+0xa4/0x330
[<0>] do_user_addr_fault+0x1f4/0x800
[<0>] exc_page_fault+0x7c/0x1e0
[<0>] asm_exc_page_fault+0x26/0x30
[<0>] rep_movs_alternative+0x33/0x70
[<0>] _copy_to_user+0x49/0x70
[<0>] fiemap_fill_next_extent+0xc8/0x120
[<0>] emit_fiemap_extent+0x4d/0xa0
[<0>] extent_fiemap+0x7f8/0xad0
[<0>] btrfs_fiemap+0x49/0x80
[<0>] __x64_sys_ioctl+0x3e1/0xb50
[<0>] do_syscall_64+0x94/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
I wrote an fstest to reproduce this deadlock without my replacement lock
and verified that the deadlock exists with our existing locking.
To fix this simply don't take the extent lock for the entire duration of
the fiemap. This is safe in general because we keep track of where we
are when we're searching the tree, so if an ordered extent updates in
the middle of our fiemap call we'll still emit the correct extents
because we know what offset we were on before.
The only place we maintain the lock is searching delalloc. Since the
delalloc stuff can change during writeback we want to lock the extent
range so we have a consistent view of delalloc at the time we're
checking to see if we need to set the delalloc flag.
With this patch applied we no longer deadlock with my testcase.
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
With the following file extent layout, defrag would do unnecessary IO
and result more on-disk space usage.
# mkfs.btrfs -f $dev
# mount $dev $mnt
# xfs_io -f -c "pwrite 0 40m" $mnt/foobar
# sync
# xfs_io -f -c "pwrite 40m 16k" $mnt/foobar
# sync
Above command would lead to the following file extent layout:
item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
generation 7 type 1 (regular)
extent data disk byte 298844160 nr 41943040
extent data offset 0 nr 41943040 ram 41943040
extent compression 0 (none)
item 7 key (257 EXTENT_DATA 41943040) itemoff 15763 itemsize 53
generation 8 type 1 (regular)
extent data disk byte 13631488 nr 16384
extent data offset 0 nr 16384 ram 16384
extent compression 0 (none)
Which is mostly fine. We can allow the final 16K to be merged with the
previous 40M, but it's upon the end users' preference.
But if we defrag the file using the default parameters, it would result
worse file layout:
# btrfs filesystem defrag $mnt/foobar
# sync
item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
generation 7 type 1 (regular)
extent data disk byte 298844160 nr 41943040
extent data offset 0 nr 8650752 ram 41943040
extent compression 0 (none)
item 7 key (257 EXTENT_DATA 8650752) itemoff 15763 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 340787200 nr 33292288
extent data offset 0 nr 33292288 ram 33292288
extent compression 0 (none)
item 8 key (257 EXTENT_DATA 41943040) itemoff 15710 itemsize 53
generation 8 type 1 (regular)
extent data disk byte 13631488 nr 16384
extent data offset 0 nr 16384 ram 16384
extent compression 0 (none)
Note the original 40M extent is still there, but a new 32M extent is
created for no benefit at all.
[CAUSE]
There is an existing check to make sure we won't defrag a large enough
extent (the threshold is by default 32M).
But the check is using the length to the end of the extent:
range_len = em->len - (cur - em->start);
/* Skip too large extent */
if (range_len >= extent_thresh)
goto next;
This means, for the first 8MiB of the extent, the range_len is always
smaller than the default threshold, and would not be defragged.
But after the first 8MiB, the remaining part would fit the requirement,
and be defragged.
Such different behavior inside the same extent caused the above problem,
and we should avoid different defrag decision inside the same extent.
[FIX]
Instead of using @range_len, just use @em->len, so that we have a
consistent decision among the same file extent.
Now with this fix, we won't touch the extent, thus not making it any
worse.
Reported-by: Filipe Manana <fdmanana@suse.com>
Fixes: 0cb5950f3f3b ("btrfs: fix deadlock when reserving space during defrag")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Doesn't seem to compile on 32b, presumably due to u64 mod/division.
Simplest is to just switch over to u32 here. Also make print modifiers
consistent with that.
Fixes: a64056bb5a32 ("drm/tests/drm_buddy: add alloc_contiguous test")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215174431.285069-7-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Platform clock and phy error resources are not cleaned up in Xilinx GT PHY
error path.
To fix introduce the function ceva_ahci_platform_enable_resources() which
is a customized version of ahci_platform_enable_resources() and inline with
SATA IP programming sequence it does:
- Assert SATA reset
- Program PS GTR phy
- Bring SATA by de-asserting the reset
- Wait for GT lane PLL to be locked
ceva_ahci_platform_enable_resources() is also used in the resume path
as the same SATA programming sequence (as in probe) should be followed.
Also cleanup the mixed usage of ahci_platform_enable_resources() and custom
implementation in the probe function as both are not required.
Fixes: 9a9d3abe24bb ("ata: ahci: ceva: Update the driver to support xilinx GT phy")
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
The ASM1064 SATA host controller always reports wrongly,
that it has 24 ports. But in reality, it only has four ports.
before:
ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled
ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xffff0f impl SATA mode
ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst
after:
ahci 0000:04:00.0: ASM1064 has only four ports
ahci 0000:04:00.0: forcing port_map 0xffff0f -> 0xf
ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled
ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xf impl SATA mode
ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst
Signed-off-by: "Andrey Jr. Melnikov" <temnota.am@gmail.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|