Age | Commit message (Collapse) | Author |
|
These functions are used often, also in selftests.
sscanf() itself is also used by kselftest.h itself.
The implementation is limited and only supports numeric arguments.
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250209-nolibc-scanf-v2-1-c29dea32f1cd@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Pull NFS client fixes from Anna Schumaker:
"Stable Fixes:
- O_DIRECT writes should adjust file length
Other Bugfixes:
- Adjust delegated timestamps for O_DIRECT reads and writes
- Prevent looping due to rpc_signal_task() races
- Fix a deadlock when recovering state on a sillyrenamed file
- Properly handle -ETIMEDOUT errors from tlshd
- Suppress build warnings for unused procfs functions
- Fix memory leak of lsm_contexts"
* tag 'nfs-for-6.14-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
lsm,nfs: fix memory leak of lsm_context
sunrpc: suppress warnings for unused procfs functions
SUNRPC: Handle -ETIMEDOUT return from tlshd
NFSv4: Fix a deadlock when recovering state on a sillyrenamed file
SUNRPC: Prevent looping due to rpc_signal_task() races
NFS: Adjust delegated timestamps for O_DIRECT reads and writes
NFS: O_DIRECT writes must check and adjust the file length
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fixes from Mickaël Salaün:
"Fixes to TCP socket identification, documentation, and tests"
* tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Add binaries to .gitignore
selftests/landlock: Test that MPTCP actions are not restricted
selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP
landlock: Fix non-TCP sockets restriction
landlock: Minor typo and grammar fixes in IPC scoping documentation
landlock: Fix grammar error
selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity fixes from Mimi Zohar:
"One bugfix and one spelling cleanup. The bug fix restores a
performance improvement"
* tag 'integrity-v6.14-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: Reset IMA_NONACTION_RULE_FLAGS after post_setattr
integrity: fix typos and spelling errors
|
|
'alignment'"
This reverts commit 267b21d0bef8e67dbe6c591c9991444e58237ec9.
Turns out some DTs do depend on this behavior. Specifically, a
downstream Pixel 6 DT. Revert the change at least until we can decide if
the DT spec can be changed instead.
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Keep user-forced connector status even if it cannot be programmed. Same
behavior as for the rest of the drivers.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250114100214.195386-1-tzimmermann@suse.de
|
|
Add a selftest to validate the behavior of the NUMA-aware scheduler
functionalities, including idle CPU selection within nodes, per-node
DSQs and CPU to node mapping.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
With CONFIG_DEBUG_RSEQ=y, an in-kernel copy of the read-only fields is
kept synchronized with the user-space fields. Ensure the updates are
done in lockstep in case we error out on a write to user-space.
Fixes: 7d5265ffcd8b ("rseq: Validate read-only fields under DEBUG_RSEQ config")
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20250225202500.731245-1-mjeanson@efficios.com
|
|
If a data sector on an OFS floppy contains a value > 0x1e8 (the
largest amount of data that fits in the sector after its header), then
an Amiga reading the file can return corrupt data, by taking the
overlarge size at its word and reading past the end of the buffer it
read the disk sector into!
The cause: when affs_write_end_ofs() writes data to an OFS filesystem,
the new size field for a data block was computed by adding the amount
of data currently being written (into the block) to the existing value
of the size field. This is correct if you're extending the file at the
end, but if you seek backwards in the file and overwrite _existing_
data, it can lead to the size field being larger than the maximum
legal value.
This commit changes the calculation so that it sets the size field to
the max of its previous size and the position within the block that we
just wrote up to.
Signed-off-by: Simon Tatham <anakin@pobox.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If I write a file to an OFS floppy image, and try to read it back on
an emulated Amiga running Workbench 1.3, the Amiga reports a disk
error trying to read the file. (That is, it's unable to read it _at
all_, even to copy it to the NIL: device. It isn't a matter of getting
the wrong data and being unable to parse the file format.)
This is because the 'sequence number' field in the OFS data block
header is supposed to be based at 1, but affs writes it based at 0.
All three locations changed by this patch were setting the sequence
number to a variable 'bidx' which was previously obtained by dividing
a file position by bsize, so bidx will naturally use 0 for the first
block. Therefore all three should add 1 to that value before writing
it into the sequence number field.
With this change, the Amiga successfully reads the file.
For data block reference: https://wiki.osdev.org/FFS_(Amiga)
Signed-off-by: Simon Tatham <anakin@pobox.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
After commit 92cadedd9d5f ("brcmfmac: Avoid keeping power to SDIO card
unless WOWL is used"), the wifi adapter by default is turned off on
suspend and then re-probed on resume.
This conflicts with some embedded boards that require to remain powered.
They will fail on resume with:
brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
ieee80211 phy1: brcmf_bus_started: failed: -110
ieee80211 phy1: brcmf_attach: dongle is not responding: err=-110
brcmfmac: brcmf_sdio_firmware_callback: brcmf_attach failed
This commit checks for the Device Tree property 'cap-power-off-cards'.
If this property is not set, it means that we do not have the capability
to power off and should therefore remain powered.
Signed-off-by: Matthias Proske <email@matthias-proske.de>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20250212185941.146958-2-email@matthias-proske.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Use rcu_access_pointer() to avoid sparse warning in
drv_remove_interface().
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502130534.bVrZZBK0-lkp@intel.com/
Fixes: 646262c71aca ("wifi: mac80211: remove debugfs dir for virtual monitor")
Link: https://patch.msgid.link/20250213214330.6113-1-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If there's any vendor-specific element in the subelements
then the outer element parsing must not parse any vendor
element at all. This isn't implemented correctly now due
to parsing into the pointers and then overriding them, so
explicitly skip vendor elements if any exist in the sub-
elements (non-transmitted profile or per-STA profile).
Fixes: 671042a4fb77 ("mac80211: support non-inheritance element")
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250221112451.fd71e5268840.I9db3e6a3367e6ff38d052d07dc07005f0dd3bd5c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The code is erroneously applying the non-inheritance element
to the inner elements rather than the outer, which is clearly
completely wrong. Fix it by finding the MLE basic element at
the beginning, and then applying the non-inheritance for the
outer parsing.
While at it, do some general cleanups such as not allowing
callers to try looking for a specific non-transmitted BSS
and link at the same time.
Fixes: 45ebac4f059b ("wifi: mac80211: Parse station profile from association response")
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250221112451.b46d42f45b66.If5b95dc3c80208e0c62d8895fb6152aa54b6620b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.14
More driver specific fixes, the firmware change is part of fixing the
race conditions in the Cirrus driver.
|
|
This fixes a regression introduced a few weeks ago in stable kernels
6.12.14 and 6.13.3. The internal microphone on ASUS Vivobook N705UD /
X705UD laptops is broken: the microphone appears in userspace (e.g.
Gnome settings) but no sound is detected.
I bisected it to commit 3b4309546b48 ("ALSA: hda: Fix headset detection
failure due to unstable sort").
I figured out the cause:
1. The initial pins enabled for the ALC256 driver are:
cfg->inputs == {
{ pin=0x19, type=AUTO_PIN_MIC,
is_headset_mic=1, is_headphone_mic=0, has_boost_on_pin=1 },
{ pin=0x1a, type=AUTO_PIN_MIC,
is_headset_mic=0, is_headphone_mic=0, has_boost_on_pin=1 } }
2. Since 2017 and commits c1732ede5e8 ("ALSA: hda/realtek - Fix headset
and mic on several ASUS laptops with ALC256") and 28e8af8a163 ("ALSA:
hda/realtek: Fix mic and headset jack sense on ASUS X705UD"), the
quirk ALC256_FIXUP_ASUS_MIC is also applied to ASUS X705UD / N705UD
laptops.
This added another internal microphone on pin 0x13:
cfg->inputs == {
{ pin=0x13, type=AUTO_PIN_MIC,
is_headset_mic=0, is_headphone_mic=0, has_boost_on_pin=1 },
{ pin=0x19, type=AUTO_PIN_MIC,
is_headset_mic=1, is_headphone_mic=0, has_boost_on_pin=1 },
{ pin=0x1a, type=AUTO_PIN_MIC,
is_headset_mic=0, is_headphone_mic=0, has_boost_on_pin=1 } }
I don't know what this pin 0x13 corresponds to. To the best of my
knowledge, these laptops have only one internal microphone.
3. Before 2025 and commit 3b4309546b48 ("ALSA: hda: Fix headset
detection failure due to unstable sort"), the sort function would let
the microphone of pin 0x1a (the working one) *before* the microphone
of pin 0x13 (the phantom one).
4. After this commit 3b4309546b48, the fixed sort function puts the
working microphone (pin 0x1a) *after* the phantom one (pin 0x13). As
a result, no sound is detected anymore.
It looks like the quirk ALC256_FIXUP_ASUS_MIC is not needed anymore for
ASUS Vivobook X705UD / N705UD laptops. Without it, everything works
fine:
- the internal microphone is detected and records actual sound,
- plugging in a jack headset is detected and can record actual sound
with it,
- unplugging the jack headset makes the system go back to internal
microphone and can record actual sound.
Cc: stable@vger.kernel.org
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Chris Chiu <chris.chiu@canonical.com>
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort")
Tested-by: Adrien Vergé <adrienverge@gmail.com>
Signed-off-by: Adrien Vergé <adrienverge@gmail.com>
Link: https://patch.msgid.link/20250226135515.24219-1-adrienverge@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The encoder HW/SW state verification should use a SW state which stays
unchanged while the encoder/output is active. The intel_dp::is_mst flag
used during state computation to choose between the DP SST/MST modes can
change while the output is active, if the sink gets disconnected or the
MST topology is removed for another reason. A subsequent state
verification using intel_dp::is_mst leads then to a mismatch if the
output is disabled/re-enabled without recomputing its state.
Use the encoder's active MST link count instead, which will be always
non-zero for an active MST output and will be zero for SST.
Fixes: 35d2e4b75649 ("drm/i915/ddi: start distinguishing 128b/132b SST and MST at state readout")
Fixes: 40d489fac0e8 ("drm/i915/ddi: handle 128b/132b SST in intel_ddi_read_func_ctl()")
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250224093242.1859583-1-imre.deak@intel.com
(cherry picked from commit 0159e311772af9d6598aafe072c020687720f1d7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The async call to __guc_exec_queue_fini_async frees the scheduler
while a submission may time out and restart. To prevent this race
condition, the pending job timer should be canceled before freeing
the scheduler.
V3(MattB):
- Adjust position of cancel pending job
- Remove gitlab issue# from commit message
V2(MattB):
- Cancel pending jobs before scheduler finish
Fixes: a20c75dba192 ("drm/xe: Call __guc_exec_queue_fini_async direct for KERNEL exec_queues")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250225045754.600905-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 18fbd567e75f9b97b699b2ab4f1fa76b7cf268f6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Commit b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h")
introduced an internal set of engine registers, however, as part of this
change, it has also introduced two duplicate `define' lines for
`RING_CTL_SIZE(size)'. This commit was introduced to the tree in v6.8-rc1.
While this is harmless as the definitions did not change, so no compiler
warning was observed.
Drop this line anyway for the sake of correctness.
Cc: stable@vger.kernel.org # v6.8-rc1+
Fixes: b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h")
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250225073104.865230-1-jeffbai@aosc.io
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 6b68c4542ffecc36087a9e14db8fc990c88bb01b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Saves a CALL to an out-of-line thunk for the common case of 1
argument.
Suggested-by: Scott Constable <scott.d.constable@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.927885784@infradead.org
|
|
While WAIT_FOR_ENDBR is specified to be a full speculation stop; it
has been shown that some implementations are 'leaky' to such an extend
that speculation can escape even the FineIBT preamble.
To deal with this, add additional hardening to the FineIBT preamble.
Notably, using a new LLVM feature:
https://github.com/llvm/llvm-project/commit/e223485c9b38a5579991b8cebb6a200153eee245
which encodes the number of arguments in the kCFI preamble's register.
Using this register<->arity mapping, have the FineIBT preamble CALL
into a stub clobbering the relevant argument registers in the
speculative case.
Scott sayeth thusly:
Microarchitectural attacks such as Branch History Injection (BHI) and
Intra-mode Branch Target Injection (IMBTI) [1] can cause an indirect
call to mispredict to an adversary-influenced target within the same
hardware domain (e.g., within the kernel). Instructions at the
mispredicted target may execute speculatively and potentially expose
kernel data (e.g., to a user-mode adversary) through a
microarchitectural covert channel such as CPU cache state.
CET-IBT [2] is a coarse-grained control-flow integrity (CFI) ISA
extension that enforces that each indirect call (or indirect jump)
must land on an ENDBR (end branch) instruction, even speculatively*.
FineIBT is a software technique that refines CET-IBT by associating
each function type with a 32-bit hash and enforcing (at the callee)
that the hash of the caller's function pointer type matches the hash
of the callee's function type. However, recent research [3] has
demonstrated that the conditional branch that enforces FineIBT's hash
check can be coerced to mispredict, potentially allowing an adversary
to speculatively bypass the hash check:
__cfi_foo:
ENDBR64
SUB R10d, 0x01234567
JZ foo # Even if the hash check fails and ZF=0, this branch could still mispredict as taken
UD2
foo:
...
The techniques demonstrated in [3] require the attacker to be able to
control the contents of at least one live register at the mispredicted
target. Therefore, this patch set introduces a sequence of CMOV
instructions at each indirect-callable target that poisons every live
register with data that the attacker cannot control whenever the
FineIBT hash check fails, thus mitigating any potential attack.
The security provided by this scheme has been discussed in detail on
an earlier thread [4].
[1] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html
[2] Intel Software Developer's Manual, Volume 1, Chapter 18
[3] https://www.vusec.net/projects/native-bhi/
[4] https://lore.kernel.org/lkml/20240927194925.707462984@infradead.org/
*There are some caveats for certain processors, see [1] for more info
Suggested-by: Scott Constable <scott.d.constable@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.820402212@infradead.org
|
|
Add an array of code thunks, to be called from the FineIBT preamble,
clobbering the first 'n' argument registers for speculative execution.
Notably the 0th entry will clobber no argument registers and will never
be used, it exists so the array can be naturally indexed, while the 7th
entry will clobber all the 6 argument registers and also RSP in order to
mess up stack based arguments.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.717378681@infradead.org
|
|
Due to concerns about circumvention attacks against FineIBT on 'naked'
ENDBR, add an additional caller side hash check to FineIBT. This
should make it impossible to pivot over such a 'naked' ENDBR
instruction at the cost of an additional load.
The specific pivot reported was against the SYSCALL entry site and
FRED will have all those holes fixed up.
https://lore.kernel.org/linux-hardening/Z60NwR4w%2F28Z7XUa@ubun/
This specific fineibt_paranoid_start[] sequence was concocted by
Scott.
Suggested-by: Scott Constable <scott.d.constable@intel.com>
Reported-by: Jennifer Miller <jmill@asu.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.598033084@infradead.org
|
|
Because overlapping code sequences are all the rage.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.486463917@infradead.org
|
|
Scott notes that non-taken branches are faster. Abuse overlapping code
that traps instead of explicit UD2 instructions.
And LEA does not modify flags and will have less dependencies.
Suggested-by: Scott Constable <scott.d.constable@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.371942555@infradead.org
|
|
The normal fixup in handle_bug() is simply continuing at the next
instruction. However upcoming patches make this the wrong thing, so
allow handlers (specifically handle_cfi_failure()) to over-ride
regs->ip.
The callchain is such that the fixup needs to be done before it is
determined if the exception is fatal, as such, revert any changes in
that case.
Additionally, have handle_cfi_failure() remember the regs->ip value it
starts with for reporting.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.275223080@infradead.org
|
|
FineIBT will start using 0xEA as #UD. Normally '0xEA' is a 'bad',
invalid instruction for the CPU.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.166774696@infradead.org
|
|
When in the middle of a kernel source code file a kernel developer
sees a lone #else or #endif:
...
#else
...
It's not obvious at a glance what those preprocessor blocks are
conditional on, if the starting #ifdef is outside visible range.
So apply the standard pattern we use in such cases elsewhere in
the kernel for large preprocessor blocks:
#ifdef CONFIG_XXX
...
...
...
#endif /* CONFIG_XXX */
...
#ifdef CONFIG_XXX
...
...
...
#else /* !CONFIG_XXX: */
...
...
...
#endif /* !CONFIG_XXX */
( Note that in the #else case we use the /* !CONFIG_XXX */ marker
in the final #endif, not /* CONFIG_XXX */, which serves as an easy
visual marker to differentiate #else or #elif related #endif closures
from singular #ifdef/#endif blocks. )
Also clean up __CFI_DEFAULT definition with a bit more vertical alignment
applied, and a pointless tab converted to the standard space we use in
such definitions.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
For when we want to exactly match ENDBR, and not everything that we
can scribble it with.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124200.059556588@infradead.org
|
|
Rebuilding with CONFIG_CFI_PERMISSIVE=y enabled is such a pain, esp. since
clang is so slow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250224124159.924496481@infradead.org
|
|
Commit 8c87215dd3a2 ("ata: libahci_platform: support non-consecutive port
numbers") added a skip to ahci_platform_enable_phys() for ports that are
not in mask_port_map.
The code in ahci_platform_get_resources(), will currently set mask_port_map
for each child "port" node it finds in the device tree.
However, device trees that do not have any child "port" nodes will not have
mask_port_map set, and for non-device tree platforms mask_port_map will
only exist as a quirk for specific PCI device + vendor IDs, or as a kernel
module parameter, but will not be set by default.
Therefore, the common thing is that mask_port_map is only set if you do not
want to use all ports (as defined by Offset 0Ch: PI – Ports Implemented
register), but instead only want to use the ports in mask_port_map. If
mask_port_map is not set, all ports are available.
Thus, ahci_ignore_port() must be able to handle an empty mask_port_map.
Fixes: 8c87215dd3a2 ("ata: libahci_platform: support non-consecutive port numbers")
Fixes: 2c202e6c4f4d ("ata: libahci_platform: Do not set mask_port_map when not needed")
Fixes: c9b5be909e65 ("ahci: Introduce ahci_ignore_port() helper")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/linux-ide/10b31dd0-d0bb-4f76-9305-2195c3e17670@samsung.com/
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Co-developed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250225141612.942170-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
drivers/gpu/drm/imagination/ includes local headers with the double-quote
form (#include "...").
Hence, the header search path addition is unneeded.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250210102352.1517115-1-masahiroy@kernel.org
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
|
|
Process pending events on nested VM-Exit if the vCPU has an injectable IRQ
or NMI, as the event may have become pending while L2 was active, i.e. may
not be tracked in the context of vmcs01. E.g. if L1 has passed its APIC
through to L2 and an IRQ arrives while L2 is active, then KVM needs to
request an IRQ window prior to running L1, otherwise delivery of the IRQ
will be delayed until KVM happens to process events for some other reason.
The missed failure is detected by vmx_apic_passthrough_tpr_threshold_test
in KVM-Unit-Tests, but has effectively been masked due to a flaw in KVM's
PIC emulation that causes KVM to make spurious KVM_REQ_EVENT requests (and
apparently no one ever ran the test with split IRQ chips).
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250224235542.2562848-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Free vCPUs before freeing any VM state, as both SVM and VMX may access
VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
to be kicked out of nested guest mode.
Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
called") partially fixed the issue, but for unknown reasons only moved the
MMU unloading before VM destruction. Complete the change, and free all
vCPU state prior to destroying VM state, as nVMX accesses even more state
than nSVM.
In addition to the AVIC, KVM can hit a use-after-free on MSR filters:
kvm_msr_allowed+0x4c/0xd0
__kvm_set_msr+0x12d/0x1e0
kvm_set_msr+0x19/0x40
load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
vmx_free_vcpu+0x54/0xc0 [kvm_intel]
kvm_arch_vcpu_destroy+0x28/0xf0
kvm_vcpu_destroy+0x12/0x50
kvm_arch_destroy_vm+0x12c/0x1c0
kvm_put_kvm+0x263/0x3c0
kvm_vm_release+0x21/0x30
and an upcoming fix to process injectable interrupts on nested VM-Exit
will access the PIC:
BUG: kernel NULL pointer dereference, address: 0000000000000090
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
Call Trace:
<TASK>
kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
vmx_vcpu_free+0x2d/0x80 [kvm_intel]
kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
kvm_destroy_vcpus+0x8a/0x100 [kvm]
kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
kvm_destroy_vm+0x172/0x300 [kvm]
kvm_vcpu_release+0x31/0x50 [kvm]
Inarguably, both nSVM and nVMX need to be fixed, but punt on those
cleanups for the moment. Conceptually, vCPUs should be freed before VM
state. Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
are created, so it stands to reason that they must be freed _after_ vCPUs
are destroyed.
Reported-by: Aaron Lewis <aaronlewis@google.com>
Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
Cc: Jim Mattson <jmattson@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250224235542.2562848-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
nfsd_create_locked() doesn't need to explicitly call fh_update().
On success (which is the only time that fh_update() matters at all),
nfsd_create_setattr() will be called and it will call fh_update().
This extra call is not harmful, but is not necessary.
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250226062135.2043651-3-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
d_exact_alias() is a descendent of d_add_unique() which was introduced
20 years ago mostly likely to work around problems with NFS servers of
the time. It is now not used in several situations were it was
originally needed and there have been no reports of problems -
presumably the old NFS servers have been improved. This only place it
is now use is in NFSv4 code and the old problematic servers are thought
to have been v2/v3 only.
There is no clear benefit in reusing a unhashed() dentry which happens
to have the same name as the dentry we are adding.
So this patch removes d_exact_alias() and the one place that it is used.
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250226062135.2043651-2-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Recently a bug was discovered where the server had entered TCP_ESTABLISHED
state, but the upper layers were not notified.
The same 5-tuple packet may be processed by different CPUSs, so two
CPUs may receive different ack packets at the same time when the
state is TCP_NEW_SYN_RECV.
In that case, req->ts_recent in tcp_check_req may be changed concurrently,
which will probably cause the newsk's ts_recent to be incorrectly large.
So that tcp_validate_incoming will fail. At this point, newsk will not be
able to enter the TCP_ESTABLISHED.
cpu1 cpu2
tcp_check_req
tcp_check_req
req->ts_recent = rcv_tsval = t1
req->ts_recent = rcv_tsval = t2
syn_recv_sock
tcp_sk(child)->rx_opt.ts_recent = req->ts_recent = t2 // t1 < t2
tcp_child_process
tcp_rcv_state_process
tcp_validate_incoming
tcp_paws_check
if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win)
// t2 - t1 > paws_win, failed
tcp_v4_do_rcv
tcp_rcv_state_process
// TCP_ESTABLISHED
The cpu2's skb or a newly received skb will call tcp_v4_do_rcv to get
the newsk into the TCP_ESTABLISHED state, but at this point it is no
longer possible to notify the upper layer application. A notification
mechanism could be added here, but the fix is more complex, so the
current fix is used.
In tcp_check_req, req->ts_recent is used to assign a value to
tcp_sk(child)->rx_opt.ts_recent, so removing the change in req->ts_recent
and changing tcp_sk(child)->rx_opt.ts_recent directly after owning the
req fixes this bug.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Brian Foster <bfoster@redhat.com> says:
Here's phase 2 of the incremental iter advance conversions. This updates
all remaining iomap operations to advance the iter within the operation
and thus removes the need to advance from the core iomap iterator. Once
all operations are switched over, the core advance code is removed and
the processed field is renamed to reflect that it is now a pure status
code.
For context, this was first introduced in a previous series [1] that
focused mainly on the core mechanism and iomap buffered write. This is
because original impetus was to facilitate a folio batch mechanism where
a filesystem can optionally provide a batch of folios to process for a
given mapping (i.e. zero range of an unwritten mapping with dirty folios
in pagecache). That is still WIP, but the broader point is that this was
originally intended as an optional mode until consensus that fell out of
discussion was that it would be preferable to convert over everything.
This presumably facilitates some other future work and simplifies
semantics in the core iteration code.
Patches 1-3 convert over iomap buffered read, direct I/O and various
other remaining ops (swap, etc.). Patches 4-9 convert over the various
DAX iomap operations. Finally, patches 10-12 introduce some cleanups now
that all iomap operations have updated iteration semantics.
* patches from https://lore.kernel.org/r/20250224144757.237706-1-bfoster@redhat.com:
iomap: introduce a full map advance helper
iomap: rename iomap_iter processed field to status
iomap: remove unnecessary advance from iomap_iter()
dax: advance the iomap_iter on pte and pmd faults
dax: advance the iomap_iter on dedupe range
dax: advance the iomap_iter on unshare range
dax: advance the iomap_iter on zero range
dax: push advance down into dax_iomap_iter() for read and write
dax: advance the iomap_iter in the read/write path
iomap: convert misc simple ops to incremental advance
iomap: advance the iter on direct I/O
iomap: advance the iter directly on buffered read
Link: https://lore.kernel.org/r/20250224144757.237706-1-bfoster@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Various iomap_iter_advance() calls advance by the full mapping
length and thus have no need for the current length input or
post-advance remaining length output from the standard advance
function. Add an iomap_iter_advance_full() helper to clean up these
cases.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-13-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The iter.processed field name is no longer appropriate now that
iomap operations do not return the number of bytes processed. Rename
the field to iter.status to reflect that a success or error code is
expected.
Also change the type to int as there is no longer a need for an s64.
This reduces the size of iomap_iter by 8 bytes due to a combination
of smaller type and reduction in structure padding. While here, fix
up the return types of various _iter() helpers to reflect the type
change.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-12-bfoster@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
At this point, all iomap operations have been updated to advance the
iomap_iter directly before returning to iomap_iter(). Therefore, the
complexity of handling both the old and new semantics is no longer
required and can be removed from iomap_iter().
Update iomap_iter() to expect success or failure status in
iter.processed. As a precaution and developer hint to prevent
inadvertent use of old semantics, warn on a positive return code and
fail the operation. Remove the unnecessary advance and simplify the
termination logic.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-11-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Advance the iomap_iter on PTE and PMD faults. Each of these
operations assign a hardcoded size to iter.processed. Replace those
with an advance and status return.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-10-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Advance the iter on successful dedupe. Dedupe range uses two iters
and iterates so long as both have outstanding work, so
correspondingly this needs to advance both on each iteration. Since
dax_range_compare_iter() now returns status instead of a byte count,
update the variable name in the caller as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-9-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Advance the iter and return 0 or an error code for success or
failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-8-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Update the DAX zero range iomap iter handler to advance the iter
directly. Advance by the full length in the hole/unwritten case, or
otherwise advance incrementally in the zeroing loop. In either case,
return 0 or an error code for success or failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-7-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
DAX read and write currently advances the iter after the
dax_iomap_iter() returns the number of bytes processed rather than
internally within the iter handler itself, as most other iomap
operations do. Push the advance down into dax_iomap_iter() and
update the function to return op status instead of bytes processed.
dax_iomap_iter() shortcuts reads from a hole or unwritten mapping by
directly zeroing the iov_iter, so advance the iomap_iter similarly
in that case.
The DAX processing loop can operate on a range slightly different
than defined by the iomap_iter depending on circumstances. For
example, a read may be truncated by inode size, a read or write
range can be increased due to page alignment, etc. Therefore, this
patch aims to retain as much of the existing logic as possible.
The loop control logic remains pos based, but is sampled from the
iomap_iter on each iteration after the advance instead of being
updated manually. Similarly, length is updated based on the output
of the advance instead of being updated manually. The advance itself
is based on the number of bytes transferred, which was previously
used to update the local copies of pos and length.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-6-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
DAX reads and writes flow through dax_iomap_iter(), which has one or
more subtleties in terms of how it processes a range vs. what is
specified in the iomap_iter. To keep things simple and remove the
dependency on iomap_iter() advances, convert a positive return from
dax_iomap_iter() to the new advance and status return semantics. The
advance can be pushed further down in future patches.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-5-bfoster@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Update several of the remaining iomap operations to advance the iter
directly rather than via return value. This includes page faults,
fiemap, seek data/hole and swapfile activation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-4-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Update iomap direct I/O to advance the iter directly rather than via
iter.processed. Update each mapping type helper to advance based on
the amount of data processed and return success or failure.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-3-bfoster@redhat.com
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap buffered read advances the iter via iter.processed. To
continue separating iter advance from return status, update
iomap_readpage_iter() to advance the iter instead of returning the
number of bytes processed. In turn, drop the offset parameter and
sample the updated iter->pos at the start of the function. Update
the callers to loop based on remaining length in the current
iteration instead of number of bytes processed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20250224144757.237706-2-bfoster@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|