Age | Commit message (Collapse) | Author |
|
Protected mode assumes that at minimum vgic-v3 is present, however KVM
fails to actually enforce this at the time of initialization. As such,
when running protected mode in a half-baked state on GICv2 hardware we
see the hyp go belly up at vcpu_load() when it tries to restore the
vgic-v3 cpuif:
$ ./arch_timer_edge_cases
[ 130.599140] kvm [4518]: nVHE hyp panic at: [<ffff800081102b58>] __kvm_nvhe___vgic_v3_restore_vmcr_aprs+0x8/0x84!
[ 130.603685] kvm [4518]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE
[ 130.611962] kvm [4518]: Hyp Offset: 0xfffeca95ed000000
[ 130.617053] Kernel panic - not syncing: HYP panic:
[ 130.617053] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000
[ 130.617053] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000
[ 130.617053] VCPU:0000000000000000
[ 130.638013] CPU: 0 UID: 0 PID: 4518 Comm: arch_timer_edge Tainted: G C 6.13.0-rc3-00009-gf7d03fcbf1f4 #1
[ 130.648790] Tainted: [C]=CRAP
[ 130.651721] Hardware name: Libre Computer AML-S905X-CC (DT)
[ 130.657242] Call trace:
[ 130.659656] show_stack+0x18/0x24 (C)
[ 130.663279] dump_stack_lvl+0x38/0x90
[ 130.666900] dump_stack+0x18/0x24
[ 130.670178] panic+0x388/0x3e8
[ 130.673196] nvhe_hyp_panic_handler+0x104/0x208
[ 130.677681] kvm_arch_vcpu_load+0x290/0x548
[ 130.681821] vcpu_load+0x50/0x80
[ 130.685013] kvm_arch_vcpu_ioctl_run+0x30/0x868
[ 130.689498] kvm_vcpu_ioctl+0x2e0/0x974
[ 130.693293] __arm64_sys_ioctl+0xb4/0xec
[ 130.697174] invoke_syscall+0x48/0x110
[ 130.700883] el0_svc_common.constprop.0+0x40/0xe0
[ 130.705540] do_el0_svc+0x1c/0x28
[ 130.708818] el0_svc+0x30/0xd0
[ 130.711837] el0t_64_sync_handler+0x10c/0x138
[ 130.716149] el0t_64_sync+0x198/0x19c
[ 130.719774] SMP: stopping secondary CPUs
[ 130.723660] Kernel Offset: disabled
[ 130.727103] CPU features: 0x000,00000800,02800000,0200421b
[ 130.732537] Memory Limit: none
[ 130.735561] ---[ end Kernel panic - not syncing: HYP panic:
[ 130.735561] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000
[ 130.735561] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000
[ 130.735561] VCPU:0000000000000000 ]---
Fix it by failing KVM initialization if the system doesn't implement
vgic-v3, as protected mode will never do anything useful on such
hardware.
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/kvmarm/5ca7588c-7bf2-4352-8661-e4a56a9cd9aa@sirena.org.uk/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250203231543.233511-1-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The documentation previously listed the path to download In Field Scan
(IFS) test images as "TBD".
Update the documentation to include the correct image download
location. Also move the download link to the appropriate section within
the documentation.
Reported-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20250131205315.1585663-1-jithu.joseph@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
By default, the UFS controller allocates a fixed number of RX and TX
engines statically. Consequently, when UFS reads are in progress, the TX
ICE engines remain idle, and vice versa. This leads to inefficient
utilization of RX and TX engines.
To address this limitation, enable the UFS shared ICE feature for Qualcomm
UFS V5.0 and above. This feature utilizes a pool of crypto cores for both
TX streams (UFS Write – Encryption) and RX streams (UFS Read –
Decryption). With this approach, crypto cores are dynamically allocated to
either the RX or TX stream as needed.
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Co-developed-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Signed-off-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com>
Link: https://lore.kernel.org/r/20250203112739.11425-1-quic_rdwivedi@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Justin Tee <justintee8345@gmail.com> says:
Update lpfc to revision 14.4.0.8
This patch set contains fixes related to diagnostic logging, smatch, and
ndlp ptr referencing issues.
The patches were cut against Martin's 6.14/scsi-queue tree.
Link: https://lore.kernel.org/r/20250131000524.163662-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update copyrights to 2025 for files modified in the 14.4.0.8 patch set.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update lpfc version to 14.4.0.8
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
After a port swap between separate fabrics, there may be multiple nodes in
the vport's fc_nodes list with the same fabric well known address.
Duplication is temporary and eventually resolves itself after dev_loss_tmo
expires, but nameserver queries may still occur before dev_loss_tmo. This
possibly results in returning stale fabric ndlp objects. Fix by adding an
nlp_state check to ensure the ndlp search routine returns the correct newer
allocated ndlp fabric object.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
With repeated port swaps between separate fabrics, there can be multiple
registrations for fabric well known address 0xfffffe. This can cause ndlp
reference confusion due to the usage of a single ndlp ptr that stores the
rport object in fc_rport struct private storage during transport
registration. Subsequent registrations update the ndlp->rport field with
the newer rport, so when transport layer triggers dev_loss_tmo for the
earlier registered rport the ndlp->rport private storage is referencing the
newer rport instead of the older rport in dev_loss_tmo callbk.
Because the older ndlp->rport object is already cleaned up elsewhere in
driver code during the time of fabric swap, check that the rport provided
in dev_loss_tmo callbk actually matches the rport stored in the LLDD's
ndlp->rport field. Otherwise, skip dev_loss_tmo work on a stale rport.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fix smatch warning regarding missed calls to free_irq(). Free the phba IRQ
in the failed pci_irq_vector cases.
lpfc_init.c: lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' from
request_irq() not released.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A clean up log message is output from lpfc_els_flush_cmd() for each
outstanding ELS I/O and repeated for every NPIV instance. The log message
should only be generated for active I/Os matching the NPIV vport. Thus,
move the vport check to before logging the message.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Ranjan Kumar <ranjan.kumar@broadcom.com> says:
Few Enhancements and minor fixes of mpi3mr driver.
Link: https://lore.kernel.org/r/20250129100850.25430-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update driver version to 8.12.1.0.50
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250129100850.25430-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When the task management thread processes reply queues while the reset
thread resets them, the task management thread accesses an invalid queue ID
(0xFFFF), set by the reset thread, which points to unallocated memory,
causing a crash.
Add flag 'io_admin_reset_sync' to synchronize access between the reset,
I/O, and admin threads. Before a reset, the reset handler sets this flag to
block I/O and admin processing threads. If any thread bypasses the initial
check, the reset thread waits up to 10 seconds for processing to finish. If
the wait exceeds 10 seconds, the controller is marked as unrecoverable.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250129100850.25430-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Allocate segmented trace buffer if firmware advertises the capability in
IOCfacts.
Upon driver load, read the trace buffer size from driver page 1, calculate
the required segments for trace buffer, and allocate segmented buffers.
Each segment is 4096 bytes in size.
While posting driver diagnostic buffer to firmware, advertise that trace
buffer is segmented.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250129100850.25430-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
To avoid reply queue full condition, update the driver to check IOCFacts
capabilities for qfull.
Update the operational reply queue's Consumer Index after processing 100
replies. If pending I/Os on a reply queue exceeds a threshold
(reply_queue_depth - 200), then return I/O back to OS to retry.
Also increase default admin reply queue size to 2K.
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250129100850.25430-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
After the blamed commits below, some UDP tunnel use dstats for
accounting. On the xmit path, all the UDP-base tunnels ends up
using iptunnel_xmit_stats() for stats accounting, and the latter
assumes the relevant (tunnel) network device uses tstats.
The end result is some 'funny' stat report for the mentioned UDP
tunnel, e.g. when no packet is actually dropped and a bunch of
packets are transmitted:
gnv2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue \
state UNKNOWN mode DEFAULT group default qlen 1000
link/ether ee:7d:09:87:90:ea brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped missed mcast
14916 23 0 15 0 0
TX: bytes packets errors dropped carrier collsns
0 1566 0 0 0 0
Address the issue ensuring the same binary layout for the overlapping
fields of dstats and tstats. While this solution is a bit hackish, is
smaller and with no performance pitfall compared to other alternatives
i.e. supporting both dstat and tstat in iptunnel_xmit_stats() or
reverting the blamed commit.
With time we should possibly move all the IP-based tunnel (and virtual
devices) to dstats.
Fixes: c77200c07491 ("bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: 6fa6de302246 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: be226352e8dc ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/2e1c444cf0f63ae472baff29862c4c869be17031.1738432804.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
ethtool: rss: minor fixes for recent RSS changes
Make sure RSS_GET messages are consistent in do and dump.
Fix up a recently added safety check for RSS + queue offset.
Adjust related tests so that they pass on devices which
don't support RSS + queue offset.
====================
Link: https://patch.msgid.link/20250201013040.725123-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
supported
Vast majority of drivers does not support queue offset.
Simply return if the rss context + queue ntuple fails.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit under Fixes adds ntuple rules but never deletes them.
Fixes: 29a4bc1fe961 ("selftest: extend test_rss_context_queue_reconfigure for action addition")
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The info.flow_type is for RXFH commands, ntuple flow_type is inside
the flow spec. The check currently does nothing, as info.flow_type
is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS.
Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit ec6e57beaf8b ("ethtool: rss: don't report key if device
doesn't support it") intended to stop reporting key fields for
additional rss contexts if device has a global hashing key.
Later we added dump support and the filtering wasn't properly
added there. So we end up reporting the key fields in dumps
but not in dos:
# ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \
--json '{"header": {"dev-index":2}, "context": 1 }'
{
"header": { ... },
"context": 1,
"indir": [0, 1, 2, 3, ...]]
}
# ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get
[
... snip context 0 ...
{ "header": { ... },
"context": 1,
"indir": [0, 1, 2, 3, ...],
-> "input_xfrm": 255,
-> "hfunc": 1,
-> "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}
]
Hide these fields correctly.
The drivers/net/hw/rss_ctx.py selftest catches this when run on
a device with single key, already:
# Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump:
# Check| ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero")
# Check failed {0} == {0} key is all zero
not ok 8 rss_ctx.test_rss_context_dump
Fixes: f6122900f4e2 ("ethtool: rss: support dumping RSS contexts")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
kthread_affine_preferred() incorrectly returns 0 instead of -ENOMEM
when kzalloc() fails. Return 'ret' to ensure the correct error code is
propagated.
Fixes: 4d13f4304fa4 ("kthread: Implement preferred affinity")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501301528.t0cZVbnq-lkp@intel.com/
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
Remove the cxlflash driver for IBM CAPI Flash devices.
The cxlflash driver has received minimal maintenance for some time, and
the CAPI Flash hardware that uses it is no longer commercially available.
Thanks to Uma Krishnan, Matthew Ochs and Manoj Kumar for their work on
this driver over the years.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Link: https://lore.kernel.org/r/20250203072801.365551-2-ajd@linux.ibm.com
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
mpt3sas_config_get_manufacturing_pg7() and
mpt3sas_config_get_sas_device_pg1() were added as part of 2012's
commit f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
but haven't been used.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250127002851.113711-1-linux@treblig.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
mptscsih_target_reset() was added in 2023 by commit e6629081fb12 ("scsi:
message: fusion: Correct definitions for mptscsih_dev_reset()") but never
used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250127002716.113641-1-linux@treblig.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
mvs_phys_reset() was added in 2009's commit 20b09c2992fe ("[SCSI] mvsas:
add support for 94xx; layout change; bug fixes") but hasn't been used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250127002601.113555-1-linux@treblig.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A null dereference or oops exception will eventually occur when qla1280.c
driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I
think its clear from the code that the intention here is sg_dma_len(s) not
length of sg_next(s) when printing the debug info.
Signed-off-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Remove the repeated word "for" in comments.
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Link: https://lore.kernel.org/r/20250124081330.210724-1-hanchunchao@inspur.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Kai Mäkisara <Kai.Makisara@kolumbus.fi> says:
The first patch re-applies after device reset some settings changed
by the user (partition, density, block size).
The second and third patch address the case where more than one ULD
access the same device. The Unit Attention (UA) sense data is sent only
to one ULD and the others miss it. The st driver needs to find out
if device reset or media change has happened.
The second patch adds counters for New Media and Power On/Reset (POR)
Unit Attentions to the scsi_device struct. The third one changes st
so that these are used: if the value in the scsi_device struct does
not match the one stored locally, the corresponding UA has happened.
Use of the was_reset flag has been removed.
The fourth patch adds a file to sysfs to tell the user if reads/writes
to a tape are blocked following a device reset.
Link: https://lore.kernel.org/r/20250120194925.44432-1-Kai.Makisara@kolumbus.fi
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If the value read from the file is 1, reads and writes from/to the device
are blocked because the tape position may not match user's expectation
(tape rewound after device reset).
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250201151106.25529-1-Kai.Makisara@kolumbus.fi
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Compare the stored values of por_ctr and new_media_ctr against the values
in the device struct. In case of mismatch, the Unit Attention corresponding
to the counter has happened. This is a safeguard against another ULD
catching the Unit Attention sense data.
Macros scsi_get_ua_new_media_ctr and scsi_get_ua_por_ctr are added to read
the current values of the counters.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250120194925.44432-4-Kai.Makisara@kolumbus.fi
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The purpose of the counters is to enable all ULDs attached to a device to
find out that a New Media or/and Power On/Reset Unit Attentions has/have
been set, even if another ULD catches the Unit Attention as response to a
SCSI command.
The ULDs can read the counters and see if the values have changed from the
previous check.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250120194925.44432-3-Kai.Makisara@kolumbus.fi
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Some of the allowed operations put the tape into a known position to
continue operation assuming only the tape position has changed. But reset
sets partition, density and block size to drive default values. These
should be restored to the values before reset.
Normally the current block size and density are stored by the drive. If
the settings have been changed, the changed values have to be saved by the
driver across reset.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250120194925.44432-2-Kai.Makisara@kolumbus.fi
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
'struct pci_error_handlers' are not modified in these drivers.
Constifying these structures moves some data to a read-only section, so
increase overall security, especially when the structure holds some
function pointers.
On a x86_64, with allmodconfig, as an example:
Before:
======
text data bss dec hex filename
39049 6429 112 45590 b216 drivers/scsi/aacraid/linit.o
After:
=====
text data bss dec hex filename
39113 6365 112 45590 b216 drivers/scsi/aacraid/linit.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/efdec8425981e10fc398fa2ac599c9c45d930561.1737318548.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is currently no mechanism to return error from query responses.
Return the error and print the corresponding error message with it.
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20250118023808.24726-1-sh043.lee@samsung.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In StorVSC, payload->range.len is used to indicate if this SCSI command
carries payload. This data is allocated as part of the private driver data
by the upper layer and may get passed to lower driver uninitialized.
For example, the SCSI error handling mid layer may send TEST_UNIT_READY or
REQUEST_SENSE while reusing the buffer from a failed command. The private
data section may have stale data from the previous command.
If the SCSI command doesn't carry payload, the driver may use this value as
is for communicating with host, resulting in possible corruption.
Fix this by always initializing this value.
Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host")
Cc: stable@kernel.org
Tested-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
devm_blk_crypto_profile_init() registers a cleanup handler to run when
the associated (platform-) device is being released. For UFS, the
crypto private data and pointers are stored as part of the ufs_hba's
data structure 'struct ufs_hba::crypto_profile'. This structure is
allocated as part of the underlying ufshcd and therefore Scsi_host
allocation.
During driver release or during error handling in ufshcd_pltfrm_init(),
this structure is released as part of ufshcd_dealloc_host() before the
(platform-) device associated with the crypto call above is released.
Once this device is released, the crypto cleanup code will run, using
the just-released 'struct ufs_hba::crypto_profile'. This causes a
use-after-free situation:
Call trace:
kfree+0x60/0x2d8 (P)
kvfree+0x44/0x60
blk_crypto_profile_destroy_callback+0x28/0x70
devm_action_release+0x1c/0x30
release_nodes+0x6c/0x108
devres_release_all+0x98/0x100
device_unbind_cleanup+0x20/0x70
really_probe+0x218/0x2d0
In other words, the initialisation code flow is:
platform-device probe
ufshcd_pltfrm_init()
ufshcd_alloc_host()
scsi_host_alloc()
allocation of struct ufs_hba
creation of scsi-host devices
devm_blk_crypto_profile_init()
devm registration of cleanup handler using platform-device
and during error handling of ufshcd_pltfrm_init() or during driver
removal:
ufshcd_dealloc_host()
scsi_host_put()
put_device(scsi-host)
release of struct ufs_hba
put_device(platform-device)
crypto cleanup handler
To fix this use-after free, change ufshcd_alloc_host() to register a
devres action to automatically cleanup the underlying SCSI device on
ufshcd destruction, without requiring explicit calls to
ufshcd_dealloc_host(). This way:
* the crypto profile and all other ufs_hba-owned resources are
destroyed before SCSI (as they've been registered after)
* a memleak is plugged in tc-dwc-g210-pci.c remove() as a
side-effect
* EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as
it's not needed anymore
* no future drivers using ufshcd_alloc_host() could ever forget
adding the cleanup
Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile")
Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()")
Cc: stable@vger.kernel.org
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250124-ufshcd-fix-v4-1-c5d0144aae59@linaro.org
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fail I/Os instead of retry to prevent user space processes from being
blocked on the I/O completion for several minutes.
Retrying I/Os during "depopulation in progress" or "depopulation restore in
progress" results in a continuous retry loop until the depopulation
completes or until the I/O retry loop is aborted due to a timeout by the
scsi_cmd_runtime_exceeced().
Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs.
Most I/Os in the depopulation retry loop end up taking several minutes
before returning the failure to user space.
Cc: stable@vger.kernel.org # 4.18.x: 2bbeb8d scsi: core: Handle depopulation and restoration in progress
Cc: stable@vger.kernel.org # 4.18.x
Fixes: e37c7d9a0341 ("scsi: core: sanitize++ in progress")
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Filesystems can write to disk from page reclaim with __GFP_FS
set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in
page reclaim with GFP_KERNEL, where it could try to take filesystem
locks again, leading to a deadlock.
WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0
but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760
The full lockdep splat can be found in Marc's report:
https://lkml.org/lkml/2025/1/24/1101
Avoid the potential deadlock by doing the allocation with GFP_NOIO, which
prevents both filesystem and block layer recursion.
Reported-by: Marc Aurèle La France <tsi@tuyoix.net>
Signed-off-by: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This commit addresses an issue where clk_gating.state is being toggled in
ufshcd_setup_clocks() even if clock gating is not allowed.
The fix is to add a check for hba->clk_gating.is_initialized before toggling
clk_gating.state in ufshcd_setup_clocks().
Since clk_gating.lock is now initialized unconditionally, it can no longer
lead to the spinlock being used before it is properly initialized, but
instead it is mostly for documentation purposes.
Fixes: 1ab27c9cf8b6 ("ufs: Add support for clock gating")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20250128071207.75494-3-avri.altman@wdc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Address a lockdep warning triggered by the use of the clk_gating.lock before
it is properly initialized. The warning is as follows:
[ 4.388838] INFO: trying to register non-static key.
[ 4.395673] The code is fine but needs lockdep annotation, or maybe
[ 4.402118] you didn't initialize this object before use?
[ 4.407673] turning off the locking correctness validator.
[ 4.413334] CPU: 5 UID: 0 PID: 58 Comm: kworker/u32:1 Not tainted 6.12-rc1 #185
[ 4.413343] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[ 4.413362] Call trace:
[ 4.413364] show_stack+0x18/0x24 (C)
[ 4.413374] dump_stack_lvl+0x90/0xd0
[ 4.413384] dump_stack+0x18/0x24
[ 4.413392] register_lock_class+0x498/0x4a8
[ 4.413400] __lock_acquire+0xb4/0x1b90
[ 4.413406] lock_acquire+0x114/0x310
[ 4.413413] _raw_spin_lock_irqsave+0x60/0x88
[ 4.413423] ufshcd_setup_clocks+0x2c0/0x490
[ 4.413433] ufshcd_init+0x198/0x10ec
[ 4.413437] ufshcd_pltfrm_init+0x600/0x7c0
[ 4.413444] ufs_qcom_probe+0x20/0x58
[ 4.413449] platform_probe+0x68/0xd8
[ 4.413459] really_probe+0xbc/0x268
[ 4.413466] __driver_probe_device+0x78/0x12c
[ 4.413473] driver_probe_device+0x40/0x11c
[ 4.413481] __device_attach_driver+0xb8/0xf8
[ 4.413489] bus_for_each_drv+0x84/0xe4
[ 4.413495] __device_attach+0xfc/0x18c
[ 4.413502] device_initial_probe+0x14/0x20
[ 4.413510] bus_probe_device+0xb0/0xb4
[ 4.413517] deferred_probe_work_func+0x8c/0xc8
[ 4.413524] process_scheduled_works+0x250/0x658
[ 4.413534] worker_thread+0x15c/0x2c8
[ 4.413542] kthread+0x134/0x200
[ 4.413550] ret_from_fork+0x10/0x20
To fix this issue, ensure that the spinlock is only used after it has been
properly initialized before using it in ufshcd_setup_clocks(). Do that
unconditionally as initializing a spinlock is a fast operation.
Fixes: 209f4e43b806 ("scsi: ufs: core: Introduce a new clock_gating lock")
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20250128071207.75494-2-avri.altman@wdc.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull d_revalidate fix from Al Viro:
"Fix a braino in d_revalidate series: check ->d_op for NULL"
* tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix braino in "9p: fix ->rename_sem exclusion"
|
|
Pull outstanding fixes bound for this release into 6.14/scsi-fixes.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Jakub Kicinski says:
====================
MAINTAINERS: recognize Kuniyuki Iwashima as a maintainer
Kuniyuki Iwashima has been a prolific contributor and trusted reviewer
for some core portions of the networking stack for a couple of years now.
Formalize some obvious areas of his expertise and list him as a maintainer.
====================
Link: https://patch.msgid.link/20250202014728.1005003-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a MAINTAINERS entry for UNIX socket, Kuniyuki has been
the de-facto maintainer of this code for a while.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create a MAINTAINERS entry for BSD sockets. List the top 3
reviewers as maintainers. The entry is meant to cover core
socket code (of which there isn't much) but also reviews
of any new socket families.
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
List Kuniyuki as an official TCP reviewer.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Submissions to the docs seem to not get properly CCed.
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20250202005024.964262-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
ice: fix Rx data path for heavy 9k MTU traffic
Maciej Fijalkowski says:
This patchset fixes a pretty nasty issue that was reported by RedHat
folks which occurred after ~30 minutes (this value varied, just trying
here to state that it was not observed immediately but rather after a
considerable longer amount of time) when ice driver was tortured with
jumbo frames via mix of iperf traffic executed simultaneously with
wrk/nginx on client/server sides (HTTP and TCP workloads basically).
The reported splats were spanning across all the bad things that can
happen to the state of page - refcount underflow, use-after-free, etc.
One of these looked as follows:
[ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0
[ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0
[ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
[ 2084.057468] page dumped because: nonzero _refcount
[ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd
[ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1
[ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023
[ 2084.154604] Call Trace:
[ 2084.157058] <IRQ>
[ 2084.159080] dump_stack_lvl+0x34/0x48
[ 2084.162752] bad_page.cold+0x63/0x94
[ 2084.166333] check_new_pages+0xb3/0xe0
[ 2084.170083] rmqueue_bulk+0x2d2/0x9e0
[ 2084.173749] ? ktime_get+0x35/0xa0
[ 2084.177159] rmqueue_pcplist+0x13b/0x210
[ 2084.181081] rmqueue+0x7d3/0xd40
[ 2084.184316] ? xas_load+0x9/0xa0
[ 2084.187547] ? xas_find+0x183/0x1d0
[ 2084.191041] ? xa_find_after+0xd0/0x130
[ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0
[ 2084.199759] get_page_from_freelist+0x11f/0x530
[ 2084.204291] __alloc_pages+0xf2/0x250
[ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
[ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice]
[ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice]
[ 2084.221330] __napi_poll+0x27/0x170
[ 2084.224824] net_rx_action+0x233/0x2f0
[ 2084.228575] __do_softirq+0xc7/0x2ac
[ 2084.232155] __irq_exit_rcu+0xa1/0xc0
[ 2084.235821] common_interrupt+0x80/0xa0
[ 2084.239662] </IRQ>
[ 2084.241768] <TASK>
The fix is mostly about reverting what was done in commit 1dc1a7e7f410
("ice: Centrallize Rx buffer recycling") followed by proper timing on
page_count() storage and then removing the ice_rx_buf::act related logic
(which was mostly introduced for purposes from cited commit).
Special thanks to Xu Du for providing reproducer and Jacob Keller for
initial extensive analysis.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: stop storing XDP verdict within ice_rx_buf
ice: gather page_count()'s of each frag right before XDP prog call
ice: put Rx buffers after being done with current frame
====================
Link: https://patch.msgid.link/20250131185415.3741532-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
->d_op can bloody well be NULL
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 30d61efe118c "9p: fix ->rename_sem exclusion"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|