summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-04KVM: arm64: Fail protected mode init if no vgic hardware is presentOliver Upton
Protected mode assumes that at minimum vgic-v3 is present, however KVM fails to actually enforce this at the time of initialization. As such, when running protected mode in a half-baked state on GICv2 hardware we see the hyp go belly up at vcpu_load() when it tries to restore the vgic-v3 cpuif: $ ./arch_timer_edge_cases [ 130.599140] kvm [4518]: nVHE hyp panic at: [<ffff800081102b58>] __kvm_nvhe___vgic_v3_restore_vmcr_aprs+0x8/0x84! [ 130.603685] kvm [4518]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE [ 130.611962] kvm [4518]: Hyp Offset: 0xfffeca95ed000000 [ 130.617053] Kernel panic - not syncing: HYP panic: [ 130.617053] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000 [ 130.617053] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000 [ 130.617053] VCPU:0000000000000000 [ 130.638013] CPU: 0 UID: 0 PID: 4518 Comm: arch_timer_edge Tainted: G C 6.13.0-rc3-00009-gf7d03fcbf1f4 #1 [ 130.648790] Tainted: [C]=CRAP [ 130.651721] Hardware name: Libre Computer AML-S905X-CC (DT) [ 130.657242] Call trace: [ 130.659656] show_stack+0x18/0x24 (C) [ 130.663279] dump_stack_lvl+0x38/0x90 [ 130.666900] dump_stack+0x18/0x24 [ 130.670178] panic+0x388/0x3e8 [ 130.673196] nvhe_hyp_panic_handler+0x104/0x208 [ 130.677681] kvm_arch_vcpu_load+0x290/0x548 [ 130.681821] vcpu_load+0x50/0x80 [ 130.685013] kvm_arch_vcpu_ioctl_run+0x30/0x868 [ 130.689498] kvm_vcpu_ioctl+0x2e0/0x974 [ 130.693293] __arm64_sys_ioctl+0xb4/0xec [ 130.697174] invoke_syscall+0x48/0x110 [ 130.700883] el0_svc_common.constprop.0+0x40/0xe0 [ 130.705540] do_el0_svc+0x1c/0x28 [ 130.708818] el0_svc+0x30/0xd0 [ 130.711837] el0t_64_sync_handler+0x10c/0x138 [ 130.716149] el0t_64_sync+0x198/0x19c [ 130.719774] SMP: stopping secondary CPUs [ 130.723660] Kernel Offset: disabled [ 130.727103] CPU features: 0x000,00000800,02800000,0200421b [ 130.732537] Memory Limit: none [ 130.735561] ---[ end Kernel panic - not syncing: HYP panic: [ 130.735561] PS:800003c9 PC:0000b56a94102b58 ESR:0000000002000000 [ 130.735561] FAR:ffff00007b98d4d0 HPFAR:00000000007b98d0 PAR:0000000000000000 [ 130.735561] VCPU:0000000000000000 ]--- Fix it by failing KVM initialization if the system doesn't implement vgic-v3, as protected mode will never do anything useful on such hardware. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/kvmarm/5ca7588c-7bf2-4352-8661-e4a56a9cd9aa@sirena.org.uk/ Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250203231543.233511-1-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-02-04platform/x86/intel/ifs: Update documentation with image download pathJithu Joseph
The documentation previously listed the path to download In Field Scan (IFS) test images as "TBD". Update the documentation to include the correct image download location. Also move the download link to the appropriate section within the documentation. Reported-by: Anisse Astier <anisse@astier.eu> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Link: https://lore.kernel.org/r/20250131205315.1585663-1-jithu.joseph@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-02-03scsi: ufs: qcom: Enable UFS Shared ICE FeatureRam Kumar Dwivedi
By default, the UFS controller allocates a fixed number of RX and TX engines statically. Consequently, when UFS reads are in progress, the TX ICE engines remain idle, and vice versa. This leads to inefficient utilization of RX and TX engines. To address this limitation, enable the UFS shared ICE feature for Qualcomm UFS V5.0 and above. This feature utilizes a pool of crypto cores for both TX streams (UFS Write – Encryption) and RX streams (UFS Read – Decryption). With this approach, crypto cores are dynamically allocated to either the RX or TX stream as needed. Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com> Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com> Co-developed-by: Nitin Rawat <quic_nitirawa@quicinc.com> Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com> Signed-off-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com> Link: https://lore.kernel.org/r/20250203112739.11425-1-quic_rdwivedi@quicinc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge patch series "Update lpfc to revision 14.4.0.8"Martin K. Petersen
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.4.0.8 This patch set contains fixes related to diagnostic logging, smatch, and ndlp ptr referencing issues. The patches were cut against Martin's 6.14/scsi-queue tree. Link: https://lore.kernel.org/r/20250131000524.163662-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Copyright updates for 14.4.0.8 patchesJustin Tee
Update copyrights to 2025 for files modified in the 14.4.0.8 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Update lpfc version to 14.4.0.8Justin Tee
Update lpfc version to 14.4.0.8 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routineJustin Tee
After a port swap between separate fabrics, there may be multiple nodes in the vport's fc_nodes list with the same fabric well known address. Duplication is temporary and eventually resolves itself after dev_loss_tmo expires, but nameserver queries may still occur before dev_loss_tmo. This possibly results in returning stale fabric ndlp objects. Fix by adding an nlp_state check to ensure the ndlp search routine returns the correct newer allocated ndlp fabric object. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Ignore ndlp rport mismatch in dev_loss_tmo callbkJustin Tee
With repeated port swaps between separate fabrics, there can be multiple registrations for fabric well known address 0xfffffe. This can cause ndlp reference confusion due to the usage of a single ndlp ptr that stores the rport object in fc_rport struct private storage during transport registration. Subsequent registrations update the ndlp->rport field with the newer rport, so when transport layer triggers dev_loss_tmo for the earlier registered rport the ndlp->rport private storage is referencing the newer rport instead of the older rport in dev_loss_tmo callbk. Because the older ndlp->rport object is already cleaned up elsewhere in driver code during the time of fabric swap, check that the rport provided in dev_loss_tmo callbk actually matches the rport stored in the LLDD's ndlp->rport field. Otherwise, skip dev_loss_tmo work on a stale rport. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Free phba irq in lpfc_sli4_enable_msi() when pci_irq_vector() failsJustin Tee
Fix smatch warning regarding missed calls to free_irq(). Free the phba IRQ in the failed pci_irq_vector cases. lpfc_init.c: lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' from request_irq() not released. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: lpfc: Reduce log message generation during ELS ring clean upJustin Tee
A clean up log message is output from lpfc_els_flush_cmd() for each outstanding ELS I/O and repeated for every NPIV instance. The log message should only be generated for active I/Os matching the NPIV vport. Thus, move the vport check to before logging the message. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20250131000524.163662-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge patch series "mpi3mr: Few Enhancements and minor fixes"Martin K. Petersen
Ranjan Kumar <ranjan.kumar@broadcom.com> says: Few Enhancements and minor fixes of mpi3mr driver. Link: https://lore.kernel.org/r/20250129100850.25430-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mpi3mr: Update driver version to 8.12.1.0.50Ranjan Kumar
Update driver version to 8.12.1.0.50 Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mpi3mr: Synchronous access b/w reset and tm thread for reply queueRanjan Kumar
When the task management thread processes reply queues while the reset thread resets them, the task management thread accesses an invalid queue ID (0xFFFF), set by the reset thread, which points to unallocated memory, causing a crash. Add flag 'io_admin_reset_sync' to synchronize access between the reset, I/O, and admin threads. Before a reset, the reset handler sets this flag to block I/O and admin processing threads. If any thread bypasses the initial check, the reset thread waits up to 10 seconds for processing to finish. If the wait exceeds 10 seconds, the controller is marked as unrecoverable. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mpi3mr: Support for Segmented Hardware Trace bufferRanjan Kumar
Allocate segmented trace buffer if firmware advertises the capability in IOCfacts. Upon driver load, read the trace buffer size from driver page 1, calculate the required segments for trace buffer, and allocate segmented buffers. Each segment is 4096 bytes in size. While posting driver diagnostic buffer to firmware, advertise that trace buffer is segmented. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mpi3mr: Avoid reply queue full conditionRanjan Kumar
To avoid reply queue full condition, update the driver to check IOCFacts capabilities for qfull. Update the operational reply queue's Consumer Index after processing 100 replies. If pending I/Os on a reply queue exceeds a threshold (reply_queue_depth - 200), then return I/O back to OS to retry. Also increase default admin reply queue size to 2K. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03net: harmonize tstats and dstatsPaolo Abeni
After the blamed commits below, some UDP tunnel use dstats for accounting. On the xmit path, all the UDP-base tunnels ends up using iptunnel_xmit_stats() for stats accounting, and the latter assumes the relevant (tunnel) network device uses tstats. The end result is some 'funny' stat report for the mentioned UDP tunnel, e.g. when no packet is actually dropped and a bunch of packets are transmitted: gnv2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue \ state UNKNOWN mode DEFAULT group default qlen 1000 link/ether ee:7d:09:87:90:ea brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 14916 23 0 15 0 0 TX: bytes packets errors dropped carrier collsns 0 1566 0 0 0 0 Address the issue ensuring the same binary layout for the overlapping fields of dstats and tstats. While this solution is a bit hackish, is smaller and with no performance pitfall compared to other alternatives i.e. supporting both dstat and tstat in iptunnel_xmit_stats() or reverting the blamed commit. With time we should possibly move all the IP-based tunnel (and virtual devices) to dstats. Fixes: c77200c07491 ("bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Fixes: 6fa6de302246 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Fixes: be226352e8dc ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/2e1c444cf0f63ae472baff29862c4c869be17031.1738432804.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03Merge branch 'ethtool-rss-minor-fixes-for-recent-rss-changes'Jakub Kicinski
Jakub Kicinski says: ==================== ethtool: rss: minor fixes for recent RSS changes Make sure RSS_GET messages are consistent in do and dump. Fix up a recently added safety check for RSS + queue offset. Adjust related tests so that they pass on devices which don't support RSS + queue offset. ==================== Link: https://patch.msgid.link/20250201013040.725123-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not ↵Jakub Kicinski
supported Vast majority of drivers does not support queue offset. Simply return if the rss context + queue ntuple fails. Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigureJakub Kicinski
Commit under Fixes adds ntuple rules but never deletes them. Fixes: 29a4bc1fe961 ("selftest: extend test_rss_context_queue_reconfigure for action addition") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03ethtool: ntuple: fix rss + ring_cookie checkJakub Kicinski
The info.flow_type is for RXFH commands, ntuple flow_type is inside the flow spec. The check currently does nothing, as info.flow_type is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS. Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in") Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03ethtool: rss: fix hiding unsupported fields in dumpsJakub Kicinski
Commit ec6e57beaf8b ("ethtool: rss: don't report key if device doesn't support it") intended to stop reporting key fields for additional rss contexts if device has a global hashing key. Later we added dump support and the filtering wasn't properly added there. So we end up reporting the key fields in dumps but not in dos: # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \ --json '{"header": {"dev-index":2}, "context": 1 }' { "header": { ... }, "context": 1, "indir": [0, 1, 2, 3, ...]] } # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get [ ... snip context 0 ... { "header": { ... }, "context": 1, "indir": [0, 1, 2, 3, ...], -> "input_xfrm": 255, -> "hfunc": 1, -> "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" } ] Hide these fields correctly. The drivers/net/hw/rss_ctx.py selftest catches this when run on a device with single key, already: # Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump: # Check| ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero") # Check failed {0} == {0} key is all zero not ok 8 rss_ctx.test_rss_context_dump Fixes: f6122900f4e2 ("ethtool: rss: support dumping RSS contexts") Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-04kthread: Fix return value on kzalloc() failure in kthread_affine_preferred()Yu-Chun Lin
kthread_affine_preferred() incorrectly returns 0 instead of -ENOMEM when kzalloc() fails. Return 'ret' to ensure the correct error code is propagated. Fixes: 4d13f4304fa4 ("kthread: Implement preferred affinity") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501301528.t0cZVbnq-lkp@intel.com/ Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-02-03scsi: cxlflash: Remove driverAndrew Donnellan
Remove the cxlflash driver for IBM CAPI Flash devices. The cxlflash driver has received minimal maintenance for some time, and the CAPI Flash hardware that uses it is no longer commercially available. Thanks to Uma Krishnan, Matthew Ochs and Manoj Kumar for their work on this driver over the years. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Link: https://lore.kernel.org/r/20250203072801.365551-2-ajd@linux.ibm.com Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mpt3sas: Remove unused config functionsDr. David Alan Gilbert
mpt3sas_config_get_manufacturing_pg7() and mpt3sas_config_get_sas_device_pg1() were added as part of 2012's commit f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") but haven't been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250127002851.113711-1-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: message: fusion: Remove unused mptscsih_target_reset()Dr. David Alan Gilbert
mptscsih_target_reset() was added in 2023 by commit e6629081fb12 ("scsi: message: fusion: Correct definitions for mptscsih_dev_reset()") but never used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250127002716.113641-1-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: mvsas: Remove unused mvs_phys_reset()Dr. David Alan Gilbert
mvs_phys_reset() was added in 2009's commit 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250127002601.113555-1-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: qla1280: Fix kernel oops when debug level > 2Magnus Lindholm
A null dereference or oops exception will eventually occur when qla1280.c driver is compiled with DEBUG_QLA1280 enabled and ql_debug_level > 2. I think its clear from the code that the intention here is sg_dma_len(s) not length of sg_next(s) when printing the debug info. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20250125095033.26188-1-linmag7@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: isci: Fix double word in commentsCharles Han
Remove the repeated word "for" in comments. Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/r/20250124081330.210724-1-hanchunchao@inspur.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge patch series "scsi: st: scsi_error: More reset patches"Martin K. Petersen
Kai Mäkisara <Kai.Makisara@kolumbus.fi> says: The first patch re-applies after device reset some settings changed by the user (partition, density, block size). The second and third patch address the case where more than one ULD access the same device. The Unit Attention (UA) sense data is sent only to one ULD and the others miss it. The st driver needs to find out if device reset or media change has happened. The second patch adds counters for New Media and Power On/Reset (POR) Unit Attentions to the scsi_device struct. The third one changes st so that these are used: if the value in the scsi_device struct does not match the one stored locally, the corresponding UA has happened. Use of the was_reset flag has been removed. The fourth patch adds a file to sysfs to tell the user if reads/writes to a tape are blocked following a device reset. Link: https://lore.kernel.org/r/20250120194925.44432-1-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: st: Add sysfs file position_lost_in_resetKai Mäkisara
If the value read from the file is 1, reads and writes from/to the device are blocked because the tape position may not match user's expectation (tape rewound after device reset). Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250201151106.25529-1-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: st: Modify st.c to use the new scsi_error countersKai Mäkisara
Compare the stored values of por_ctr and new_media_ctr against the values in the device struct. In case of mismatch, the Unit Attention corresponding to the counter has happened. This is a safeguard against another ULD catching the Unit Attention sense data. Macros scsi_get_ua_new_media_ctr and scsi_get_ua_por_ctr are added to read the current values of the counters. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-4-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: core: Add counters for New Media and Power On/Reset UNIT ATTENTIONsKai Mäkisara
The purpose of the counters is to enable all ULDs attached to a device to find out that a New Media or/and Power On/Reset Unit Attentions has/have been set, even if another ULD catches the Unit Attention as response to a SCSI command. The ULDs can read the counters and see if the values have changed from the previous check. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-3-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: st: Restore some drive settings after resetKai Mäkisara
Some of the allowed operations put the tape into a known position to continue operation assuming only the tape position has changed. But reset sets partition, density and block size to drive default values. These should be restored to the values before reset. Normally the current block size and density are stored by the drive. If the settings have been changed, the changed values have to be saved by the driver across reset. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250120194925.44432-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: Constify struct pci_error_handlersChristophe JAILLET
'struct pci_error_handlers' are not modified in these drivers. Constifying these structures moves some data to a read-only section, so increase overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 39049 6429 112 45590 b216 drivers/scsi/aacraid/linit.o After: ===== text data bss dec hex filename 39113 6365 112 45590 b216 drivers/scsi/aacraid/linit.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/efdec8425981e10fc398fa2ac599c9c45d930561.1737318548.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Fix error return with query responseSeunghui Lee
There is currently no mechanism to return error from query responses. Return the error and print the corresponding error message with it. Signed-off-by: Seunghui Lee <sh043.lee@samsung.com> Link: https://lore.kernel.org/r/20250118023808.24726-1-sh043.lee@samsung.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: storvsc: Set correct data length for sending SCSI command without payloadLong Li
In StorVSC, payload->range.len is used to indicate if this SCSI command carries payload. This data is allocated as part of the private driver data by the upper layer and may get passed to lower driver uninitialized. For example, the SCSI error handling mid layer may send TEST_UNIT_READY or REQUEST_SENSE while reusing the buffer from a failed command. The private data section may have stale data from the previous command. If the SCSI command doesn't carry payload, the driver may use this value as is for communicating with host, resulting in possible corruption. Fix this by always initializing this value. Fixes: be0cf6ca301c ("scsi: storvsc: Set the tablesize based on the information given by the host") Cc: stable@kernel.org Tested-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Long Li <longli@microsoft.com> Link: https://lore.kernel.org/r/1737601642-7759-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Fix use-after free in init error and remove pathsAndré Draszik
devm_blk_crypto_profile_init() registers a cleanup handler to run when the associated (platform-) device is being released. For UFS, the crypto private data and pointers are stored as part of the ufs_hba's data structure 'struct ufs_hba::crypto_profile'. This structure is allocated as part of the underlying ufshcd and therefore Scsi_host allocation. During driver release or during error handling in ufshcd_pltfrm_init(), this structure is released as part of ufshcd_dealloc_host() before the (platform-) device associated with the crypto call above is released. Once this device is released, the crypto cleanup code will run, using the just-released 'struct ufs_hba::crypto_profile'. This causes a use-after-free situation: Call trace: kfree+0x60/0x2d8 (P) kvfree+0x44/0x60 blk_crypto_profile_destroy_callback+0x28/0x70 devm_action_release+0x1c/0x30 release_nodes+0x6c/0x108 devres_release_all+0x98/0x100 device_unbind_cleanup+0x20/0x70 really_probe+0x218/0x2d0 In other words, the initialisation code flow is: platform-device probe ufshcd_pltfrm_init() ufshcd_alloc_host() scsi_host_alloc() allocation of struct ufs_hba creation of scsi-host devices devm_blk_crypto_profile_init() devm registration of cleanup handler using platform-device and during error handling of ufshcd_pltfrm_init() or during driver removal: ufshcd_dealloc_host() scsi_host_put() put_device(scsi-host) release of struct ufs_hba put_device(platform-device) crypto cleanup handler To fix this use-after free, change ufshcd_alloc_host() to register a devres action to automatically cleanup the underlying SCSI device on ufshcd destruction, without requiring explicit calls to ufshcd_dealloc_host(). This way: * the crypto profile and all other ufs_hba-owned resources are destroyed before SCSI (as they've been registered after) * a memleak is plugged in tc-dwc-g210-pci.c remove() as a side-effect * EXPORT_SYMBOL_GPL(ufshcd_dealloc_host) can be removed fully as it's not needed anymore * no future drivers using ufshcd_alloc_host() could ever forget adding the cleanup Fixes: cb77cb5abe1f ("blk-crypto: rename blk_keyslot_manager to blk_crypto_profile") Fixes: d76d9d7d1009 ("scsi: ufs: use devm_blk_ksm_init()") Cc: stable@vger.kernel.org Signed-off-by: André Draszik <andre.draszik@linaro.org> Link: https://lore.kernel.org/r/20250124-ufshcd-fix-v4-1-c5d0144aae59@linaro.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: core: Do not retry I/Os during depopulationIgor Pylypiv
Fail I/Os instead of retry to prevent user space processes from being blocked on the I/O completion for several minutes. Retrying I/Os during "depopulation in progress" or "depopulation restore in progress" results in a continuous retry loop until the depopulation completes or until the I/O retry loop is aborted due to a timeout by the scsi_cmd_runtime_exceeced(). Depopulation is slow and can take 24+ hours to complete on 20+ TB HDDs. Most I/Os in the depopulation retry loop end up taking several minutes before returning the failure to user space. Cc: stable@vger.kernel.org # 4.18.x: 2bbeb8d scsi: core: Handle depopulation and restoration in progress Cc: stable@vger.kernel.org # 4.18.x Fixes: e37c7d9a0341 ("scsi: core: sanitize++ in progress") Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20250131184408.859579-1-ipylypiv@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: core: Use GFP_NOIO to avoid circular locking dependencyRik van Riel
Filesystems can write to disk from page reclaim with __GFP_FS set. Marc found a case where scsi_realloc_sdev_budget_map() ends up in page reclaim with GFP_KERNEL, where it could try to take filesystem locks again, leading to a deadlock. WARNING: possible circular locking dependency detected 6.13.0 #1 Not tainted ------------------------------------------------------ kswapd0/70 is trying to acquire lock: ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0 but task is already holding lock: ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760 The full lockdep splat can be found in Marc's report: https://lkml.org/lkml/2025/1/24/1101 Avoid the potential deadlock by doing the allocation with GFP_NOIO, which prevents both filesystem and block layer recursion. Reported-by: Marc Aurèle La France <tsi@tuyoix.net> Signed-off-by: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250129104525.0ae8421e@fangorn Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: Fix toggling of clk_gating.state when clock gating is not allowedAvri Altman
This commit addresses an issue where clk_gating.state is being toggled in ufshcd_setup_clocks() even if clock gating is not allowed. The fix is to add a check for hba->clk_gating.is_initialized before toggling clk_gating.state in ufshcd_setup_clocks(). Since clk_gating.lock is now initialized unconditionally, it can no longer lead to the spinlock being used before it is properly initialized, but instead it is mostly for documentation purposes. Fixes: 1ab27c9cf8b6 ("ufs: Add support for clock gating") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250128071207.75494-3-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03scsi: ufs: core: Ensure clk_gating.lock is used only after initializationAvri Altman
Address a lockdep warning triggered by the use of the clk_gating.lock before it is properly initialized. The warning is as follows: [ 4.388838] INFO: trying to register non-static key. [ 4.395673] The code is fine but needs lockdep annotation, or maybe [ 4.402118] you didn't initialize this object before use? [ 4.407673] turning off the locking correctness validator. [ 4.413334] CPU: 5 UID: 0 PID: 58 Comm: kworker/u32:1 Not tainted 6.12-rc1 #185 [ 4.413343] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) [ 4.413362] Call trace: [ 4.413364] show_stack+0x18/0x24 (C) [ 4.413374] dump_stack_lvl+0x90/0xd0 [ 4.413384] dump_stack+0x18/0x24 [ 4.413392] register_lock_class+0x498/0x4a8 [ 4.413400] __lock_acquire+0xb4/0x1b90 [ 4.413406] lock_acquire+0x114/0x310 [ 4.413413] _raw_spin_lock_irqsave+0x60/0x88 [ 4.413423] ufshcd_setup_clocks+0x2c0/0x490 [ 4.413433] ufshcd_init+0x198/0x10ec [ 4.413437] ufshcd_pltfrm_init+0x600/0x7c0 [ 4.413444] ufs_qcom_probe+0x20/0x58 [ 4.413449] platform_probe+0x68/0xd8 [ 4.413459] really_probe+0xbc/0x268 [ 4.413466] __driver_probe_device+0x78/0x12c [ 4.413473] driver_probe_device+0x40/0x11c [ 4.413481] __device_attach_driver+0xb8/0xf8 [ 4.413489] bus_for_each_drv+0x84/0xe4 [ 4.413495] __device_attach+0xfc/0x18c [ 4.413502] device_initial_probe+0x14/0x20 [ 4.413510] bus_probe_device+0xb0/0xb4 [ 4.413517] deferred_probe_work_func+0x8c/0xc8 [ 4.413524] process_scheduled_works+0x250/0x658 [ 4.413534] worker_thread+0x15c/0x2c8 [ 4.413542] kthread+0x134/0x200 [ 4.413550] ret_from_fork+0x10/0x20 To fix this issue, ensure that the spinlock is only used after it has been properly initialized before using it in ufshcd_setup_clocks(). Do that unconditionally as initializing a spinlock is a fast operation. Fixes: 209f4e43b806 ("scsi: ufs: core: Introduce a new clock_gating lock") Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250128071207.75494-2-avri.altman@wdc.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull d_revalidate fix from Al Viro: "Fix a braino in d_revalidate series: check ->d_op for NULL" * tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix braino in "9p: fix ->rename_sem exclusion"
2025-02-03Merge branch '6.14/scsi-queue' into 6.14/scsi-fixesMartin K. Petersen
Pull outstanding fixes bound for this release into 6.14/scsi-fixes. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-03Merge branch 'maintainers-recognize-kuniyuki-iwashima-as-a-maintainer'Jakub Kicinski
Jakub Kicinski says: ==================== MAINTAINERS: recognize Kuniyuki Iwashima as a maintainer Kuniyuki Iwashima has been a prolific contributor and trusted reviewer for some core portions of the networking stack for a couple of years now. Formalize some obvious areas of his expertise and list him as a maintainer. ==================== Link: https://patch.msgid.link/20250202014728.1005003-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add entry for UNIX socketsJakub Kicinski
Add a MAINTAINERS entry for UNIX socket, Kuniyuki has been the de-facto maintainer of this code for a while. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add a general entry for BSD socketsJakub Kicinski
Create a MAINTAINERS entry for BSD sockets. List the top 3 reviewers as maintainers. The entry is meant to cover core socket code (of which there isn't much) but also reviews of any new socket families. Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: add Kuniyuki Iwashima to TCP reviewersJakub Kicinski
List Kuniyuki as an official TCP reviewer. Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250202014728.1005003-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03MAINTAINERS: list openvswitch docs under its entryJakub Kicinski
Submissions to the docs seem to not get properly CCed. Acked-by: Ilya Maximets <i.maximets@ovn.org> Link: https://patch.msgid.link/20250202005024.964262-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== ice: fix Rx data path for heavy 9k MTU traffic Maciej Fijalkowski says: This patchset fixes a pretty nasty issue that was reported by RedHat folks which occurred after ~30 minutes (this value varied, just trying here to state that it was not observed immediately but rather after a considerable longer amount of time) when ice driver was tortured with jumbo frames via mix of iperf traffic executed simultaneously with wrk/nginx on client/server sides (HTTP and TCP workloads basically). The reported splats were spanning across all the bad things that can happen to the state of page - refcount underflow, use-after-free, etc. One of these looked as follows: [ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0 [ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0 [ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000 [ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000 [ 2084.057468] page dumped because: nonzero _refcount [ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd [ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1 [ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023 [ 2084.154604] Call Trace: [ 2084.157058] <IRQ> [ 2084.159080] dump_stack_lvl+0x34/0x48 [ 2084.162752] bad_page.cold+0x63/0x94 [ 2084.166333] check_new_pages+0xb3/0xe0 [ 2084.170083] rmqueue_bulk+0x2d2/0x9e0 [ 2084.173749] ? ktime_get+0x35/0xa0 [ 2084.177159] rmqueue_pcplist+0x13b/0x210 [ 2084.181081] rmqueue+0x7d3/0xd40 [ 2084.184316] ? xas_load+0x9/0xa0 [ 2084.187547] ? xas_find+0x183/0x1d0 [ 2084.191041] ? xa_find_after+0xd0/0x130 [ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0 [ 2084.199759] get_page_from_freelist+0x11f/0x530 [ 2084.204291] __alloc_pages+0xf2/0x250 [ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice] [ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice] [ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice] [ 2084.221330] __napi_poll+0x27/0x170 [ 2084.224824] net_rx_action+0x233/0x2f0 [ 2084.228575] __do_softirq+0xc7/0x2ac [ 2084.232155] __irq_exit_rcu+0xa1/0xc0 [ 2084.235821] common_interrupt+0x80/0xa0 [ 2084.239662] </IRQ> [ 2084.241768] <TASK> The fix is mostly about reverting what was done in commit 1dc1a7e7f410 ("ice: Centrallize Rx buffer recycling") followed by proper timing on page_count() storage and then removing the ice_rx_buf::act related logic (which was mostly introduced for purposes from cited commit). Special thanks to Xu Du for providing reproducer and Jacob Keller for initial extensive analysis. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: stop storing XDP verdict within ice_rx_buf ice: gather page_count()'s of each frag right before XDP prog call ice: put Rx buffers after being done with current frame ==================== Link: https://patch.msgid.link/20250131185415.3741532-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03fix braino in "9p: fix ->rename_sem exclusion"Al Viro
->d_op can bloody well be NULL Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30d61efe118c "9p: fix ->rename_sem exclusion" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>