summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-09-28x86/sgx: Resolves SECS reclaim vs. page fault for EAUG raceHaitao Huang
The SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC page for an enclave and set secs.epc_page to NULL. The SECS page is used for EAUG and ELDU in the SGX page fault handler. However, the NULL check for secs.epc_page is only done for ELDU, not EAUG before being used. Fix this by doing the same NULL check and reloading of the SECS page as needed for both EAUG and ELDU. The SECS page holds global enclave metadata. It can only be reclaimed when there are no other enclave pages remaining. At that point, virtually nothing can be done with the enclave until the SECS page is paged back in. An enclave can not run nor generate page faults without a resident SECS page. But it is still possible for a #PF for a non-SECS page to race with paging out the SECS page: when the last resident non-SECS page A triggers a #PF in a non-resident page B, and then page A and the SECS both are paged out before the #PF on B is handled. Hitting this bug requires that race triggered with a #PF for EAUG. Following is a trace when it happens. BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:sgx_encl_eaug_page+0xc7/0x210 Call Trace: ? __kmem_cache_alloc_node+0x16a/0x440 ? xa_load+0x6e/0xa0 sgx_vma_fault+0x119/0x230 __do_fault+0x36/0x140 do_fault+0x12f/0x400 __handle_mm_fault+0x728/0x1110 handle_mm_fault+0x105/0x310 do_user_addr_fault+0x1ee/0x750 ? __this_cpu_preempt_check+0x13/0x20 exc_page_fault+0x76/0x180 asm_exc_page_fault+0x27/0x30 Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Kai Huang <kai.huang@intel.com> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20230728051024.33063-1-haitao.huang%40linux.intel.com
2023-09-29Merge tag 'drm-misc-fixes-2023-09-28' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * ivpu: * Add PCI ids for Arrow Lake * Fix memory corruption during IPC * Avoid dmesg flooding * 40xx: Wait for clock resource * 40xx: Fix interrupt usage * 40xx: Support caching when loading firmware Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230928081208.GA7881@linux-uq9g
2023-09-28Merge tag 'irqchip-fixes-6.6-1' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zygnier: - Fix QC PDC v3.2 support by working around broken firmware tables - Fix rzg2l-irqc missing #interrupt-cells description in the DT binding - Fix rzg2l-irqc interrupt masking Link: https://lore.kernel.org/lkml/20230924094105.2361754-1-maz@kernel.org
2023-09-28sched/rt: Fix live lock between select_fallback_rq() and RT pushJoel Fernandes (Google)
During RCU-boost testing with the TREE03 rcutorture config, I found that after a few hours, the machine locks up. On tracing, I found that there is a live lock happening between 2 CPUs. One CPU has an RT task running, while another CPU is being offlined which also has an RT task running. During this offlining, all threads are migrated. The migration thread is repeatedly scheduled to migrate actively running tasks on the CPU being offlined. This results in a live lock because select_fallback_rq() keeps picking the CPU that an RT task is already running on only to get pushed back to the CPU being offlined. It is anyway pointless to pick CPUs for pushing tasks to if they are being offlined only to get migrated away to somewhere else. This could also add unwanted latency to this task. Fix these issues by not selecting CPUs in RT if they are not 'active' for scheduling, using the cpu_active_mask. Other parts in core.c already use cpu_active_mask to prevent tasks from being put on CPUs going offline. With this fix I ran the tests for days and could not reproduce the hang. Without the patch, I hit it in a few hours. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230923011409.3522762-1-joel@joelfernandes.org
2023-09-28fs/smb/client: Reset password pointer to NULLQuang Le
Forget to reset ctx->password to NULL will lead to bug like double free Cc: stable@vger.kernel.org Cc: Willy Tarreau <w@1wt.eu> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Quang Le <quanglex97@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-28Merge tag 'spi-fix-v6.6-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A small set of device specific fixes, the most major one is for the GXP driver which would probably have been confusing some callers with returning the length rather than 0 on successful writes" * tag 'spi-fix-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-gxp: BUG: Correct spi write return value dt-bindings: spi: fsl-imx-cspi: Document missing entries spi: cs42l43: Remove spurious pm_runtime_disable
2023-09-28Merge tag 'loongarch-fixes-6.6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix high_memory calculation and module loader errors with latest binutils" * tag 'loongarch-fixes-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Add support for 64_PCREL relocation type LoongArch: Add support for 32_PCREL relocation type LoongArch: Define relocation types for ABI v2.10 LoongArch: numa: Fix high_memory calculation
2023-09-28Merge tag 'mips-fixes_6.6_1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fix from Thomas Bogendoerfer: - fix Alchemy build with MMC support disabled * tag 'mips-fixes_6.6_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
2023-09-28iomap: Spelling s/preceeding/preceding/gGeert Uytterhoeven
Fix a misspelling of "preceding". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-09-28nfs: decrement nrequests counter before releasing the reqJeff Layton
I hit this panic in testing: [ 6235.500016] run fstests generic/464 at 2023-09-18 22:51:24 [ 6288.410761] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 6288.412174] #PF: supervisor read access in kernel mode [ 6288.413160] #PF: error_code(0x0000) - not-present page [ 6288.413992] PGD 0 P4D 0 [ 6288.414603] Oops: 0000 [#1] PREEMPT SMP PTI [ 6288.415419] CPU: 0 PID: 340798 Comm: kworker/u18:8 Not tainted 6.6.0-rc1-gdcf620ceebac #95 [ 6288.416538] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 [ 6288.417701] Workqueue: nfsiod rpc_async_release [sunrpc] [ 6288.418676] RIP: 0010:nfs_inode_remove_request+0xc8/0x150 [nfs] [ 6288.419836] Code: ff ff 48 8b 43 38 48 8b 7b 10 a8 04 74 5b 48 85 ff 74 56 48 8b 07 a9 00 00 08 00 74 58 48 8b 07 f6 c4 10 74 50 e8 c8 44 b3 d5 <48> 8b 00 f0 48 ff 88 30 ff ff ff 5b 5d 41 5c c3 cc cc cc cc 48 8b [ 6288.422389] RSP: 0018:ffffbd618353bda8 EFLAGS: 00010246 [ 6288.423234] RAX: 0000000000000000 RBX: ffff9a29f9a25280 RCX: 0000000000000000 [ 6288.424351] RDX: ffff9a29f9a252b4 RSI: 000000000000000b RDI: ffffef41448e3840 [ 6288.425345] RBP: ffffef41448e3840 R08: 0000000000000038 R09: ffffffffffffffff [ 6288.426334] R10: 0000000000033f80 R11: ffff9a2a7fffa000 R12: ffff9a29093f98c4 [ 6288.427353] R13: 0000000000000000 R14: ffff9a29230f62e0 R15: ffff9a29230f62d0 [ 6288.428358] FS: 0000000000000000(0000) GS:ffff9a2a77c00000(0000) knlGS:0000000000000000 [ 6288.429513] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6288.430427] CR2: 0000000000000000 CR3: 0000000264748002 CR4: 0000000000770ef0 [ 6288.431553] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6288.432715] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6288.433698] PKRU: 55555554 [ 6288.434196] Call Trace: [ 6288.434667] <TASK> [ 6288.435132] ? __die+0x1f/0x70 [ 6288.435723] ? page_fault_oops+0x159/0x450 [ 6288.436389] ? try_to_wake_up+0x98/0x5d0 [ 6288.437044] ? do_user_addr_fault+0x65/0x660 [ 6288.437728] ? exc_page_fault+0x7a/0x180 [ 6288.438368] ? asm_exc_page_fault+0x22/0x30 [ 6288.439137] ? nfs_inode_remove_request+0xc8/0x150 [nfs] [ 6288.440112] ? nfs_inode_remove_request+0xa0/0x150 [nfs] [ 6288.440924] nfs_commit_release_pages+0x16e/0x340 [nfs] [ 6288.441700] ? __pfx_call_transmit+0x10/0x10 [sunrpc] [ 6288.442475] ? _raw_spin_lock_irqsave+0x23/0x50 [ 6288.443161] nfs_commit_release+0x15/0x40 [nfs] [ 6288.443926] rpc_free_task+0x36/0x60 [sunrpc] [ 6288.444741] rpc_async_release+0x29/0x40 [sunrpc] [ 6288.445509] process_one_work+0x171/0x340 [ 6288.446135] worker_thread+0x277/0x3a0 [ 6288.446724] ? __pfx_worker_thread+0x10/0x10 [ 6288.447376] kthread+0xf0/0x120 [ 6288.447903] ? __pfx_kthread+0x10/0x10 [ 6288.448500] ret_from_fork+0x2d/0x50 [ 6288.449078] ? __pfx_kthread+0x10/0x10 [ 6288.449665] ret_from_fork_asm+0x1b/0x30 [ 6288.450283] </TASK> [ 6288.450688] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc nls_iso8859_1 nls_cp437 vfat fat 9p netfs ext4 kvm_intel crc16 mbcache jbd2 joydev kvm xfs irqbypass virtio_net pcspkr net_failover psmouse failover 9pnet_virtio cirrus drm_shmem_helper virtio_balloon drm_kms_helper button evdev drm loop dm_mod zram zsmalloc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 sha512_generic virtio_blk nvme aesni_intel crypto_simd cryptd nvme_core t10_pi i6300esb crc64_rocksoft_generic crc64_rocksoft crc64 virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring serio_raw btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq autofs4 [ 6288.460211] CR2: 0000000000000000 [ 6288.460787] ---[ end trace 0000000000000000 ]--- [ 6288.461571] RIP: 0010:nfs_inode_remove_request+0xc8/0x150 [nfs] [ 6288.462500] Code: ff ff 48 8b 43 38 48 8b 7b 10 a8 04 74 5b 48 85 ff 74 56 48 8b 07 a9 00 00 08 00 74 58 48 8b 07 f6 c4 10 74 50 e8 c8 44 b3 d5 <48> 8b 00 f0 48 ff 88 30 ff ff ff 5b 5d 41 5c c3 cc cc cc cc 48 8b [ 6288.465136] RSP: 0018:ffffbd618353bda8 EFLAGS: 00010246 [ 6288.465963] RAX: 0000000000000000 RBX: ffff9a29f9a25280 RCX: 0000000000000000 [ 6288.467035] RDX: ffff9a29f9a252b4 RSI: 000000000000000b RDI: ffffef41448e3840 [ 6288.468093] RBP: ffffef41448e3840 R08: 0000000000000038 R09: ffffffffffffffff [ 6288.469121] R10: 0000000000033f80 R11: ffff9a2a7fffa000 R12: ffff9a29093f98c4 [ 6288.470109] R13: 0000000000000000 R14: ffff9a29230f62e0 R15: ffff9a29230f62d0 [ 6288.471106] FS: 0000000000000000(0000) GS:ffff9a2a77c00000(0000) knlGS:0000000000000000 [ 6288.472216] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6288.473059] CR2: 0000000000000000 CR3: 0000000264748002 CR4: 0000000000770ef0 [ 6288.474096] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6288.475097] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6288.476148] PKRU: 55555554 [ 6288.476665] note: kworker/u18:8[340798] exited with irqs disabled Once we've released "req", it's not safe to dereference it anymore. Decrement the nrequests counter before dropping the reference. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-09-28erofs: update documentationJingbo Xu
- update new features like bloom filter and DEFLATE. - add documentation for the long xattr name prefixes, which was landed upstream since v6.4. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230928134852.31118-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-09-28NFSD: Fix zero NFSv4 READ results when RQ_SPLICE_OK is not setChuck Lever
nfsd4_encode_readv() uses xdr->buf->page_len as a starting point for the nfsd_iter_read() sink buffer -- page_len is going to be offset by the parts of the COMPOUND that have already been encoded into xdr->buf->pages. However, that value must be captured /before/ xdr_reserve_space_vec() advances page_len by the expected size of the read payload. Otherwise, the whole front part of the first page of the payload in the reply will be uninitialized. Mantas hit this because sec=krb5i forces RQ_SPLICE_OK off, which invokes the readv part of the nfsd4_encode_read() path. Also, older Linux NFS clients appear to send shorter READ requests for files smaller than a page, whereas newer clients just send page-sized requests and let the server send as many bytes as are in the file. Reported-by: Mantas Mikulėnas <grawity@gmail.com> Closes: https://lore.kernel.org/linux-nfs/f1d0b234-e650-0f6e-0f5d-126b3d51d1eb@gmail.com/ Fixes: 703d75215555 ("NFSD: Hoist rq_vec preparation into nfsd_read() [step two]") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-09-28ata: libata-eh: Fix compilation warning in ata_eh_link_report()Damien Le Moal
The 6 bytes length of the tries_buf string in ata_eh_link_report() is too short and results in a gcc compilation warning with W-!: drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’: drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~ drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~ drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2372 | ap->eh_tries); | ~~~~~~~~~~~~~ Avoid this warning by increasing the string size to 16B. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28ata: libata-core: Fix compilation warning in ata_dev_config_ncq()Damien Le Moal
The 24 bytes length allocated to the ncq_desc string in ata_dev_config_lba() for ata_dev_config_ncq() to use is too short, causing the following gcc compilation warnings when compiling with W=1: drivers/ata/libata-core.c: In function ‘ata_dev_configure’: drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=] 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~ In function ‘ata_dev_config_ncq’, inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8, inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9: drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32] 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~~~~~~~~~~~~~~~~~~~~ drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24 2378 | snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2379 | ddepth, aa_desc); | ~~~~~~~~~~~~~~~~ Avoid these warnings and the potential truncation by changing the size of the ncq_desc string to 32 characters. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28scsi: sd: Do not issue commands to suspended disks on shutdownDamien Le Moal
If an error occurs when resuming a host adapter before the devices attached to the adapter are resumed, the adapter low level driver may remove the scsi host, resulting in a call to sd_remove() for the disks of the host. This in turn results in a call to sd_shutdown() which will issue a synchronize cache command and a start stop unit command to spindown the disk. sd_shutdown() issues the commands only if the device is not already runtime suspended but does not check the power state for system-wide suspend/resume. That is, the commands may be issued with the device in a suspended state, which causes PM resume to hang, forcing a reset of the machine to recover. Fix this by tracking the suspended state of a disk by introducing the suspended boolean field in the scsi_disk structure. This flag is set to true when the disk is suspended is sd_suspend_common() and resumed with sd_resume(). When suspended is true, sd_shutdown() is not executed from sd_remove(). Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28ata: libata-core: Do not register PM operations for SAS portsDamien Le Moal
libsas does its own domain based power management of ports. For such ports, libata should not use a device type defining power management operations as executing these operations for suspend/resume in addition to libsas calls to ata_sas_port_suspend() and ata_sas_port_resume() is not necessary (and likely dangerous to do, even though problems are not seen currently). Introduce the new ata_port_sas_type device_type for ports managed by libsas. This new device type is used in ata_tport_add() and is defined without power management operations. Fixes: 2fcbdcb4c802 ("[SCSI] libata: export ata_port suspend/resume infrastructure for sas") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28ata: libata-scsi: Fix delayed scsi_rescan_device() executionDamien Le Moal
Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") modified ata_scsi_dev_rescan() to check the scsi device "is_suspended" power field to ensure that the scsi device associated with an ATA device is fully resumed when scsi_rescan_device() is executed. However, this fix is problematic as: 1) It relies on a PM internal field that should not be used without PM device locking protection. 2) The check for is_suspended and the call to scsi_rescan_device() are not atomic and a suspend PM event may be triggered between them, casuing scsi_rescan_device() to be called on a suspended device and in that function blocking while holding the scsi device lock. This would deadlock a following resume operation. These problems can trigger PM deadlocks on resume, especially with resume operations triggered quickly after or during suspend operations. E.g., a simple bash script like: for (( i=0; i<10; i++ )); do echo "+2 > /sys/class/rtc/rtc0/wakealarm echo mem > /sys/power/state done that triggers a resume 2 seconds after starting suspending a system can quickly lead to a PM deadlock preventing the system from correctly resuming. Fix this by replacing the check on is_suspended with a check on the return value given by scsi_rescan_device() as that function will fail if called against a suspended device. Also make sure rescan tasks already scheduled are first cancelled before suspending an ata port. Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28scsi: Do not attempt to rescan suspended devicesDamien Le Moal
scsi_rescan_device() takes a scsi device lock before executing a device handler and device driver rescan methods. Waiting for the completion of any command issued to the device by these methods will thus be done with the device lock held. As a result, there is a risk of deadlocking within the power management code if scsi_rescan_device() is called to handle a device resume with the associated scsi device not yet resumed. Avoid such situation by checking that the target scsi device is in the running state, that is, fully capable of executing commands, before proceeding with the rescan and bailout returning -EWOULDBLOCK otherwise. With this error return, the caller can retry rescaning the device after a delay. The state check is done with the device lock held and is thus safe against incoming suspend power management operations. Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
2023-09-28ata: libata-scsi: Disable scsi device manage_system_start_stopDamien Le Moal
The introduction of a device link to create a consumer/supplier relationship between the scsi device of an ATA device and the ATA port of that ATA device fixes the ordering of system suspend and resume operations. For suspend, the scsi device is suspended first and the ata port after it. This is fine as this allows the synchronize cache and START STOP UNIT commands issued by the scsi disk driver to be executed before the ata port is disabled. For resume operations, the ata port is resumed first, followed by the scsi device. This allows having the request queue of the scsi device to be unfrozen after the ata port resume is scheduled in EH, thus avoiding to see new requests prematurely issued to the ATA device. Since libata sets manage_system_start_stop to 1, the scsi disk resume operation also results in issuing a START STOP UNIT command to the device being resumed so that the device exits standby power mode. However, restoring the ATA device to the active power mode must be synchronized with libata EH processing of the port resume operation to avoid either 1) seeing the start stop unit command being received too early when the port is not yet resumed and ready to accept commands, or after the port resume process issues commands such as IDENTIFY to revalidate the device. In this last case, the risk is that the device revalidation fails with timeout errors as the drive is still spun down. Commit 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") disabled issuing the START STOP UNIT command to avoid issues with it. But this is incorrect as transitioning a device to the active power mode from the standby power mode set on suspend requires a media access command. The IDENTIFY, READ LOG and SET FEATURES commands executed in libata EH context triggered by the ata port resume operation may thus fail. Fix these synchronization issues is by handling a device power mode transitions for system suspend and resume directly in libata EH context, without relying on the scsi disk driver management triggered with the manage_system_start_stop flag. To do this, the following libata helper functions are introduced: 1) ata_dev_power_set_standby(): This function issues a STANDBY IMMEDIATE command to transitiom a device to the standby power mode. For HDDs, this spins down the disks. This function applies only to ATA and ZAC devices and does nothing otherwise. This function also does nothing for devices that have the ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag set. For suspend, call ata_dev_power_set_standby() in ata_eh_handle_port_suspend() before the port is disabled and frozen. ata_eh_unload() is also modified to transition all enabled devices to the standby power mode when the system is shutdown or devices removed. 2) ata_dev_power_set_active() and This function applies to ATA or ZAC devices and issues a VERIFY command for 1 sector at LBA 0 to transition the device to the active power mode. For HDDs, since this function will complete only once the disk spin up. Its execution uses the same timeouts as for reset, to give the drive enough time to complete spinup without triggering a command timeout. For resume, call ata_dev_power_set_active() in ata_eh_revalidate_and_attach() after the port has been enabled and before any other command is issued to the device. With these changes, the manage_system_start_stop and no_start_on_resume scsi device flags do not need to be set in ata_scsi_dev_config(). The flag manage_runtime_start_stop is still set to allow the sd driver to spinup/spindown a disk through the sd runtime operations. Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28scsi: sd: Differentiate system and runtime start/stop managementDamien Le Moal
The underlying device and driver of a SCSI disk may have different system and runtime power mode control requirements. This is because runtime power management affects only the SCSI disk, while system level power management affects all devices, including the controller for the SCSI disk. For instance, issuing a START STOP UNIT command when a SCSI disk is runtime suspended and resumed is fine: the command is translated to a STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY command to wake it up. The SCSI disk runtime operations have no effect on the ata port device used to connect the ATA disk. However, for system suspend/resume operations, the ATA port used to connect the device will also be suspended and resumed, with the resume operation requiring re-validating the device link and the device itself. In this case, issuing a VERIFY command to spinup the disk must be done before starting to revalidate the device, when the ata port is being resumed. In such case, we must not allow the SCSI disk driver to issue START STOP UNIT commands. Allow a low level driver to refine the SCSI disk start/stop management by differentiating system and runtime cases with two new SCSI device flags: manage_system_start_stop and manage_runtime_start_stop. These new flags replace the current manage_start_stop flag. Drivers setting the manage_start_stop are modifed to set both new flags, thus preserving the existing start/stop management behavior. For backward compatibility, the old manage_start_stop sysfs device attribute is kept as a read-only attribute showing a value of 1 for devices enabling both new flags and 0 otherwise. Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28ata: libata-scsi: link ata port and scsi deviceDamien Le Moal
There is no direct device ancestry defined between an ata_device and its scsi device which prevents the power management code from correctly ordering suspend and resume operations. Create such ancestry with the ata device as the parent to ensure that the scsi device (child) is suspended before the ata device and that resume handles the ata device before the scsi device. The parent-child (supplier-consumer) relationship is established between the ata_port (parent) and the scsi device (child) with the function device_add_link(). The parent used is not the ata_device as the PM operations are defined per port and the status of all devices connected through that port is controlled from the port operations. The device link is established with the new function ata_scsi_slave_alloc(), and this function is used to define the ->slave_alloc callback of the scsi host template of all ata drivers. Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com>
2023-09-28ata: libata-core: Fix port and device removalDamien Le Moal
Whenever an ATA adapter driver is removed (e.g. rmmod), ata_port_detach() is called repeatedly for all the adapter ports to remove (unload) the devices attached to the port and delete the port device itself. Removing of devices is done using libata EH with the ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute ata_eh_unload() which disables all devices attached to the port. ata_port_detach() finishes by calling scsi_remove_host() to remove the scsi host associated with the port. This function will trigger the removal of all scsi devices attached to the host and in the case of disks, calls to sd_shutdown() which will flush the device write cache and stop the device. However, given that the devices were already disabled by ata_eh_unload(), the synchronize write cache command and start stop unit commands fail. E.g. running "rmmod ahci" with first removing sd_mod results in error messages like: ata13.00: disable device sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 0:0:0:0: [sda] Stopping disk sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Fix this by removing all scsi devices of the ata devices connected to the port before scheduling libata EH to disable the ATA devices. Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-28ata: libata-core: Fix ata_port_request_pm() lockingDamien Le Moal
The function ata_port_request_pm() checks the port flag ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to ensure that power management operations for a port are not scheduled simultaneously. However, this flag check is done without holding the port lock. Fix this by taking the port lock on entry to the function and checking the flag under this lock. The lock is released and re-taken if ata_port_wait_eh() needs to be called. The two WARN_ON() macros checking that the ATA_PFLAG_PM_PENDING flag was cleared are removed as the first call is racy and the second one done without holding the port lock. Fixes: 5ef41082912b ("ata: add ata port system PM callbacks") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
2023-09-28fs/ntfs3: Avoid possible memory leakSu Hui
smatch warn: fs/ntfs3/fslog.c:2172 last_log_lsn() warn: possible memory leak of 'page_bufs' Jump to label 'out' to free 'page_bufs' and is more consistent with other code. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix directory element type detectionGabriel Marcano
Calling stat() from userspace correctly identified junctions in an NTFS partition as symlinks, but using readdir() and iterating through the directory containing the same junction did not identify the junction as a symlink. When emitting directory contents, check FILE_ATTRIBUTE_REPARSE_POINT attribute to detect junctions and report them as links. Signed-off-by: Gabriel Marcano <gabemarcano@yahoo.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix possible null-pointer dereference in hdr_find_e()Ziqi Zhao
Upon investigation of the C reproducer provided by Syzbot, it seemed the reproducer was trying to mount a corrupted NTFS filesystem, then issue a rename syscall to some nodes in the filesystem. This can be shown by modifying the reproducer to only include the mount syscall, and investigating the filesystem by e.g. `ls` and `rm` commands. As a result, during the problematic call to `hdr_fine_e`, the `inode` being supplied did not go through `indx_init`, hence the `cmp` function pointer was never set. The fix is simply to check whether `cmp` is not set, and return NULL if that's the case, in order to be consistent with other error scenarios of the `hdr_find_e` method. The rationale behind this patch is that: - We should prevent crashing the kernel even if the mounted filesystem is corrupted. Any syscalls made on the filesystem could return invalid, but the kernel should be able to sustain these calls. - Only very specific corruption would lead to this bug, so it would be a pretty rare case in actual usage anyways. Therefore, introducing a check to specifically protect against this bug seems appropriate. Because of its rarity, an `unlikely` clause is used to wrap around this nullity check. Reported-by: syzbot+60cf892fc31d1f4358fc@syzkaller.appspotmail.com Signed-off-by: Ziqi Zhao <astrajoan@yahoo.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix OOB read in ntfs_init_from_bootPavel Skripkin
Syzbot was able to create a device which has the last sector of size 512. After failing to boot from initial sector, reading from boot info from offset 511 causes OOB read. To prevent such reports add sanity check to validate if size of buffer_head if big enough to hold ntfs3 bootinfo Fixes: 6a4cd3ea7d77 ("fs/ntfs3: Alternative boot if primary boot is corrupted") Reported-by: syzbot+53ce40c8c0322c06aea5@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: fix panic about slab-out-of-bounds caused by ntfs_list_ea()Zeng Heng
Here is a BUG report about linux-6.1 from syzbot, but it still remains within upstream: BUG: KASAN: slab-out-of-bounds in ntfs_list_ea fs/ntfs3/xattr.c:191 [inline] BUG: KASAN: slab-out-of-bounds in ntfs_listxattr+0x401/0x570 fs/ntfs3/xattr.c:710 Read of size 1 at addr ffff888021acaf3d by task syz-executor128/3632 Call Trace: kasan_report+0x139/0x170 mm/kasan/report.c:495 ntfs_list_ea fs/ntfs3/xattr.c:191 [inline] ntfs_listxattr+0x401/0x570 fs/ntfs3/xattr.c:710 vfs_listxattr fs/xattr.c:457 [inline] listxattr+0x293/0x2d0 fs/xattr.c:804 path_listxattr fs/xattr.c:828 [inline] __do_sys_llistxattr fs/xattr.c:846 [inline] Before derefering field members of `ea` in unpacked_ea_size(), we need to check whether the EA_FULL struct is located in access validate range. Similarly, when derefering `ea->name` field member, we need to check whethe the ea->name is located in access validate range, too. Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations") Reported-by: syzbot+9fcea5ef6dc4dc72d334@syzkaller.appspotmail.com Signed-off-by: Zeng Heng <zengheng4@huawei.com> [almaz.alexandrovich@paragon-software.com: took the ret variable out of the loop block] Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix NULL pointer dereference on error in attr_allocate_frame()Konstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix possible NULL-ptr-deref in ni_readpage_cmpr()Konstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Do not allow to change label if volume is read-onlyKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Add more info into /proc/fs/ntfs3/<dev>/volinfoKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Refactoring and commentsKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix alternative boot searchingKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Allow repeated call to ntfs3_put_sbiKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Use inode_set_ctime_to_ts instead of inode_set_ctimeKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Fix shift-out-of-bounds in ntfs_fill_superKonstantin Komarov
Reported-by: syzbot+478c1bf0e6bf4a8f3a04@syzkaller.appspotmail.com Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: fix deadlock in mark_as_free_exKonstantin Komarov
Reported-by: syzbot+e94d98936a0ed08bde43@syzkaller.appspotmail.com Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Add more attributes checks in mi_enum_attr()Konstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Use kvmalloc instead of kmalloc(... __GFP_NOWARN)Konstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Write immediately updated ntfs stateKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28fs/ntfs3: Add ckeck in ni_update_parent()Konstantin Komarov
Check simple case when parent inode equals current inode. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-09-28ice: always add legacy 32byte RXDID in supported_rxdidsMichal Schmidt
When the PF and VF drivers both support flexible rx descriptors and have negotiated the VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC capability, the VF driver queries the PF for the list of supported descriptor formats (VIRTCHNL_OP_GET_SUPPORTED_RXDIDS). The PF driver is supposed to set the supported_rxdids bits that correspond to the descriptor formats the firmware implements. The legacy 32-byte rx desc format is always supported, even though it is not expressed in GLFLXP_RXDID_FLAGS. The ice driver does not advertise the legacy 32-byte rx desc support, which leads to this failure to bring up the VF using the Intel out-of-tree iavf driver: iavf 0000:41:01.0: PF does not list support for default Rx descriptor format ... iavf 0000:41:01.0: PF returned error -5 (VIRTCHNL_STATUS_ERR_PARAM) to our request 6 The in-tree iavf driver does not expose this bug, because it does not yet implement VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC. The ice driver must always set the ICE_RXDID_LEGACY_1 bit in supported_rxdids. The Intel out-of-tree ice driver and the ice driver in DPDK both do this. I copied this piece of the code and the comment text from the Intel out-of-tree driver. Fixes: e753df8fbca5 ("ice: Add support Flex RXD") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20230920115439.61172-1-mschmidt@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-28dmaengine: fsl-edma: fix edma4 channel enable failure on second attemptFrank Li
When attempting to start DMA for the second time using fsl_edma3_enable_request(), channel never start. CHn_MUX must have a unique value when selecting a peripheral slot in the channel mux configuration. The only value that may overlap is source 0. If there is an attempt to write a mux configuration value that is already consumed by another channel, a mux configuration of 0 (SRC = 0) will be written. Check CHn_MUX before writing in fsl_edma3_enable_request(). Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20230823182635.2618118-1-Frank.Li@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-09-28dt-bindings: dmaengine: zynqmp_dma: add xlnx,bus-width required propertyRadhey Shyam Pandey
xlnx,bus-width is a required property. In yaml conversion somehow it got missed out. Bring it back and mention it in required list. Also add Harini and myself to the maintainer list. Fixes: 5a04982df8da ("dt-bindings: dmaengine: zynqmp_dma: convert to yaml") Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Michael Tretter <m.tretter@pengutronix.de> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Link: https://lore.kernel.org/r/1695216326-3841352-1-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-09-28dmaengine: fsl-dma: fix DMA error when enabling sg if 'DONE' bit is setFrank Li
In eDMAv3, clearing 'DONE' bit (bit 30) of CHn_CSR is required when enabling scatter-gather (SG). eDMAv4 does not require this change. Cc: stable@vger.kernel.org Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20230921144652.3259813-1-Frank.Li@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-09-28x86/srso: Add SRSO mitigation for Hygon processorsPu Wen
Add mitigation for the speculative return stack overflow vulnerability which exists on Hygon processors too. Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/tencent_4A14812842F104E93AA722EC939483CEFF05@qq.com
2023-09-28arm64: defconfig: enable syscon-poweroff driverKrzysztof Kozlowski
Enable the generic syscon-poweroff driver used on all Exynos ARM64 SoCs (e.g. Exynos5433) and few APM SoCs. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Link: https://lore.kernel.org/r/20230901115732.45854-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-09-28ARM: locomo: fix locomolcd_power declarationArnd Bergmann
The locomolcd driver has one remaining missing-prototype warning: drivers/video/backlight/locomolcd.c:83:6: error: no previous prototype for 'locomolcd_power' [-Werror=missing-prototypes] There is in fact an unused prototype with a similar name in a global header, so move the actual one there and remove the old one. Link: https://lore.kernel.org/r/20230927194844.680771-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-09-28Merge tag 'scmi-fix-6.6' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm SCMI fix for v6.6 A single fix to address scmi_perf_attributes_get() using the protocol version even before it was populated and ending up with unexpected bogowatts power scale. * tag 'scmi-fix-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Fixup perf power-cost/microwatt support Link: https://lore.kernel.org/r/20230927121604.158645-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>