summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-03ata: libata-core: Set ATA_QCFLAG_RTF_FILLED in fill_result_tf()Igor Pylypiv
ATA_QCFLAG_RTF_FILLED is not specific to ahci and can be used generally to check if qc->result_tf contains valid data. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20240702024735.1152293-7-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03ata: libata-scsi: Do not pass ATA device id to ata_to_sense_error()Igor Pylypiv
ATA device id is not used in ata_to_sense_error(). Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20240702024735.1152293-6-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03ata: libata-scsi: Remove redundant sense_buffer memsetsIgor Pylypiv
SCSI layer clears sense_buffer in scsi_queue_rq() so there is no need for libata to clear it again. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20240702024735.1152293-5-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no errorIgor Pylypiv
SAT-5 revision 8 specification removed the text about the ANSI INCITS 431-2007 compliance which was requiring SCSI/ATA Translation (SAT) to return descriptor format sense data for the ATA PASS-THROUGH commands regardless of the setting of the D_SENSE bit. Let's honor the D_SENSE bit for ATA PASS-THROUGH commands while generating the "ATA PASS-THROUGH INFORMATION AVAILABLE" sense data. SAT-5 revision 7 ================ 12.2.2.8 Fixed format sense data Table 212 shows the fields returned in the fixed format sense data (see SPC-5) for ATA PASS-THROUGH commands. SATLs compliant with ANSI INCITS 431-2007, SCSI/ATA Translation (SAT) return descriptor format sense data for the ATA PASS-THROUGH commands regardless of the setting of the D_SENSE bit. SAT-5 revision 8 ================ 12.2.2.8 Fixed format sense data Table 211 shows the fields returned in the fixed format sense data (see SPC-5) for ATA PASS-THROUGH commands. Cc: stable@vger.kernel.org # 4.19+ Reported-by: Niklas Cassel <cassel@kernel.org> Closes: https://lore.kernel.org/linux-ide/Zn1WUhmLglM4iais@ryzen.lan Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240702024735.1152293-4-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1Igor Pylypiv
Current ata_gen_passthru_sense() code performs two actions: 1. Generates sense data based on the ATA 'status' and ATA 'error' fields. 2. Populates "ATA Status Return sense data descriptor" / "Fixed format sense data" with ATA taskfile fields. The problem is that #1 generates sense data even when a valid sense data is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into a separate function allows us to generate sense data only when there is no valid sense data (ATA_QCFLAG_SENSE_VALID is not set). As a bonus, we can now delete a FIXME comment in atapi_qc_complete() which states that we don't want to translate taskfile registers into sense descriptors for ATAPI. Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because SAT specification mandates that SATL shall return CHECK CONDITION if the CK_COND bit is set. The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard to read/understand. Improve the readability of the code by moving checks into self-explanatory boolean variables. Cc: stable@vger.kernel.org # 4.19+ Co-developed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03ata: libata-scsi: Fix offsets for the fixed format sense dataIgor Pylypiv
Correct the ATA PASS-THROUGH fixed format sense data offsets to conform to SPC-6 and SAT-5 specifications. Additionally, set the VALID bit to indicate that the INFORMATION field contains valid information. INFORMATION =========== SAT-5 Table 212 — "Fixed format sense data INFORMATION field for the ATA PASS-THROUGH commands" defines the following format: +------+------------+ | Byte | Field | +------+------------+ | 0 | ERROR | | 1 | STATUS | | 2 | DEVICE | | 3 | COUNT(7:0) | +------+------------+ SPC-6 Table 48 - "Fixed format sense data" specifies that the INFORMATION field starts at byte 3 in sense buffer resulting in the following offsets for the ATA PASS-THROUGH commands: +------------+-------------------------+ | Field | Offset in sense buffer | +------------+-------------------------+ | ERROR | 3 | | STATUS | 4 | | DEVICE | 5 | | COUNT(7:0) | 6 | +------------+-------------------------+ COMMAND-SPECIFIC INFORMATION ============================ SAT-5 Table 213 - "Fixed format sense data COMMAND-SPECIFIC INFORMATION field for ATA PASS-THROUGH" defines the following format: +------+-------------------+ | Byte | Field | +------+-------------------+ | 0 | FLAGS | LOG INDEX | | 1 | LBA (7:0) | | 2 | LBA (15:8) | | 3 | LBA (23:16) | +------+-------------------+ SPC-6 Table 48 - "Fixed format sense data" specifies that the COMMAND-SPECIFIC-INFORMATION field starts at byte 8 in sense buffer resulting in the following offsets for the ATA PASS-THROUGH commands: Offsets of these fields in the fixed sense format are as follows: +-------------------+-------------------------+ | Field | Offset in sense buffer | +-------------------+-------------------------+ | FLAGS | LOG INDEX | 8 | | LBA (7:0) | 9 | | LBA (15:8) | 10 | | LBA (23:16) | 11 | +-------------------+-------------------------+ Reported-by: Akshat Jain <akshatzen@google.com> Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense") Cc: stable@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20240702024735.1152293-2-ipylypiv@google.com Signed-off-by: Niklas Cassel <cassel@kernel.org>
2024-07-03Merge tag 'regulator-hw-enable-helper' of ↵Philipp Zabel
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator into reset/next regulator: Add helper to allow enable/disable in interrupt context Add a helper function that enables exclusive consumers to bypass locking and do an enable/disable from within interrupt context. Link: https://lore.kernel.org/r/988df019-00d4-4209-8716-39e82c565bf1@sirena.org.uk Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2024-07-03fat: Convert to new uid/gid option parsing helpersEric Sandeen
Convert to new uid/gid option parsing helpers Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/1a67d2a8-0aae-42a2-9c0f-21cd4cd87d13@redhat.com Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03fat: Convert to new mount apiEric Sandeen
vfat and msdos share a common set of options, with additional, unique options for each filesystem. Each filesystem calls common fc initialization and parsing routines, with an "is_vfat" parameter. For parsing, if the option is not found in the common parameter_spec, parsing is retried with the fs-specific parameter_spec. This patch leaves nls loading to fill_super, so the codepage and charset options are not validated as they are requested. This matches current behavior. It would be possible to test-load as each option is parsed, but that would make i.e. mount -o "iocharset=nope,iocharset=iso8859-1" fail, where it does not fail today because only the last iocharset option is considered. The obsolete "conv=" option is set up with an enum of acceptable values; currently invalid "conv=" options are rejected as such, even though the option is obsolete, so this patch preserves that behavior. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/a9411b02-5f8e-4e1e-90aa-0c032d66c312@redhat.com Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03fat: move debug into fat_mount_optionsEric Sandeen
Move the debug variable into fat_mount_options for consistency and to facilitate conversion to new mount API. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/f6155247-32ee-4cfe-b808-9102b17f7cd1@redhat.com Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03sctp: cancel a blocking accept when shutdown a listen socketXin Long
As David Laight noticed, "In a multithreaded program it is reasonable to have a thread blocked in accept(). With TCP a subsequent shutdown(listen_fd, SHUT_RDWR) causes the accept to fail. But nothing happens for SCTP." sctp_disconnect() is eventually called when shutdown a listen socket, but nothing is done in this function. This patch sets RCV_SHUTDOWN flag in sk->sk_shutdown there, and adds the check (sk->sk_shutdown & RCV_SHUTDOWN) to break and return in sctp_accept(). Note that shutdown() is only supported on TCP-style SCTP socket. Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-03cachefiles: add missing lock protection when pollingJingbo Xu
Add missing lock protection in poll routine when iterating xarray, otherwise: Even with RCU read lock held, only the slot of the radix tree is ensured to be pinned there, while the data structure (e.g. struct cachefiles_req) stored in the slot has no such guarantee. The poll routine will iterate the radix tree and dereference cachefiles_req accordingly. Thus RCU read lock is not adequate in this case and spinlock is needed here. Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode") Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-10-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: cyclic allocation of msg_id to avoid reuseBaokun Li
Reusing the msg_id after a maliciously completed reopen request may cause a read request to remain unprocessed and result in a hung, as shown below: t1 | t2 | t3 ------------------------------------------------- cachefiles_ondemand_select_req cachefiles_ondemand_object_is_close(A) cachefiles_ondemand_set_object_reopening(A) queue_work(fscache_object_wq, &info->work) ondemand_object_worker cachefiles_ondemand_init_object(A) cachefiles_ondemand_send_req(OPEN) // get msg_id 6 wait_for_completion(&req_A->done) cachefiles_ondemand_daemon_read // read msg_id 6 req_A cachefiles_ondemand_get_fd copy_to_user // Malicious completion msg_id 6 copen 6,-1 cachefiles_ondemand_copen complete(&req_A->done) // will not set the object to close // because ondemand_id && fd is valid. // ondemand_object_worker() is done // but the object is still reopening. // new open req_B cachefiles_ondemand_init_object(B) cachefiles_ondemand_send_req(OPEN) // reuse msg_id 6 process_open_req copen 6,A.size // The expected failed copen was executed successfully Expect copen to fail, and when it does, it closes fd, which sets the object to close, and then close triggers reopen again. However, due to msg_id reuse resulting in a successful copen, the anonymous fd is not closed until the daemon exits. Therefore read requests waiting for reopen to complete may trigger hung task. To avoid this issue, allocate the msg_id cyclically to avoid reusing the msg_id for a very short duration of time. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-9-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: wait for ondemand_object_worker to finish when dropping objectHou Tao
When queuing ondemand_object_worker() to re-open the object, cachefiles_object is not pinned. The cachefiles_object may be freed when the pending read request is completed intentionally and the related erofs is umounted. If ondemand_object_worker() runs after the object is freed, it will incur use-after-free problem as shown below. process A processs B process C process D cachefiles_ondemand_send_req() // send a read req X // wait for its completion // close ondemand fd cachefiles_ondemand_fd_release() // set object as CLOSE cachefiles_ondemand_daemon_read() // set object as REOPENING queue_work(fscache_wq, &info->ondemand_work) // close /dev/cachefiles cachefiles_daemon_release cachefiles_flush_reqs complete(&req->done) // read req X is completed // umount the erofs fs cachefiles_put_object() // object will be freed cachefiles_ondemand_deinit_obj_info() kmem_cache_free(object) // both info and object are freed ondemand_object_worker() When dropping an object, it is no longer necessary to reopen the object, so use cancel_work_sync() to cancel or wait for ondemand_object_worker() to finish. Fixes: 0a7e54c1959c ("cachefiles: resend an open request if the read request's object is closed") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-8-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: cancel all requests for the object that is being droppedBaokun Li
Because after an object is dropped, requests for that object are useless, cancel them to avoid causing other problems. This prepares for the later addition of cancel_work_sync(). After the reopen requests is generated, cancel it to avoid cancel_work_sync() blocking by waiting for daemon to complete the reopen requests. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-7-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: stop sending new request when dropping objectBaokun Li
Added CACHEFILES_ONDEMAND_OBJSTATE_DROPPING indicates that the cachefiles object is being dropped, and is set after the close request for the dropped object completes, and no new requests are allowed to be sent after this state. This prepares for the later addition of cancel_work_sync(). It prevents leftover reopen requests from being sent, to avoid processing unnecessary requests and to avoid cancel_work_sync() blocking by waiting for daemon to complete the reopen requests. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-6-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: propagate errors from vfs_getxattr() to avoid infinite loopBaokun Li
In cachefiles_check_volume_xattr(), the error returned by vfs_getxattr() is not passed to ret, so it ends up returning -ESTALE, which leads to an endless loop as follows: cachefiles_acquire_volume retry: ret = cachefiles_check_volume_xattr ret = -ESTALE xlen = vfs_getxattr // return -EIO // The ret is not updated when xlen < 0, so -ESTALE is returned. return ret // Supposed to jump out of the loop at this judgement. if (ret != -ESTALE) goto error_dir; cachefiles_bury_object // EIO causes rename failure goto retry; Hence propagate the error returned by vfs_getxattr() to avoid the above issue. Do the same in cachefiles_check_auxdata(). Fixes: 32e150037dce ("fscache, cachefiles: Store the volume coherency data") Fixes: 72b957856b0c ("cachefiles: Implement metadata/coherency data storage in xattrs") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-5-libaokun@huaweicloud.com Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()Baokun Li
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in cachefiles_withdraw_cookie+0x4d9/0x600 Read of size 8 at addr ffff888118efc000 by task kworker/u78:0/109 CPU: 13 PID: 109 Comm: kworker/u78:0 Not tainted 6.8.0-dirty #566 Call Trace: <TASK> kasan_report+0x93/0xc0 cachefiles_withdraw_cookie+0x4d9/0x600 fscache_cookie_state_machine+0x5c8/0x1230 fscache_cookie_worker+0x91/0x1c0 process_one_work+0x7fa/0x1800 [...] Allocated by task 117: kmalloc_trace+0x1b3/0x3c0 cachefiles_acquire_volume+0xf3/0x9c0 fscache_create_volume_work+0x97/0x150 process_one_work+0x7fa/0x1800 [...] Freed by task 120301: kfree+0xf1/0x2c0 cachefiles_withdraw_cache+0x3fa/0x920 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 do_exit+0x87a/0x29b0 [...] ================================================================== Following is the process that triggers the issue: p1 | p2 ------------------------------------------------------------ fscache_begin_lookup fscache_begin_volume_access fscache_cache_is_live(fscache_cache) cachefiles_daemon_release cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache fscache_withdraw_cache fscache_set_cache_state(cache, FSCACHE_CACHE_IS_WITHDRAWN); cachefiles_withdraw_objects(cache) fscache_wait_for_objects(fscache) atomic_read(&fscache_cache->object_count) == 0 fscache_perform_lookup cachefiles_lookup_cookie cachefiles_alloc_object refcount_set(&object->ref, 1); object->volume = volume fscache_count_object(vcookie->cache); atomic_inc(&fscache_cache->object_count) cachefiles_withdraw_volumes cachefiles_withdraw_volume fscache_withdraw_volume __cachefiles_free_volume kfree(cachefiles_volume) fscache_cookie_state_machine cachefiles_withdraw_cookie cache = object->volume->cache; // cachefiles_volume UAF !!! After setting FSCACHE_CACHE_IS_WITHDRAWN, wait for all the cookie lookups to complete first, and then wait for fscache_cache->object_count == 0 to avoid the cookie exiting after the volume has been freed and triggering the above issue. Therefore call fscache_withdraw_volume() before calling cachefiles_withdraw_objects(). This way, after setting FSCACHE_CACHE_IS_WITHDRAWN, only the following two cases will occur: 1) fscache_begin_lookup fails in fscache_begin_volume_access(). 2) fscache_withdraw_volume() will ensure that fscache_count_object() has been executed before calling fscache_wait_for_objects(). Fixes: fe2140e2f57f ("cachefiles: Implement volume support") Suggested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-4-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: fix slab-use-after-free in fscache_withdraw_volume()Baokun Li
We got the following issue in our fault injection stress test: ================================================================== BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798 CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #565 Call Trace: kasan_check_range+0xf6/0x1b0 fscache_withdraw_volume+0x2e1/0x370 cachefiles_withdraw_volume+0x31/0x50 cachefiles_withdraw_cache+0x3ad/0x900 cachefiles_put_unbind_pincount+0x1f6/0x250 cachefiles_daemon_release+0x13b/0x290 __fput+0x204/0xa00 task_work_run+0x139/0x230 Allocated by task 5820: __kmalloc+0x1df/0x4b0 fscache_alloc_volume+0x70/0x600 __fscache_acquire_volume+0x1c/0x610 erofs_fscache_register_volume+0x96/0x1a0 erofs_fscache_register_fs+0x49a/0x690 erofs_fc_fill_super+0x6c0/0xcc0 vfs_get_super+0xa9/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] Freed by task 5820: kfree+0xf1/0x2c0 fscache_put_volume.part.0+0x5cb/0x9e0 erofs_fscache_unregister_fs+0x157/0x1b0 erofs_kill_sb+0xd9/0x1c0 deactivate_locked_super+0xa3/0x100 vfs_get_super+0x105/0x140 vfs_get_tree+0x8e/0x300 do_new_mount+0x28c/0x580 [...] ================================================================== Following is the process that triggers the issue: mount failed | daemon exit ------------------------------------------------------------ deactivate_locked_super cachefiles_daemon_release erofs_kill_sb erofs_fscache_unregister_fs fscache_relinquish_volume __fscache_relinquish_volume fscache_put_volume(fscache_volume, fscache_volume_put_relinquish) zero = __refcount_dec_and_test(&fscache_volume->ref, &ref); cachefiles_put_unbind_pincount cachefiles_daemon_unbind cachefiles_withdraw_cache cachefiles_withdraw_volumes list_del_init(&volume->cache_link) fscache_free_volume(fscache_volume) cache->ops->free_volume cachefiles_free_volume list_del_init(&cachefiles_volume->cache_link); kfree(fscache_volume) cachefiles_withdraw_volume fscache_withdraw_volume fscache_volume->n_accesses // fscache_volume UAF !!! The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function try to get its reference count. If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes. If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue. Fixes: fe2140e2f57f ("cachefiles: Implement volume support") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume()Baokun Li
Export fscache_put_volume() and add fscache_try_get_volume() helper function to allow cachefiles to get/put fscache_volume via linux/fscache-cache.h. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-2-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03fs: fix dentry sizeChristian Brauner
On CONFIG_SMP=y and on 32bit we need to decrease DNAME_INLINE_LEN to 36 btyes to end up with 128 bytes in total. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Links: https://lore.kernel.org/r/CAHk-=whtoqTSCcAvV-X-KPqoDWxS4vxmWpuKLB+Vv8=FtUd5vA@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03vfs: move d_lockref out of the area used by RCU lookupMateusz Guzik
Stock kernel scales worse than FreeBSD when doing a 20-way stat(2) on the same tmpfs-backed file. According to perf top: 38.09% lockref_put_return 26.08% lockref_get_not_dead 25.60% __d_lookup_rcu 0.89% clear_bhb_loop __d_lookup_rcu is participating in cacheline ping pong due to the embedded name sharing a cacheline with lockref. Moving it out resolves the problem: 41.50% lockref_put_return 41.03% lockref_get_not_dead 1.54% clear_bhb_loop benchmark (will-it-scale, Sapphire Rapids, tmpfs, ops/s): FreeBSD:7219334 before: 5038006 after: 7842883 (+55%) One minor remark: the 'after' result is unstable, fluctuating in the range ~7.8 mln to ~9 mln during different runs. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20240613001215.648829-3-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03net: dsa: microchip: lan937x: disable VPHY supportLucas Stach
As described by the microchip article "LAN937X - The required configuration for the external MAC port to operate at RGMII-to-RGMII 1Gbps link speed." [1]: "When VPHY is enabled, the auto-negotiation process following IEEE 802.3 standard will be triggered and will result in RGMII-to-RGMII signal failure on the interface because VPHY will try to poll the PHY status that is not available in the scenario of RGMII-to-RGMII connection (normally the link partner is usually an external processor). Note that when VPHY fails on accessing PHY registers, it will fall back to 100Mbps speed, it indicates disabling VPHY is optional if you only need the port to link at 100Mbps speed. Again, VPHY must and can only be disabled by writing VPHY_DISABLE bit in the register below as there is no strapping pin for the control." This patch was tested on LAN9372, so far it seems to not to affect VPHY based clock crossing optimization for the ports with integrated PHYs. [1]: https://microchip.my.site.com/s/article/LAN937X-The-required-configuration-for-the-external-MAC-port-to-operate-at-RGMII-to-RGMII-1Gbps-link-speed Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-03net: dsa: microchip: lan937x: disable in-band status support for RGMII ↵Lucas Stach
interfaces This driver do not support in-band mode and in case of CPU<->Switch link, this mode is not working any way. So, disable it otherwise ingress path of the switch MAC will stay disabled. Note: lan9372 manual do not document 0xN301 BIT(2) for the RGMII mode and recommend[1] to disable in-band link status update for the RGMII RX path by clearing 0xN302 BIT(0). But, 0xN301 BIT(2) seems to work too, so keep it unified with other KSZ switches. [1] https://microchip.my.site.com/s/article/LAN937X-The-required-configuration-for-the-external-MAC-port-to-operate-at-RGMII-to-RGMII-1Gbps-link-speed Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-03net: dsa: microchip: lan9371/2: add 100BaseTX PHY supportLucas Stach
On the LAN9371 and LAN9372, the 4th internal PHY is a 100BaseTX PHY instead of a 100BaseT1 PHY. The 100BaseTX PHYs have a different base register offset. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-03net: stmmac: enable HW-accelerated VLAN stripping for gmac4 onlyFurong Xu
Commit 750011e239a5 ("net: stmmac: Add support for HW-accelerated VLAN stripping") enables MAC level VLAN tag stripping for all MAC cores, but leaves set_hw_vlan_mode() and rx_hw_vlan() un-implemented for both gmac and xgmac. On gmac and xgmac, ethtool reports rx-vlan-offload is on, both MAC and driver do nothing about VLAN packets actually, although VLAN works well. Driver level stripping should be used on gmac and xgmac for now. Fixes: 750011e239a5 ("net: stmmac: Add support for HW-accelerated VLAN stripping") Signed-off-by: Furong Xu <0x1207@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-03drm/managed: Simplify if conditionThorsten Blum
The if condition !A || A && B can be simplified to !A || B. Fixes the following Coccinelle/coccicheck warning reported by excluded_middle.cocci: WARNING !A || A && B is equivalent to !A || B Compile-tested only. Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240701195607.228852-1-thorsten.blum@toblux.com
2024-07-03drm/fbdev-generic: Fix framebuffer on big endian devicesThomas Huth
Starting with kernel 6.7, the framebuffer text console is not working anymore with the virtio-gpu device on s390x hosts. Such big endian fb devices are usinga different pixel ordering than little endian devices, e.g. DRM_FORMAT_BGRX8888 instead of DRM_FORMAT_XRGB8888. This used to work fine as long as drm_client_buffer_addfb() was still calling drm_mode_addfb() which called drm_driver_legacy_fb_format() internally to get the right format. But drm_client_buffer_addfb() has recently been reworked to call drm_mode_addfb2() instead with the format value that has been passed to it as a parameter (see commit 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()"). That format parameter is determined in drm_fbdev_generic_helper_fb_probe() via the drm_mode_legacy_fb_format() function - which only generates formats suitable for little endian devices. So to fix this issue switch to drm_driver_legacy_fb_format() here instead to take the device endianness into consideration. Fixes: 6ae2ff23aa43 ("drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()") Closes: https://issues.redhat.com/browse/RHEL-45158 Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240627173530.460615-1-thuth@redhat.com
2024-07-03drm/panthor: Fix sync-only jobsBoris Brezillon
A sync-only job is meant to provide a synchronization point on a queue, so we can't return a NULL fence there, we have to add a signal operation to the command stream which executes after all other previously submitted jobs are done. v2: - Fixed a UAF bug - Added R-bs Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703071640.231278-3-boris.brezillon@collabora.com
2024-07-03drm/panthor: Don't check the array stride on empty uobj arraysBoris Brezillon
The user is likely to leave all the drm_panthor_obj_array fields to zero when the array is empty, which will cause an EINVAL failure. v2: - Added R-bs Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703071640.231278-2-boris.brezillon@collabora.com
2024-07-03drm/ast: Use drm_atomic_helper_commit_tail() helperThomas Zimmermann
Ast has no special requirements for runtime power management. So replace drm_atomic_helper_commit_tail_rpm() with the regular helper drm_atomic_helper_commit_tail(). Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-9-tzimmermann@suse.de
2024-07-03drm/ast: Inline ast_crtc_dpms() into callersThomas Zimmermann
The function ast_crtc_dpms() is left over from when the ast driver did not implement atomic modesetting. But DPMS is not supported by atomic modesetting and the helper is only called to enable or disable the CRTC sync pulses. Inline the function into its callers. To disable the CRTC, ast sets (AST_DPMS_VSYNC_OFF | AST_DPMS_HSYNC_OFF) in VGACRB6. Replace the constants with the correct register constants for VGACRB6. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-8-tzimmermann@suse.de
2024-07-03drm/ast: Only set VGA SCREEN_DISABLE bit in CRTC codeThomas Zimmermann
The SCREEN_DISABLE bit controls scanout from display memory. The bit affects all planes, so set it only in the CRTC's atomic enable and disable functions. A number of bugs affect this fix. First of all, ast_set_std_regs() tries to set VGASR1 except for the SD bit. But the read bitmask is invert, so it preserves anything except the SD bit. Fix this by re-inverting the read mask. The second issue is that primary-plane and CRTC helpers modify the SD bit. The bit controls scanout for all planes, primary and HW cursor, so set it only in the CRTC code. Further add a constant to represent the SD bit in VGASR1. Keep the plane's atomic_disable around to make the DRM framework happy. v2: - fix typos in commit message Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-7-tzimmermann@suse.de
2024-07-03drm/ast: Remove gamma LUT updates from DPMS codeThomas Zimmermann
The DPMS code, called from the CRTC's atomic_enable, rewrites the gamma LUT. This is already done by the CRTC's atomic_flush. Remove the duplication. v2: - fix a typo in commit message Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-6-tzimmermann@suse.de
2024-07-03drm/ast: Handle primary-plane format setup in atomic_updateThomas Zimmermann
Several color registers are programmed in the DPMS code of the CRTC's atomic_enable helper and the primary plane's atomic_update. It requires the color format and the display mode. Both code paths handle different cases: the DPMS's code will not be executed if the color format changes without a full mode switch. The plane's code only runs if the color format changes, but ignores display-mode changes. The color format is a property of the primary plane, so consolidate all color-format code in the plane's atomic_update. Remove it from the DPMS helper. v2: - clarify commit message (Jocelyn) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-5-tzimmermann@suse.de
2024-07-03drm/ast: Move mode-setting code into mode_set_nofb CRTC helperThomas Zimmermann
Do all mode setting in ast_crtc_helper_mode_set_nofb(), which always runs after disabling the CRTC and before programming the planes. Removes implicit synchronization between the CRTC's atomic disable, enable and the vertical retrace. Display-mode updates require HW cursors to be disabled. The HW cursor only picks up changes at vertical retrace periods. So the CRTC's atomic_disable helper waited for the retrace to delay any following mode-setting operations, which then happened in atomic_enable. See [1] for a description of the problem. With the CRTC helper callback mode_set_nofb, we can now synchronize and reprogram in the same place. As it always runs before the plane update, the plane code can be reordered with the CRTC's later atomic_enable et al. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/79914/ # 1 Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-4-tzimmermann@suse.de
2024-07-03drm/ast: Program mode for AST DP in atomic_mode_setThomas Zimmermann
The CRTC's atomic_flush function contains code to program the display mode to the AST DP chip. Move the code to the encoder's atomic_mode_set callback. The DRM atomic-modesetting code invoke this callback as part of the atomic commit. v2: - fix typos in commit message Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-3-tzimmermann@suse.de
2024-07-03drm/ast: Implement atomic enable/disable for encodersThomas Zimmermann
The CRTC helpers contain code to enable and disable DisplayPort connectors. Implement this functionality in the respective connector's atomic_enable/atomic_disable callbacks. DRM's atomic-modesetting helpers will call the functions as part of the atomic commit. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627153638.8765-2-tzimmermann@suse.de
2024-07-02objtool/x86: objtool can confuse memory and stack accessAlexandre Chartre
The encoding of an x86 instruction can include a ModR/M and a SIB (Scale-Index-Base) byte to describe the addressing mode of the instruction. objtool processes all addressing mode with a SIB base of 5 as having %rbp as the base register. However, a SIB base of 5 means that the effective address has either no base (if ModR/M mod is zero) or %rbp as the base (if ModR/M mod is 1 or 2). This can cause objtool to confuse an absolute address access with a stack operation. For example, objtool will see the following instruction: 4c 8b 24 25 e0 ff ff mov 0xffffffffffffffe0,%r12 as a stack operation (i.e. similar to: mov -0x20(%rbp), %r12). [Note that this kind of weird absolute address access is added by the compiler when using KASAN.] If this perceived stack operation happens to reference the location where %r12 was pushed on the stack then the objtool validation will think that %r12 is being restored and this can cause a stack state mismatch. This kind behavior was seen on xfs code, after a minor change (convert kmem_alloc() to kmalloc()): >> fs/xfs/xfs.o: warning: objtool: xfs_da_grow_inode_int+0x6c1: stack state mismatch: reg1[12]=-2-48 reg2[12]=-1+0 Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402220435.MGN0EV6l-lkp@intel.com/ Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lore.kernel.org/r/20240620144747.2524805-1-alexandre.chartre@oracle.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02objtool: Use "action" in error message to be consistent with helpSiddh Raman Pant
The help message mentions the main options as "actions", which is different from the optional "options". But the check error messages outputs "option" or "command" for referring to actions. Make the error messages consistent with help. Signed-off-by: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Check only two symbols when calculating symbol sizeBrian Johannesmeyer
Rather than looping through each symbol in a particular section to calculate a symbol's size, grep for the symbol and its immediate successor, and only use those two symbols. Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-8-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Remove call to addr2line from find_dir_prefix()Brian Johannesmeyer
Use the single long-running faddr2line process from find_dir_prefix(). Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-7-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Invoke addr2line as a single long-running processBrian Johannesmeyer
Rather than invoking a separate addr2line process for each address, invoke a single addr2line coprocess, and pass each address to its stdin. Previous work [0] applied a similar change to perf, leading to a ~60x speed-up [1]. If using an object file that is _not_ vmlinux, faddr2line passes a section name argument to addr2line. Because we do not know until runtime which section names will be passed to addr2line, we cannot apply this change to non-vmlinux object files. Hence, it only applies to vmlinux. [0] commit be8ecc57f180 ("perf srcline: Use long-running addr2line per DSO") [1] Link: https://eighty-twenty.org/2021/09/09/perf-addr2line-speed-improvement Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-6-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Pass --addresses argument to addr2lineBrian Johannesmeyer
In preparation for identifying an addr2line sentinel. See previous work [0], which applies a similar change to perf. [0] commit 8dc26b6f718a ("perf srcline: Make sentinel reading for binutils addr2line more robust") Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-5-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Check vmlinux only onceBrian Johannesmeyer
Rather than checking whether the object file is vmlinux for each invocation of __faddr2line, check it only once beforehand. Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-4-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Combine three readelf calls into oneBrian Johannesmeyer
Rather than calling readelf three separate times to collect three different types of info, call it only once, and parse out the different types of info from its output. Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-3-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-02scripts/faddr2line: Reduce number of readelf calls to threeBrian Johannesmeyer
Rather than calling readelf several times for each invocation of __faddr2line, call readelf only three times at the beginning, and save its result for future use. Signed-off-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/r/20240415145538.1938745-2-bjohannesmeyer@gmail.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2024-07-03power: supply: cros_charge-control: Avoid accessing attributes out of boundsNathan Chancellor
Clang warns (or errors with CONFIG_WERROR=y): drivers/power/supply/cros_charge-control.c:319:2: error: array index 3 is past the end of the array (that has type 'struct attribute *[3]') [-Werror,-Warray-bounds] 319 | priv->attributes[_CROS_CHCTL_ATTR_COUNT] = NULL; | ^ ~~~~~~~~~~~~~~~~~~~~~~ drivers/power/supply/cros_charge-control.c:49:2: note: array 'attributes' declared here 49 | struct attribute *attributes[_CROS_CHCTL_ATTR_COUNT]; | ^ 1 error generated. In earlier revisions of the driver, the attributes array in cros_chctl_priv had four elements with four distinct assignments but during review, the number of elements was changed to three through use of an enum and the assignments became a for loop, except for this one, which is now out of bounds. This assignment is no longer necessary because the size of the attributes array no longer accounts for it, so just remove it to clear up the warning. Fixes: c6ed48ef5259 ("power: supply: add ChromeOS EC based charge control driver") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240702-cros_charge-control-fix-clang-array-bounds-warning-v1-1-ae04d995cd1d@kernel.org Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
2024-07-03drm/exynos/vidi: convert to struct drm_edidJani Nikula
Prefer the struct drm_edid based functions for storing the EDID and updating the connector. It would be better if the vidi connection ioctl passed in the EDID size separately instead of relying on the extension count specified in the EDID, but that's what we have to rely on. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2024-07-03drm/exynos/vidi: simplify fake edid handlingJani Nikula
Avoid assigning fake_edid_info to ctx->raw_edid. Always keep ctx->raw_edid either an allocated pointer or NULL. Defer fake_edid_info handling to .get_modes(). This should be functionally equivalent but slightly easier to follow. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>