summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-14net: use a bounce buffer for copying skb->markEric Dumazet
syzbot found arm64 builds would crash in sock_recv_mark() when CONFIG_HARDENED_USERCOPY=y x86 and powerpc are not detecting the issue because they define user_access_begin. This will be handled in a different patch, because a check_object_size() is missing. Only data from skb->cb[] can be copied directly to/from user space, as explained in commit 79a8a642bf05 ("net: Whitelist the skbuff_head_cache "cb" field") syzbot report was: usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_head_cache' (offset 168, size 4)! ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:102 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 4410 Comm: syz-executor533 Not tainted 6.2.0-rc7-syzkaller-17907-g2d3827b3f393 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : usercopy_abort+0x90/0x94 mm/usercopy.c:90 lr : usercopy_abort+0x90/0x94 mm/usercopy.c:90 sp : ffff80000fb9b9a0 x29: ffff80000fb9b9b0 x28: ffff0000c6073400 x27: 0000000020001a00 x26: 0000000000000014 x25: ffff80000cf52000 x24: fffffc0000000000 x23: 05ffc00000000200 x22: fffffc000324bf80 x21: ffff0000c92fe1a8 x20: 0000000000000001 x19: 0000000000000004 x18: 0000000000000000 x17: 656a626f2042554c x16: ffff0000c6073dd0 x15: ffff80000dbd2118 x14: ffff0000c6073400 x13: 00000000ffffffff x12: ffff0000c6073400 x11: ff808000081bbb4c x10: 0000000000000000 x9 : 7b0572d7cc0ccf00 x8 : 7b0572d7cc0ccf00 x7 : ffff80000bf650d4 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000 x2 : ffff0001fefbff08 x1 : 0000000100000000 x0 : 000000000000006c Call trace: usercopy_abort+0x90/0x94 mm/usercopy.c:90 __check_heap_object+0xa8/0x100 mm/slub.c:4761 check_heap_object mm/usercopy.c:196 [inline] __check_object_size+0x208/0x6b8 mm/usercopy.c:251 check_object_size include/linux/thread_info.h:199 [inline] __copy_to_user include/linux/uaccess.h:115 [inline] put_cmsg+0x408/0x464 net/core/scm.c:238 sock_recv_mark net/socket.c:975 [inline] __sock_recv_cmsgs+0x1fc/0x248 net/socket.c:984 sock_recv_cmsgs include/net/sock.h:2728 [inline] packet_recvmsg+0x2d8/0x678 net/packet/af_packet.c:3482 ____sys_recvmsg+0x110/0x3a0 ___sys_recvmsg net/socket.c:2737 [inline] __sys_recvmsg+0x194/0x210 net/socket.c:2767 __do_sys_recvmsg net/socket.c:2777 [inline] __se_sys_recvmsg net/socket.c:2774 [inline] __arm64_sys_recvmsg+0x2c/0x3c net/socket.c:2774 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x64/0x178 arch/arm64/kernel/syscall.c:52 el0_svc_common+0xbc/0x180 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x48/0x110 arch/arm64/kernel/syscall.c:193 el0_svc+0x58/0x14c arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: 91388800 aa0903e1 f90003e8 94e6d752 (d4210000) Fixes: 6fd1d51cfa25 ("net: SO_RCVMARK socket option for SO_MARK with recvmsg()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20230213160059.3829741-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-14drm/vmwgfx: Do not drop the reference to the handle too soonZack Rusin
v3: Fix vmw_user_bo_lookup which was also dropping the gem reference before the kernel was done with buffer depending on userspace doing the right thing. Same bug, different spot. It is possible for userspace to predict the next buffer handle and to destroy the buffer while it's still used by the kernel. Delay dropping the internal reference on the buffers until kernel is done with them. Instead of immediately dropping the gem reference in vmw_user_bo_lookup and vmw_gem_object_create_with_handle let the callers decide when they're ready give the control back to userspace. Also fixes the second usage of vmw_gem_object_create_with_handle in vmwgfx_surface.c which wasn't grabbing an explicit reference to the gem object which could have been destroyed by the userspace on the owning surface at any point. Signed-off-by: Zack Rusin <zackr@vmware.com> Fixes: 8afa13a0583f ("drm/vmwgfx: Implement DRIVER_GEM") Reviewed-by: Martin Krastev <krastevm@vmware.com> Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230211050514.2431155-1-zack@kde.org (cherry picked from commit 9ef8d83e8e25d5f1811b3a38eb1484f85f64296c) Cc: <stable@vger.kernel.org> # v5.17+
2023-02-14drm/vmwgfx: Stop accessing buffer objects which failed initZack Rusin
ttm_bo_init_reserved on failure puts the buffer object back which causes it to be deleted, but kfree was still being called on the same buffer in vmw_bo_create leading to a double free. After the double free the vmw_gem_object_create_with_handle was setting the gem function objects before checking the return status of vmw_bo_create leading to null pointer access. Fix the entire path by relaying on ttm_bo_init_reserved to delete the buffer objects on failure and making sure the return status is checked before setting the gem function objects on the buffer object. Signed-off-by: Zack Rusin <zackr@vmware.com> Fixes: 8afa13a0583f ("drm/vmwgfx: Implement DRIVER_GEM") Reviewed-by: Maaz Mombasawala <mombasawalam@vmware.com> Reviewed-by: Martin Krastev <krastevm@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230208180050.2093426-1-zack@kde.org (cherry picked from commit 36d421e632e9a0e8375eaed0143551a34d81a7e3) Cc: <stable@vger.kernel.org> # v5.17+
2023-02-15erofs: unify anonymous inodes for blobJingbo Xu
Currently there're two anonymous inodes (inode and anon_inode in struct erofs_fscache) for each blob. The former was introduced as the address_space of page cache for bootstrap. The latter was initially introduced as both the address_space of page cache and also a sentinel in the shared domain. Since now the management of cookies in share domain has been decoupled with the anonymous inode, there's no need to maintain an extra anonymous inode. Let's unify these two anonymous inodes. Besides, in non-share-domain mode only bootstrap will allocate anonymous inode. To simplify the implementation, always allocate anonymous inode for both bootstrap and data blobs. Similarly release anonymous inodes for data blobs when .put_super() is called, or we'll get "VFS: Busy inodes after unmount." warning. Also remove the redundant set_nlink() when initializing the anonymous inode, since i_nlink has already been initialized to 1 when the inode gets allocated. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20230209063913.46341-5-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: relinquish volume with mutex heldJingbo Xu
Relinquish fscache volume with mutex held. Otherwise if a new domain is registered when the old domain with the same name gets removed from the list but not relinquished yet, fscache may complain the collision. Fixes: 8b7adf1dff3d ("erofs: introduce fscache-based domain") Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20230209063913.46341-4-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: maintain cookies of share domain in self-contained listJingbo Xu
We'd better not touch sb->s_inodes list and inode->i_count directly. Let's maintain cookies of share domain in a self-contained list in erofs. Besides, relinquish cookie with the mutex held. Otherwise if a cookie is registered when the old cookie with the same name in the same domain has been removed from the list but not relinquished yet, fscache may complain "Duplicate cookie detected". Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20230209063913.46341-3-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: remove unused device mapping in meta routineJingbo Xu
Currently metadata is always on bootstrap, and thus device mapping is not needed so far. Remove the redundant device mapping in the meta routine. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209063913.46341-2-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15MAINTAINERS: erofs: Add Documentation/ABI/testing/sysfs-fs-erofsYangtao Li
Add this doc to the erofs maintainers entry. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209052013.34952-1-frank.li@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15Documentation/ABI: sysfs-fs-erofs: update supported featuresYue Hu
Add missing feaures for sysfs-fs-erofs feature doc. Signed-off-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209051128.10571-1-zbestahu@gmail.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: remove unused EROFS_GET_BLOCKS_RAW flagJingbo Xu
For erofs_map_blocks() and erofs_map_blocks_flatmode(), the flags argument is always EROFS_GET_BLOCKS_RAW. Thus remove the unused flags parameter for these two functions. Besides EROFS_GET_BLOCKS_RAW is originally introduced for reading compressed (raw) data for compressed files. However it's never used actually and let's remove it now. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209024825.17335-2-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: update print symbols for various flags in traceJingbo Xu
As new flags introduced, the corresponding print symbols for trace are not added accordingly. Add these missing print symbols for these flags. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209024825.17335-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: make kobj_type structures constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230209-kobj_type-erofs-v1-1-078c945e2c4b@weissschuh.net Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: add per-cpu threads for decompression as an optionSandeep Dhavale
Using per-cpu thread pool we can reduce the scheduling latency compared to workqueue implementation. With this patch scheduling latency and variation is reduced as per-cpu threads are high priority kthread_workers. The results were evaluated on arm64 Android devices running 5.10 kernel. The table below shows resulting improvements of total scheduling latency for the same app launch benchmark runs with 50 iterations. Scheduling latency is the latency between when the task (workqueue kworker vs kthread_worker) became eligible to run to when it actually started running. +-------------------------+-----------+----------------+---------+ | | workqueue | kthread_worker | diff | +-------------------------+-----------+----------------+---------+ | Average (us) | 15253 | 2914 | -80.89% | | Median (us) | 14001 | 2912 | -79.20% | | Minimum (us) | 3117 | 1027 | -67.05% | | Maximum (us) | 30170 | 3805 | -87.39% | | Standard deviation (us) | 7166 | 359 | | +-------------------------+-----------+----------------+---------+ Background: Boot times and cold app launch benchmarks are very important to the Android ecosystem as they directly translate to responsiveness from user point of view. While EROFS provides a lot of important features like space savings, we saw some performance penalty in cold app launch benchmarks in few scenarios. Analysis showed that the significant variance was coming from the scheduling cost while decompression cost was more or less the same. Having per-cpu thread pool we can see from the above table that this variation is reduced by ~80% on average. This problem was discussed at LPC 2022. Link to LPC 2022 slides and talk at [1] [1] https://lpc.events/event/16/contributions/1338/ [ Gao Xiang: At least, we have to add this until WQ_UNBOUND workqueue issue [2] on many arm64 devices is resolved. ] [2] https://lore.kernel.org/r/CAJkfWY490-m6wNubkxiTPsW59sfsQs37Wey279LmiRxKt7aQYg@mail.gmail.com Signed-off-by: Sandeep Dhavale <dhavale@google.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230208093322.75816-1-hsiangkao@linux.alibaba.com
2023-02-15erofs: tidy up internal.hGao Xiang
Reorder internal.h code so that removing unneeded macros and more. No logic changes. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-6-hsiangkao@linux.alibaba.com
2023-02-15erofs: get rid of z_erofs_do_map_blocks() forward declarationGao Xiang
The code can be neater without forward declarations. Let's get rid of z_erofs_do_map_blocks() forward declaration. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-5-hsiangkao@linux.alibaba.com
2023-02-15erofs: move zdata.h into zdata.cGao Xiang
Definitions in zdata.h are only used in zdata.c and for internal use only. No logic changes. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-4-hsiangkao@linux.alibaba.com
2023-02-15erofs: remove tagged pointer helpersGao Xiang
Just open-code the remaining one to simplify the code. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-3-hsiangkao@linux.alibaba.com
2023-02-15erofs: avoid tagged pointers to mark sync decompressionGao Xiang
We could just use a boolean in z_erofs_decompressqueue for sync decompression to simplify the code. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-2-hsiangkao@linux.alibaba.com
2023-02-15erofs: get rid of erofs_inode_datablocks()Gao Xiang
erofs_inode_datablocks() has the only one caller, let's just get rid of it entirely. No logic changes. Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230204093040.97967-1-hsiangkao@linux.alibaba.com
2023-02-15erofs: simplify iloc()Gao Xiang
Actually we could pass in inodes directly to clean up all callers. Also rename iloc() as erofs_iloc(). Link: https://lore.kernel.org/r/20230114150823.432069-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: get rid of debug_one_dentry()Gao Xiang
Since erofsdump is available, no need to keep this debugging functionality at all. Also drop a useless comment since it's the VFS behavior. Link: https://lore.kernel.org/r/20230114125746.399253-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: remove linux/buffer_head.h dependencyGao Xiang
EROFS actually never uses buffer heads, therefore just get rid of BH_xxx definitions and linux/buffer_head.h inclusive. Link: https://lore.kernel.org/r/20230113065226.68801-2-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15erofs: clean up erofs_iget()Gao Xiang
Move inode hash function into inode.c and simplify erofs_iget(). Link: https://lore.kernel.org/r/20230113065226.68801-1-hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-14x86/hotplug: Remove incorrect comment about mwait_play_dead()Srivatsa S. Bhat (VMware)
The comment that says mwait_play_dead() returns only on failure is a bit misleading because mwait_play_dead() could actually return for valid reasons (such as mwait not being supported by the platform) that do not indicate a failure of the CPU offline operation. So, remove the comment. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230128003751.141317-1-srivatsa@csail.mit.edu
2023-02-14Revert "blk-cgroup: pin the gendisk in struct blkcg_gq"Christoph Hellwig
This reverts commit 84d7d462b16dd5f0bf7c7ca9254bf81db2c952a2. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230214183308.1658775-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-14Revert "blk-cgroup: pass a gendisk to blkg_lookup"Christoph Hellwig
This reverts commit 821e840c08ad83736eced4037cdad864e95e2584. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230214183308.1658775-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-14Revert "blk-cgroup: delay blk-cgroup initialization until add_disk"Christoph Hellwig
This reverts commit 178fa7d49815ea8001f43ade37a22072829fd8ab. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230214183308.1658775-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-14Revert "blk-cgroup: delay calling blkcg_exit_disk until disk_release"Christoph Hellwig
This reverts commit c43332fe028c252a2a28e46be70a530f64fc3c9d as it is not needed without moving to disk references in the blkg. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230214183308.1658775-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-14Revert "blk-cgroup: move the cgroup information to struct gendisk"Christoph Hellwig
This reverts commit 3f13ab7c80fdb0ada86a8e3e818960bc1ccbaa59 as a patch it depends on caused a few problems. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230214183308.1658775-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-14drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT listMatt Roper
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround has 'BUS' style reset, indicating that it does not lose its value on engine resets. Furthermore, this register is part of the GT forcewake domain rather than the RENDER domain, so it should not be impacted by RCS engine resets. As such, we should implement this on the GT workaround list rather than an engine list. Bspec: 19219 Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthew.d.roper@intel.com (cherry picked from commit 5f21dc07b52eb54a908e66f5d6e05a87bcb5b049) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-02-14ixgbe: add double of VLAN header when computing the max MTUJason Xing
Include the second VLAN HLEN into account when computing the maximum MTU size as other drivers do. Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-14i40e: add double of VLAN header when computing the max MTUJason Xing
Include the second VLAN HLEN into account when computing the maximum MTU size as other drivers do. Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-14ixgbe: allow to increase MTU to 3K with XDP enabledJason Xing
Recently I encountered one case where I cannot increase the MTU size directly from 1500 to a much bigger value with XDP enabled if the server is equipped with IXGBE card, which happened on thousands of servers in production environment. After applying the current patch, we can set the maximum MTU size to 3K. This patch follows the behavior of changing MTU as i40e/ice does. References: [1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP") [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-14Merge tag 'pm-6.2-rc9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Add a missing NULL pointer check to the cpufreq drver for Qualcomm platforms (Manivannan Sadhasivam)" * tag 'pm-6.2-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: qcom-hw: Add missing null pointer check
2023-02-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "Certain AMD processors are vulnerable to a cross-thread return address predictions bug. When running in SMT mode and one of the sibling threads transitions out of C0 state, the other thread gets access to twice as many entries in the RSB, but unfortunately the predictions of the now-halted logical processor are not purged. Therefore, the executing processor could speculatively execute from locations that the now-halted processor had trained the RSB on. The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB when context switching to the idle thread. However, KVM allows a VMM to prevent exiting guest mode when transitioning out of C0 using the KVM_CAP_X86_DISABLE_EXITS capability can be used by a VMM to change this behavior. To mitigate the cross-thread return address predictions bug, a VMM must not be allowed to override the default behavior to intercept C0 transitions. These patches introduce a KVM module parameter that, if set, will prevent the user from disabling the HLT, MWAIT and CSTATE exits" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions KVM: x86: Mitigate the cross-thread return address predictions bug x86/speculation: Identify processors vulnerable to SMT RSB predictions
2023-02-14EDAC/amd64: Shut up an -Werror,-Wsometimes-uninitialized clang false positiveYazen Ghannam
Reportedly, clang cannot do interprocedural analysis: https://lore.kernel.org/r/20230213-amd64_edac-wsometimes-uninitialized-v1-1-5bde32b89e02@kernel.org and see that those arguments won't be used uninitialized. So, yeah, the code's fine even without this. Normally, such a "fix" won't be applied but that warning gets automatically enabled in -Wall builds and when CONFIG_WERROR is set in allmodconfig builds, the build fails. So shut it up with a minimal fix as this code will see more reorganization very soon. [ bp: Write commit message. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/Y%2BqdVHidnrrKvxiD@dev-arch.thelio-3990X
2023-02-14ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP LaptopsAndy Chi
On HP Laptops, requires the ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED quirk to make its audio LEDs and speaker work. Signed-off-by: Andy Chi <andy.chi@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230214140432.39654-1-andy.chi@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-14cpufreq: qcom-hw: Add missing null pointer checkManivannan Sadhasivam
of_device_get_match_data() may return NULL, so add a check to prevent potential null pointer dereference. Issue reported by Qualcomm's internal static analysis tool. Fixes: 4f7961706c63 ("cpufreq: qcom-hw: Move soc_data to struct qcom_cpufreq") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-14ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.Andy Chi
There is a HP platform needs ALC236_FIXUP_HP_GPIO_LED quirk to make mic-mute/audio-mute working. Signed-off-by: Andy Chi <andy.chi@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230214035853.31217-1-andy.chi@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-14alarmtimer: Prevent starvation by small intervals and SIG_IGNThomas Gleixner
syzbot reported a RCU stall which is caused by setting up an alarmtimer with a very small interval and ignoring the signal. The reproducer arms the alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a problem per se, but that's an issue when the signal is ignored because then the timer is immediately rearmed because there is no way to delay that rearming to the signal delivery path. See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent softirq starvation by small intervals and SIG_IGN") for details. The reproducer does not set SIG_IGN explicitely, but it sets up the timers signal with SIGCONT. That has the same effect as explicitely setting SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and the task is not ptraced. The log clearly shows that: [pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} --- It works because the tasks are traced and therefore the signal is queued so the tracer can see it, which delays the restart of the timer to the signal delivery path. But then the tracer is killed: [pid 5087] kill(-5102, SIGKILL <unfinished ...> ... ./strace-static-x86_64: Process 5107 detached and after it's gone the stall can be observed: syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns [ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: ... [ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran: [ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0: [ 184.669821][ C0] NMI backtrace for cpu 0 [ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0 ... [ 184.670036][ C0] Call Trace: [ 184.670041][ C0] <IRQ> [ 184.670045][ C0] alarmtimer_fired+0x327/0x670 posix_timer_fn() prevents that by checking whether the interval for timers which have the signal ignored is smaller than a jiffie and artifically delay it by shifting the next expiry out by a jiffie. That's accurate vs. the overrun accounting, but slightly inaccurate vs. timer_gettimer(2). The comment in that function says what needs to be done and there was a fix available for the regular userspace induced SIG_IGN mechanism, but that did not work due to the implicit ignore for SIGCONT and similar signals. This needs to be worked on, but for now the only available workaround is to do exactly what posix_timer_fn() does: Increase the interval of self-rearming timers, which have their signal ignored, to at least a jiffie. Interestingly this has been fixed before via commit ff86bf0c65f1 ("alarmtimer: Rate limit periodic intervals") already, but that fix got lost in a later rework. Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <jstultz@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
2023-02-14x86/mtrr: Revert 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR ↵Juergen Gross
disabled case") Commit 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case") broke the use case of running Xen dom0 kernels on machines with an external disk enclosure attached via USB, see Link tag. What this commit was originally fixing - SEV-SNP guests on Hyper-V - is a more specialized situation which has other issues at the moment anyway so reverting this now and addressing the issue properly later is the prudent thing to do. So revert it in time for the 6.2 proper release. [ bp: Rewrite commit message. ] Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/4fe9541e-4d4c-2b2a-f8c8-2d34a7284930@nerdbynature.de
2023-02-14net: stmmac: Restrict warning on disabling DMA store and fwd modeCristian Ciocaltea
When setting 'snps,force_thresh_dma_mode' DT property, the following warning is always emitted, regardless the status of force_sf_dma_mode: dwmac-starfive 10020000.ethernet: force_sf_dma_mode is ignored if force_thresh_dma_mode is set. Do not print the rather misleading message when DMA store and forward mode is already disabled. Fixes: e2a240c7d3bc ("driver:net:stmmac: Disable DMA store and forward mode if platform data force_thresh_dma_mode is set.") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20230210202126.877548-1-cristian.ciocaltea@collabora.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-14nvme-pci: remove iod use_sglsKeith Busch
It's not used anywhere anymore, so remove it. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-14nvme-pci: fix freeing single sglKeith Busch
There may only be a single DMA mapped entry from multiple physical segments, which means we don't allocate a separte SGL list. Check the number of allocations prior to know if we need to free something. Freeing a single list allocation is the same for both PRP and SGL usages, so we don't need to check the use_sgl flag anymore. Fixes: 01df742d8c5c0 ("nvme-pci: remove SGL segment descriptors") Reported-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
2023-02-14nvme-pci: always return an ERR_PTR from nvme_pci_alloc_devIrvin Cote
Don't mix NULL and ERR_PTR returns. Fixes: 2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper") Signed-off-by: Irvin Cote <irvin.cote@insa-lyon.fr> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-14nvme-pci: set the DMA mask earlierChristoph Hellwig
Set the DMA mask before calling dma_addressing_limited, which depends on it. Note that this stop checking the return value of dma_set_mask_and_coherent as this function can only fail for masks < 32-bit. Fixes: 3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl") Reported-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Michael Kelley <mikelley@microsoft.com>
2023-02-13net/sched: act_ctinfo: use percpu statsPedro Tammela
The tc action act_ctinfo was using shared stats, fix it to use percpu stats since bstats_update() must be called with locks or with a percpu pointer argument. tdc results: 1..12 ok 1 c826 - Add ctinfo action with default setting ok 2 0286 - Add ctinfo action with dscp ok 3 4938 - Add ctinfo action with valid cpmark and zone ok 4 7593 - Add ctinfo action with drop control ok 5 2961 - Replace ctinfo action zone and action control ok 6 e567 - Delete ctinfo action with valid index ok 7 6a91 - Delete ctinfo action with invalid index ok 8 5232 - List ctinfo actions ok 9 7702 - Flush ctinfo actions ok 10 3201 - Add ctinfo action with duplicate index ok 11 8295 - Add ctinfo action with invalid index ok 12 3964 - Replace ctinfo action with invalid goto_chain control Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20230210200824.444856-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-13net: stmmac: fix order of dwmac5 FlexPPS parametrization sequenceJohannes Zink
So far changing the period by just setting new period values while running did not work. The order as indicated by the publicly available reference manual of the i.MX8MP [1] indicates a sequence: * initiate the programming sequence * set the values for PPS period and start time * start the pulse train generation. This is currently not used in dwmac5_flex_pps_config(), which instead does: * initiate the programming sequence and immediately start the pulse train generation * set the values for PPS period and start time This caused the period values written not to take effect until the FlexPPS output was disabled and re-enabled again. This patch fix the order and allows the period to be set immediately. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPRM Fixes: 9a8a02c9d46d ("net: stmmac: Add Flexible PPS support") Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230210143937.3427483-1-j.zink@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-14ata: pata_octeon_cf: drop kernel-doc notationRandy Dunlap
Fix a slew of kernel-doc warnings in pata_octeon_cf.c by changing all "/**" comments to "/*" since they are not in kernel-doc format. Fixes: 3c929c6f5aa7 ("libata: New driver for OCTEON SOC Compact Flash interface (v7).") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/all/202302101722.5O56RClE-lkp@intel.com/ Cc: David Daney <ddaney@caviumnetworks.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-ide@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-02-14ata: ahci: Add Tiger Lake UP{3,4} AHCI controllerSimon Gaiser
Mark the Tiger Lake UP{3,4} AHCI controller as "low_power". This enables S0ix to work out of the box. Otherwise this isn't working unless the user manually sets /sys/class/scsi_host/*/link_power_management_policy. Intel lists a total of 4 SATA controller IDs in [1] for those mobile PCHs. This commit just adds the "AHCI" variant since I only tested those. [1]: https://cdrdv2.intel.com/v1/dl/getContent/631119 Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>