summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-02-26md: return directly before setting did_set_md_closingLi Nan
There is nothing to do at 'out' before setting 'did_set_md_closing' in md_ioctl(). Return directly, and it will help us to remove 'did_set_md_closing' later. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-5-linan666@huaweicloud.com
2024-02-26md: clean up invalid BUG_ON in md_ioctlLi Nan
'disk->private_data' is set to mddev in md_alloc() and never set to NULL, and users need to open mddev before submitting ioctl. So mddev must not have been freed during ioctl, and there is no need to check mddev here. Clean up it. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-4-linan666@huaweicloud.com
2024-02-26md: changed the switch of RAID_VERSION to ifLi Nan
There is only one case of this 'switch'. Change it to 'if'. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-3-linan666@huaweicloud.com
2024-02-26md: merge the check of capabilities into md_ioctl_valid()Li Nan
There is no functional change. Just to make code cleaner. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-2-linan666@huaweicloud.com
2024-02-26ceph: switch to corrected encoding of max_xattr_size in mdsmapXiubo Li
The addition of bal_rank_mask with encoding version 17 was merged into ceph.git in Oct 2022 and made it into v18.2.0 release normally. A few months later, the much delayed addition of max_xattr_size got merged, also with encoding version 17, placed before bal_rank_mask in the encoding -- but it didn't make v18.2.0 release. The way this ended up being resolved on the MDS side is that bal_rank_mask will continue to be encoded in version 17 while max_xattr_size is now encoded in version 18. This does mean that older kernels will misdecode version 17, but this is also true for v18.2.0 and v18.2.1 clients in userspace. The best we can do is backport this adjustment -- see ceph.git commit 78abfeaff27fee343fb664db633de5b221699a73 for details. [ idryomov: changelog ] Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/64440 Fixes: d93231a6bc8a ("ceph: prevent a client from exceeding the MDS maximum xattr size") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@ibm.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-26fs/ntfs3: fix build without CONFIG_NTFS3_LZX_XPRESSMark O'Donovan
When CONFIG_NTFS3_LZX_XPRESS is not set then we get the following build error: fs/ntfs3/frecord.c:2460:16: error: unused variable ‘i_size’ Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Fixes: 4fd6c08a16d7 ("fs/ntfs3: Use i_size_read and i_size_write") Tested-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-26landlock: Fix asymmetric private inodes referringMickaël Salaün
When linking or renaming a file, if only one of the source or destination directory is backed by an S_PRIVATE inode, then the related set of layer masks would be used as uninitialized by is_access_to_paths_allowed(). This would result to indeterministic access for one side instead of always being allowed. This bug could only be triggered with a mounted filesystem containing both S_PRIVATE and !S_PRIVATE inodes, which doesn't seem possible. The collect_domain_accesses() calls return early if is_nouser_or_private() returns false, which means that the directory's superblock has SB_NOUSER or its inode has S_PRIVATE. Because rename or link actions are only allowed on the same mounted filesystem, the superblock is always the same for both source and destination directories. However, it might be possible in theory to have an S_PRIVATE parent source inode with an !S_PRIVATE parent destination inode, or vice versa. To make sure this case is not an issue, explicitly initialized both set of layer masks to 0, which means to allow all actions on the related side. If at least on side has !S_PRIVATE, then collect_domain_accesses() and is_access_to_paths_allowed() check for the required access rights. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Jann Horn <jannh@google.com> Cc: Shervin Oloumi <enlightened@chromium.org> Cc: stable@vger.kernel.org Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Link: https://lore.kernel.org/r/20240219190345.2928627-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2024-02-26x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registersPaolo Bonzini
MKTME repurposes the high bit of physical address to key id for encryption key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same, the valid bits in the MTRR mask register are based on the reduced number of physical address bits. detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts it from the total usable physical bits, but it is called too late. Move the call to early_init_intel() so that it is called in setup_arch(), before MTRRs are setup. This fixes boot on TDX-enabled systems, which until now only worked with "disable_mtrr_cleanup". Without the patch, the values written to the MTRRs mask registers were 52-bit wide (e.g. 0x000fffff_80000800) and the writes failed; with the patch, the values are 46-bit wide, which matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo. Reported-by: Zixi Chen <zixchen@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240131230902.1867092-3-pbonzini%40redhat.com
2024-02-26x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()Paolo Bonzini
In commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach"), the initialization of c->x86_phys_bits was moved after this_cpu->c_early_init(c). This is incorrect because early_init_amd() expected to be able to reduce the value according to the contents of CPUID leaf 0x8000001f. Fortunately, the bug was negated by init_amd()'s call to early_init_amd(), which does reduce x86_phys_bits in the end. However, this is very late in the boot process and, most notably, the wrong value is used for x86_phys_bits when setting up MTRRs. To fix this, call get_cpu_address_sizes() as soon as X86_FEATURE_CPUID is set/cleared, and c->extended_cpuid_level is retrieved. Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240131230902.1867092-2-pbonzini%40redhat.com
2024-02-26fbcon: always restore the old font data in fbcon_do_set_font()Jiri Slaby (SUSE)
Commit a5a923038d70 (fbdev: fbcon: Properly revert changes when vc_resize() failed) started restoring old font data upon failure (of vc_resize()). But it performs so only for user fonts. It means that the "system"/internal fonts are not restored at all. So in result, the very first call to fbcon_do_set_font() performs no restore at all upon failing vc_resize(). This can be reproduced by Syzkaller to crash the system on the next invocation of font_get(). It's rather hard to hit the allocation failure in vc_resize() on the first font_set(), but not impossible. Esp. if fault injection is used to aid the execution/failure. It was demonstrated by Sirius: BUG: unable to handle page fault for address: fffffffffffffff8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD cb7b067 P4D cb7b067 PUD cb7d067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8007 Comm: poc Not tainted 6.7.0-g9d1694dc91ce #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:fbcon_get_font+0x229/0x800 drivers/video/fbdev/core/fbcon.c:2286 Call Trace: <TASK> con_font_get drivers/tty/vt/vt.c:4558 [inline] con_font_op+0x1fc/0xf20 drivers/tty/vt/vt.c:4673 vt_k_ioctl drivers/tty/vt/vt_ioctl.c:474 [inline] vt_ioctl+0x632/0x2ec0 drivers/tty/vt/vt_ioctl.c:752 tty_ioctl+0x6f8/0x1570 drivers/tty/tty_io.c:2803 vfs_ioctl fs/ioctl.c:51 [inline] ... So restore the font data in any case, not only for user fonts. Note the later 'if' is now protected by 'old_userfont' and not 'old_data' as the latter is always set now. (And it is supposed to be non-NULL. Otherwise we would see the bug above again.) Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Fixes: a5a923038d70 ("fbdev: fbcon: Properly revert changes when vc_resize() failed") Reported-and-tested-by: Ubisectech Sirius <bugreport@ubisectech.com> Cc: Ubisectech Sirius <bugreport@ubisectech.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Helge Deller <deller@gmx.de> Cc: linux-fbdev@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240208114411.14604-1-jirislaby@kernel.org
2024-02-26ASoC: amd: yc: Add Lenovo ThinkBook 21J0 into DMI quirk tableJohnny Hsieh
This patch adds Lenovo 21J0 (ThinkBook 16 G5+ ARP) to the DMI quirks table to enable internal microphone array. Cc: linux-sound@vger.kernel.org Signed-off-by: Johnny Hsieh <mnixry@outlook.com> Link: https://msgid.link/r/TYSPR04MB8429D62DFDB6727866ECF1DEC55A2@TYSPR04MB8429.apcprd04.prod.outlook.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-02-26drm/ttm/tests: depend on UML || COMPILE_TESTChristian König
At least the device test requires that no other driver using TTM is loaded. So make those unit tests depend on UML || COMPILE_TEST to prevent people from trying them on bare metal. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/all/20240219230116.77b8ad68@yea/
2024-02-26Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard
Sima needs a more recent release to apply a patch. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-02-26drm/tegra: Remove existing framebuffer only if we support displayThierry Reding
Tegra DRM doesn't support display on Tegra234 and later, so make sure not to remove any existing framebuffers in that case. v2: - add comments explaining how this situation can come about - clear DRIVER_MODESET and DRIVER_ATOMIC feature bits Fixes: 6848c291a54f ("drm/aperture: Convert drivers to aperture interfaces") Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Robert Foss <rfoss@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240223150333.1401582-1-thierry.reding@gmail.com
2024-02-26ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()Eric Dumazet
It seems that if userspace provides a correct IFA_TARGET_NETNSID value but no IFA_ADDRESS and IFA_LOCAL attributes, inet6_rtm_getaddr() returns -EINVAL with an elevated "struct net" refcount. Fixes: 6ecf4c37eb3e ("ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26selftests: net: veth: test syncing GRO and XDP state while device is downJakub Kicinski
Test that we keep GRO flag in sync when XDP is disabled while the device is closed. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26net: veth: clear GRO when clearing XDP even when downJakub Kicinski
veth sets NETIF_F_GRO automatically when XDP is enabled, because both features use the same NAPI machinery. The logic to clear NETIF_F_GRO sits in veth_disable_xdp() which is called both on ndo_stop and when XDP is turned off. To avoid the flag from being cleared when the device is brought down, the clearing is skipped when IFF_UP is not set. Bringing the device down should indeed not modify its features. Unfortunately, this means that clearing is also skipped when XDP is disabled _while_ the device is down. And there's nothing on the open path to bring the device features back into sync. IOW if user enables XDP, disables it and then brings the device up we'll end up with a stray GRO flag set but no NAPI instances. We don't depend on the GRO flag on the datapath, so the datapath won't crash. We will crash (or hang), however, next time features are sync'ed (either by user via ethtool or peer changing its config). The GRO flag will go away, and veth will try to disable the NAPIs. But the open path never created them since XDP was off, the GRO flag was a stray. If NAPI was initialized before we'll hang in napi_disable(). If it never was we'll crash trying to stop uninitialized hrtimer. Move the GRO flag updates to the XDP enable / disable paths, instead of mixing them with the ndo_open / ndo_close paths. Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP") Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: syzbot+039399a9b96297ddedca@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26xfrm: Avoid clang fortify warning in copy_to_user_tmpl()Nathan Chancellor
After a couple recent changes in LLVM, there is a warning (or error with CONFIG_WERROR=y or W=e) from the compile time fortify source routines, specifically the memset() in copy_to_user_tmpl(). In file included from net/xfrm/xfrm_user.c:14: ... include/linux/fortify-string.h:438:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] 438 | __write_overflow_field(p_size_field, size); | ^ 1 error generated. While ->xfrm_nr has been validated against XFRM_MAX_DEPTH when its value is first assigned in copy_templates() by calling validate_tmpl() first (so there should not be any issue in practice), LLVM/clang cannot really deduce that across the boundaries of these functions. Without that knowledge, it cannot assume that the loop stops before i is greater than XFRM_MAX_DEPTH, which would indeed result a stack buffer overflow in the memset(). To make the bounds of ->xfrm_nr clear to the compiler and add additional defense in case copy_to_user_tmpl() is ever used in a path where ->xfrm_nr has not been properly validated against XFRM_MAX_DEPTH first, add an explicit bound check and early return, which clears up the warning. Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1985 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-02-26drm/i915/hdcp: Remove additional timing for reading mst hdcp messageSuraj Kandpal
Now that we have moved back to direct reads the additional timing is not required hence this can be removed. --v2 -Add Fixes tag [Ankit] Fixes: 3974f9c17bb9 ("drm/i915/hdcp: Adjust timeout for read in DPMST Scenario") Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223081453.1576918-10-suraj.kandpal@intel.com (cherry picked from commit 429ccbd1c39baefc6114b482ae98c188f007afcd) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-02-26drm/i915/hdcp: Move to direct reads for HDCPSuraj Kandpal
Even for MST scenarios we need to do direct reads only on the immediate downstream device the rest of the authentication is taken care by that device. Remote reads will only be used to check capability of the monitors in MST topology. --v2 -Add fixes tag [Ankit] -Derive aux where needed rather than through a function [Ankit] Fixes: ae4f902bb344 ("drm/i915/hdcp: Send the correct aux for DPMST HDCP scenario") Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223081453.1576918-3-suraj.kandpal@intel.com (cherry picked from commit 287c0de8b29489cdb20957980ca08c33ae4a67b9) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-02-25Linux 6.8-rc6v6.8-rc6Linus Torvalds
2024-02-25Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "Some more mostly boring fixes, but some not User reported ones: - the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty performance bug; user reported an untar initially taking two seconds and then ~2 minutes - kill a __GFP_NOFAIL in the buffered read path; this was a leftover from the trickier fix to kill __GFP_NOFAIL in readahead, where we can't return errors (and have to silently truncate the read ourselves). bcachefs can't use GFP_NOFAIL for folio state unlike iomap based filesystems because our folio state is just barely too big, 2MB hugepages cause us to exceed the 2 page threshhold for GFP_NOFAIL. additionally, the flags argument was just buggy, we weren't supplying GFP_KERNEL previously (!)" * tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs: bcachefs: fix bch2_save_backtrace() bcachefs: Fix check_snapshot() memcpy bcachefs: Fix bch2_journal_flush_device_pins() bcachefs: fix iov_iter count underflow on sub-block dio read bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree bcachefs: Kill __GFP_NOFAIL in buffered read path bcachefs: fix backpointer_to_text() when dev does not exist
2024-02-25rcu-tasks: Maintain real-time response in rcu_tasks_postscan()Paul E. McKenney
The current code will scan the entirety of each per-CPU list of exiting tasks in ->rtp_exit_list with interrupts disabled. This is normally just fine, because each CPU typically won't have very many tasks in this state. However, if a large number of tasks block late in do_exit(), these lists could be arbitrarily long. Low probability, perhaps, but it really could happen. This commit therefore occasionally re-enables interrupts while traversing these lists, inserting a dummy element to hold the current place in the list. In kernels built with CONFIG_PREEMPT_RT=y, this re-enabling happens after each list element is processed, otherwise every one-to-two jiffies. [ paulmck: Apply Frederic Weisbecker feedback. ] Link: https://lore.kernel.org/all/ZdeI_-RfdLR8jlsm@localhost.localdomain/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasksPaul E. McKenney
Holding a mutex across synchronize_rcu_tasks() and acquiring that same mutex in code called from do_exit() after its call to exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop() results in deadlock. This is by design, because tasks that are far enough into do_exit() are no longer present on the tasks list, making it a bit difficult for RCU Tasks to find them, let alone wait on them to do a voluntary context switch. However, such deadlocks are becoming more frequent. In addition, lockdep currently does not detect such deadlocks and they can be difficult to reproduce. In addition, if a task voluntarily context switches during that time (for example, if it blocks acquiring a mutex), then this task is in an RCU Tasks quiescent state. And with some adjustments, RCU Tasks could just as well take advantage of that fact. This commit therefore eliminates these deadlock by replacing the SRCU-based wait for do_exit() completion with per-CPU lists of tasks currently exiting. A given task will be on one of these per-CPU lists for the same period of time that this task would previously have been in the previous SRCU read-side critical section. These lists enable RCU Tasks to find the tasks that have already been removed from the tasks list, but that must nevertheless be waited upon. The RCU Tasks grace period gathers any of these do_exit() tasks that it must wait on, and adds them to the list of holdouts. Per-CPU locking and get_task_struct() are used to synchronize addition to and removal from these lists. Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/ Reported-by: Chen Zhongjin <chenzhongjin@huawei.com> Reported-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocksPaul E. McKenney
This commit continues the elimination of deadlocks involving do_exit() and RCU tasks by causing exit_tasks_rcu_start() to add the current task to a per-CPU list and causing exit_tasks_rcu_stop() to remove the current task from whatever list it is on. These lists will be used to track tasks that are exiting, while still accounting for any RCU-tasks quiescent states that these tasks pass though. [ paulmck: Apply Frederic Weisbecker feedback. ] Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/ Reported-by: Chen Zhongjin <chenzhongjin@huawei.com> Reported-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocksPaul E. McKenney
Holding a mutex across synchronize_rcu_tasks() and acquiring that same mutex in code called from do_exit() after its call to exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop() results in deadlock. This is by design, because tasks that are far enough into do_exit() are no longer present on the tasks list, making it a bit difficult for RCU Tasks to find them, let alone wait on them to do a voluntary context switch. However, such deadlocks are becoming more frequent. In addition, lockdep currently does not detect such deadlocks and they can be difficult to reproduce. In addition, if a task voluntarily context switches during that time (for example, if it blocks acquiring a mutex), then this task is in an RCU Tasks quiescent state. And with some adjustments, RCU Tasks could just as well take advantage of that fact. This commit therefore initializes the data structures that will be needed to rely on these quiescent states and to eliminate these deadlocks. Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/ Reported-by: Chen Zhongjin <chenzhongjin@huawei.com> Reported-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25rcu-tasks: Initialize callback lists at rcu_init() timePaul E. McKenney
In order for RCU Tasks to reliably maintain per-CPU lists of exiting tasks, those lists must be initialized before it is possible for tasks to exit, especially given that the boot CPU is not necessarily CPU 0 (an example being, powerpc kexec() kernels). And at the time that rcu_init_tasks_generic() is called, a task could potentially exit, unconventional though that sort of thing might be. This commit therefore moves the calls to cblist_init_generic() from functions called from rcu_init_tasks_generic() to a new function named tasks_cblist_init_generic() that is invoked from rcu_init(). This constituted a bug in a commit that never went to mainline, so there is no need for any backporting to -stable. Reported-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocksPaul E. McKenney
Holding a mutex across synchronize_rcu_tasks() and acquiring that same mutex in code called from do_exit() after its call to exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop() results in deadlock. This is by design, because tasks that are far enough into do_exit() are no longer present on the tasks list, making it a bit difficult for RCU Tasks to find them, let alone wait on them to do a voluntary context switch. However, such deadlocks are becoming more frequent. In addition, lockdep currently does not detect such deadlocks and they can be difficult to reproduce. In addition, if a task voluntarily context switches during that time (for example, if it blocks acquiring a mutex), then this task is in an RCU Tasks quiescent state. And with some adjustments, RCU Tasks could just as well take advantage of that fact. This commit therefore adds the data structures that will be needed to rely on these quiescent states and to eliminate these deadlocks. Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/ Reported-by: Chen Zhongjin <chenzhongjin@huawei.com> Reported-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2024-02-25bcachefs: fix bch2_save_backtrace()Kent Overstreet
Missed a call in the previous fix. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-25Merge tag 'docs-6.8-fixes3' of git://git.lwn.net/linuxLinus Torvalds
Pull two documentation build fixes from Jonathan Corbet: - The XFS online fsck documentation uses incredibly deeply nested subsection and list nesting; that broke the PDF docs build. Tweak a parameter to tell LaTeX to allow the deeper nesting. - Fix a 6.8 PDF-build regression * tag 'docs-6.8-fixes3' of git://git.lwn.net/linux: docs: translations: use attribute to store current language docs: Instruct LaTeX to cope with deeper nesting
2024-02-25Merge tag 'usb-6.8-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 6.8-rc6 to resolve some reported problems. These include: - regression fixes with typec tpcm code as reported by many - cdnsp and cdns3 driver fixes - usb role setting code bugfixes - build fix for uhci driver - ncm gadget driver bugfix - MAINTAINERS entry update All of these have been in linux-next all week with no reported issues and there is at least one fix in here that is in Thorsten's regression list that is being tracked" * tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tpcm: Fix issues with power being removed during reset MAINTAINERS: Drop myself as maintainer of TYPEC port controller drivers usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs Revert "usb: typec: tcpm: reset counter when enter into unattached state after try role" usb: gadget: omap_udc: fix USB gadget regression on Palm TE usb: dwc3: gadget: Don't disconnect if not started usb: cdns3: fix memory double free when handle zero packet usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable() usb: roles: don't get/set_role() when usb_role_switch is unregistered usb: roles: fix NULL pointer issue when put module's reference usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllers usb: cdnsp: blocked some cdns3 specific code usb: uhci-grlib: Explicitly include linux/platform_device.h
2024-02-25Merge tag 'tty-6.8-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are three small serial/tty driver fixes for 6.8-rc6 that resolve the following reported errors: - riscv hvc console driver fix that was reported by many - amba-pl011 serial driver fix for RS485 mode - stm32 serial driver fix for RS485 mode All of these have been in linux-next all week with no reported problems" * tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: amba-pl011: Fix DMA transmission in RS485 mode serial: stm32: do not always set SER_RS485_RX_DURING_TX if RS485 is enabled tty: hvc: Don't enable the RISC-V SBI console by default
2024-02-25Merge tag 'x86_urgent_for_v6.8_rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure clearing CPU buffers using VERW happens at the latest possible point in the return-to-userspace path, otherwise memory accesses after the VERW execution could cause data to land in CPU buffers again * tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM/VMX: Move VERW closer to VMentry for MDS mitigation KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key x86/entry_32: Add VERW just before userspace transition x86/entry_64: Add VERW just before userspace transition x86/bugs: Add asm helpers for executing VERW
2024-02-25rust: add `container_of!` macroWedson Almeida Filho
This macro is used to obtain a pointer to an entire struct when given a pointer to a field in that struct. Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Matt Gilbride <mattgilbride@google.com> Link: https://lore.kernel.org/r/20240219-b4-rbtree-v2-1-0b113aab330d@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-02-25rust: str: implement `Display` and `Debug` for `BStr`Yutaro Ohno
Currently, `BStr` is just a type alias of `[u8]`, limiting its representation to a byte list rather than a character list, which is not ideal for printing and debugging. Implement `Display` and `Debug` traits for `BStr` to facilitate easier printing and debugging. Also, for this purpose, change `BStr` from a type alias of `[u8]` to a struct wrapper of `[u8]`. Co-developed-by: Virgile Andreani <armavica@ulminfo.fr> Signed-off-by: Virgile Andreani <armavica@ulminfo.fr> Signed-off-by: Yutaro Ohno <yutaro.ono.418@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/ZcSlGMGP-e9HqybA@ohnotp [ Formatted code comment. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-02-25rust: module: place generated init_module() function in .init.textThomas Bertschinger
Currently Rust kernel modules have their init code placed in the `.text` section of the .ko file. I don't think this causes any real problems for Rust modules as long as all code called during initialization lives in `.text`. However, if a Rust `init_module()` function (that lives in `.text`) calls a function marked with `__init` (in C) or `#[link_section = ".init.text"]` (in Rust), then a warning is generated by modpost because that function lives in `.init.text`. For example: WARNING: modpost: fs/bcachefs/bcachefs: section mismatch in reference: init_module+0x6 (section: .text) -> _RNvXCsj7d3tFpT5JS_15bcachefs_moduleNtB2_8BcachefsNtCsjDtqRIL3JAG_6kernel6Module4init (section: .init.text) I ran into this while experimenting with converting the bcachefs kernel module from C to Rust. The module's `init()`, written in Rust, calls C functions like `bch2_vfs_init()` which are placed in `.init.text`. This patch places the macro-generated `init_module()` Rust function in the `.init.text` section. It also marks `init_module()` as unsafe--now it may not be called after module initialization completes because it may be freed already. Note that this is not enough on its own to actually get all the module initialization code in that section. The module author must still add the `#[link_section = ".init.text"]` attribute to the Rust `init()` in the `impl kernel::Module` block in order to then call `__init` functions. However, this patch enables module authors do so, when previously it would not be possible (without warnings). Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240206153806.567055-1-tahbertschinger@gmail.com [ Reworded title to add prefix. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-02-25rust: types: add `try_from_foreign()` methodObei Sideg
Currently `ForeignOwnable::from_foreign()` only works for non-null pointers for the existing `impl`s (e.g. `Box`, `Arc`). In turn, this means callers may write code like: ```rust // `p` is a pointer that may be null. if p.is_null() { None } else { unsafe { Some(Self::from_foreign(ptr)) } } ``` Add a `try_from_foreign()` method to the trait with a default implementation that returns `None` if `ptr` is null, otherwise `Some(from_foreign(ptr))`, so that it can be used by callers instead. Link: https://github.com/Rust-for-Linux/linux/issues/1057 Signed-off-by: Obei Sideg <linux@obei.io> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/0100018d53f737f8-80c1fe97-0019-40d7-ab69-b1b192785cd7-000000@email.amazonses.com [ Fixed intra-doc links, improved `SAFETY` comment and reworded commit. ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-02-25Merge tag 'irq_urgent_for_v6.8_rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Make sure GICv4 always gets initialized to prevent a kexec-ed kernel from silently failing to set it up - Do not call bus_get_dev_root() for the mbigen irqchip as it always returns NULL - use NULL directly - Fix hardware interrupt number truncation when assigning MSI interrupts - Correct sending end-of-interrupt messages to disabled interrupts lines on RISC-V PLIC * tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Do not assume vPE tables are preallocated irqchip/mbigen: Don't use bus_get_dev_root() to find the parent PCI/MSI: Prevent MSI hardware interrupt number truncation irqchip/sifive-plic: Enable interrupt if needed before EOI
2024-02-25Merge tag 'erofs-for-6.8-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fix from Gao Xiang: - Fix page refcount leak when looking up specific inodes introduced by metabuf reworking * tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix refcount on the metabuf used for inode lookup
2024-02-25Merge tag 'pull-fixes.pathwalk-rcu-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull RCU pathwalk fixes from Al Viro: "We still have some races in filesystem methods when exposed to RCU pathwalk. This series is a result of code audit (the second round of it) and it should deal with most of that stuff. Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate(). Up to maintainers (a note for NTFS folks - when documentation says that a method may not block, it *does* imply that blocking allocations are to be avoided. Really)" [ More explanations for people who aren't familiar with the vagaries of RCU path walking: most of it is hidden from filesystems, but if a filesystem actively participates in the low-level path walking it needs to make sure the fields involved in that walk are RCU-safe. That "actively participate in low-level path walking" includes things like having its own ->d_hash()/->d_compare() routines, or by having its own directory permission function that doesn't just use the common helpers. Having a ->d_revalidate() function will also have this issue. Note that instead of making everything RCU safe you can also choose to abort the RCU pathwalk if your operation cannot be done safely under RCU, but that obviously comes with a performance penalty. One common pattern is to allow the simple cases under RCU, and abort only if you need to do something more complicated. So not everything needs to be RCU-safe, and things like the inode etc that the VFS itself maintains obviously already are. But these fixes tend to be about properly RCU-delaying things like ->s_fs_info that are maintained by the filesystem and that got potentially released too early. - Linus ] * tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ext4_get_link(): fix breakage in RCU mode cifs_get_link(): bail out in unsafe case fuse: fix UAF in rcu pathwalks procfs: make freeing proc_fs_info rcu-delayed procfs: move dropping pde and pid from ->evict_inode() to ->free_inode() nfs: fix UAF on pathwalk running into umount nfs: make nfs_set_verifier() safe for use in RCU pathwalk afs: fix __afs_break_callback() / afs_drop_open_mmap() race hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper affs: free affs_sb_info with kfree_rcu() rcu pathwalk: prevent bogus hard errors from may_lookup() fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
2024-02-25Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fixes from Al Viro: "A couple of fixes - revert of regression from this cycle and a fix for erofs failure exit breakage (had been there since way back)" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: erofs: fix handling kern_mount() failure Revert "get rid of DCACHE_GENOCIDE"
2024-02-25iio: accel: adxl367: fix I2C FIFO data registerCosmin Tanislav
As specified in the datasheet, the I2C FIFO data register is 0x18, not 0x42. 0x42 was used by mistake when adapting the ADXL372 driver. Fix this mistake. Fixes: cbab791c5e2a ("iio: accel: add ADXL367 driver") Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240207033657.206171-2-demonsingur@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-25iio: accel: adxl367: fix DEVID read after resetCosmin Tanislav
regmap_read_poll_timeout() will not sleep before reading, causing the first read to return -ENXIO on I2C, since the chip does not respond to it while it is being reset. The datasheet specifies that a soft reset operation has a latency of 7.5ms. Add a 15ms sleep between reset and reading the DEVID register, and switch to a simple regmap_read() call. Fixes: cbab791c5e2a ("iio: accel: add ADXL367 driver") Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com> Reviewed-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20240207033657.206171-1-demonsingur@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-25arm64: dts: imx8mp: Fix LDB clocks propertyLiu Ying
The "media_ldb_root_clk" is the gate clock to enable or disable the clock provided by CCM(Clock Control Module) to LDB instead of the "media_ldb" clock which is the parent of the "media_ldb_root_clk" clock as a composite clock. Fix LDB clocks property by referencing the "media_ldb_root_clk" clock instead of the "media_ldb" clock. Fixes: e7567840ecd3 ("arm64: dts: imx8mp: Reorder clock and reg properties") Fixes: 94e6197dadc9 ("arm64: dts: imx8mp: Add LCDIF2 & LDB nodes") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Marek Vasut <marex@denx.de> Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2024-02-25arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoMMarek Vasut
The TC9595 reset GPIO is SAI1_RXC / GPIO4_IO01, fix the DT accordingly. The SAI5_RXD0 / GPIO3_IO21 is thus far unused TC9595 interrupt line. Fixes: 20d0b83e712b ("arm64: dts: imx8mp: Add TC9595 bridge on DH electronics i.MX8M Plus DHCOM") Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2024-02-25MAINTAINERS: Use a proper mailinglist for NXP i.MX developmentDaniel Baluta
So far we used an internal linux-imx@nxp.com email address to gather all patches related to NXP i.MX development. Let's switch to an open mailing list that provides ability for people from the community to subscribe and also have a proper archive. List interface at: https://lists.linux.dev. Archive is at: https://lore.kernel.org/imx/ Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2024-02-25ARM: dts: imx7: remove DSI port endpointsFrancesco Dolcini
This fixes the display not working on colibri imx7, the driver fails to load with the following error: mxsfb 30730000.lcdif: error -ENODEV: Cannot connect bridge NXP i.MX7 LCDIF is connected to both the Parallel LCD Display and to a MIPI DSI IP block, currently it's not possible to describe the connection to both. Remove the port endpoint from the SOC dtsi to prevent regressions, this would need to be defined on the board DTS. Reported-by: Hiago De Franco <hiagofranco@gmail.com> Closes: https://lore.kernel.org/r/34yzygh3mbwpqr2re7nxmhyxy3s7qmqy4vhxvoyxnoguktriur@z66m7gvpqlia/ Fixes: edbbae7fba49 ("ARM: dts: imx7: add MIPI-DSI support") Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2024-02-25iio: pressure: dlhl60d: Initialize empty DLH bytesKees Cook
3 bytes were being read but 4 were being written. Explicitly initialize the unused bytes to 0 and refactor the loop to use direct array indexing, which appears to silence a Clang false positive warning[1]. Indent improvement included for readability of the fixed code. Link: https://github.com/ClangBuiltLinux/linux/issues/2000 [1] Fixes: ac78c6aa4a5d ("iio: pressure: Add driver for DLH pressure sensors") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240223172936.it.875-kees@kernel.org Cc: <Stable@vger.kerenl.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-25iio: imu: inv_mpu6050: fix frequency setting when chip is offJean-Baptiste Maneyrol
Track correctly FIFO state and apply ODR change before starting the chip. Without the fix, you cannot change ODR more than 1 time when data buffering is off. This restriction on a single pending ODR change should only apply when the FIFO is on. Fixes: 111e1abd0045 ("iio: imu: inv_mpu6050: use the common inv_sensors timestamp module") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240219154741.90601-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-02-25Merge series 'Open block devices as files' of ↵Christian Brauner
https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-0-adbd023e19cc@kernel.org Pull block devices as files from Christian Brauner: This opens block devices as files. Instead of introducing a separate indirection into bdev_open_by_*() vis struct bdev_handle we can just make bdev_file_open_by_*() return a struct file. Opening and closing a block device from setup_bdev_super() and in all other places just becomes equivalent to opening and closing a file. This has held up in xfstests and in blktests so far and it seems stable and clean. The equivalence of opening and closing block devices to regular files is a win in and of itself imho. Added to that is the ability to do away with struct bdev_handle completely and make various low-level helpers private to the block layer. All places were we currently stash a struct bdev_handle we just stash a file and use an accessor such as file_bdev() akin to I_BDEV() to get to the block device. It's now also possible to use file->f_mapping as a replacement for bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as an alternative to bdev->bd_inode allowing us to significantly reduce or even fully remove bdev->bd_inode in follow-up patches. In addition, we could get rid of sb->s_bdev and various other places that stash the block device directly and instead stash the block device file. Again, this is follow-up work if we want this. * series 'Open block devices as files' of https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-0-adbd023e19cc@kernel.org: (35 commits) file: add alloc_file_pseudo_noaccount() file: prepare for new helper init: flush async file closing block: remove bdev_handle completely block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access bdev: remove bdev pointer from struct bdev_handle bdev: make struct bdev_handle private to the block layer bdev: make bdev_{release, open_by_dev}() private to block layer bdev: remove bdev_open_by_path() reiserfs: port block device access to file ocfs2: port block device access to file nfs: port block device access to files jfs: port block device access to file f2fs: port block device access to files ext4: port block device access to file erofs: port device access to file btrfs: port device access to file bcachefs: port block device access to file target: port block device access to file s390: port block device access to file nvme: port block device access to file block2mtd: port device access to files bcache: port block device access to files ... Signed-off-by: Christian Brauner <brauner@kernel.org>