summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-13drm/doc: Document device wedged eventRaag Jadav
Add documentation for device wedged event in a new "Device wedging" chapter. This describes basic definitions, prerequisites and consumer expectations along with an example. v8: Improve introduction (Christian, Rodrigo) v9: Add prerequisites section (Christian) v10: Clarify mmap cleanup and consumer prerequisites (Christian, Aravind) v11: Reference wedged event in device reset chapter (André) v12: Refine consumer expectations and terminologies (Xaver, Pekka) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: André Almeida <andrealmeid@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-3-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-13drm: Introduce device wedged eventRaag Jadav
Introduce device wedged event, which notifies userspace of 'wedged' (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is no longer operating as expected and has become unrecoverable from driver context. Purpose of this implementation is to provide drivers a generic way to recover the device with the help of userspace intervention without taking any drastic measures (like resetting or re-enumerating the full bus, on which the underlying physical device is sitting) in the driver. A 'wedged' device is basically a device that is declared dead by the driver after exhausting all possible attempts to recover it from driver context. The uevent is the notification that is sent to userspace along with a hint about what could possibly be attempted to recover the device from userspace and bring it back to usable state. Different drivers may have different ideas of a 'wedged' device depending on hardware implementation of the underlying physical device, and hence the vendor agnostic nature of the event. It is up to the drivers to decide when they see the need for device recovery and how they want to recover from the available methods. Driver prerequisites -------------------- The driver, before opting for recovery, needs to make sure that the 'wedged' device doesn't harm the system as a whole by taking care of the prerequisites. Necessary actions must include disabling DMA to system memory as well as any communication channels with other devices. Further, the driver must ensure that all dma_fences are signalled and any device state that the core kernel might depend on is cleaned up. All existing mmaps should be invalidated and page faults should be redirected to a dummy page. Once the event is sent, the device must be kept in 'wedged' state until the recovery is performed. New accesses to the device (IOCTLs) should be rejected, preferably with an error code that resembles the type of failure the device has encountered. This will signify the reason for wedging, which can be reported to the application if needed. Recovery -------- Current implementation defines three recovery methods, out of which, drivers can use any one, multiple or none. Method(s) of choice will be sent in the uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in order of less to more side-effects. If driver is unsure about recovery or method is unknown (like soft/hard system reboot, firmware flashing, physical device replacement or any other procedure which can't be attempted on the fly), ``WEDGED=unknown`` will be sent instead. Userspace consumers can parse this event and attempt recovery as per the following expectations. =============== ======================================== Recovery method Consumer expectations =============== ======================================== none optional telemetry collection rebind unbind + bind driver bus-reset unbind + bus reset/re-enumeration + bind unknown consumer policy =============== ======================================== The only exception to this is ``WEDGED=none``, which signifies that the device was temporarily 'wedged' at some point but was recovered from driver context using device specific methods like reset. No explicit recovery is expected from the consumer in this case, but it can still take additional steps like gathering telemetry information (devcoredump, syslog). This is useful because the first hang is usually the most critical one which can result in consequential hangs or complete wedging. Consumer prerequisites ---------------------- It is the responsibility of the consumer to make sure that the device or its resources are not in use by any process before attempting recovery. With IOCTLs erroring out, all device memory should be unmapped and file descriptors should be closed to prevent leaks or undefined behaviour. The idea here is to clear the device of all user context beforehand and set the stage for a clean recovery. Example ------- Udev rule:: SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", RUN+="/path/to/rebind.sh $env{DEVPATH}" Recovery script:: #!/bin/sh DEVPATH=$(readlink -f /sys/$1/device) DEVICE=$(basename $DEVPATH) DRIVER=$(readlink -f $DEVPATH/driver) echo -n $DEVICE > $DRIVER/unbind echo -n $DEVICE > $DRIVER/bind Customization ------------- Although basic recovery is possible with a simple script, consumers can define custom policies around recovery. For example, if the driver supports multiple recovery methods, consumers can opt for the suitable one depending on scenarios like repeat offences or vendor specific failures. Consumers can also choose to have the device available for debugging or telemetry collection and base their recovery decision on the findings. This is useful especially when the driver is unsure about recovery or method is unknown. v4: s/drm_dev_wedged/drm_dev_wedged_event Use drm_info() (Jani) Kernel doc adjustment (Aravind) v5: Send recovery method with uevent (Lina) v6: Access wedge_recovery_opts[] using helper function (Jani) Use snprintf() (Jani) v7: Convert recovery helpers into regular functions (Andy, Jani) Aesthetic adjustments (Andy) Handle invalid recovery method v8: Allow sending multiple methods with uevent (Lucas, Michal) static_assert() globally (Andy) v9: Provide 'none' method for device reset (Christian) Provide recovery opts using switch cases v11: Log device reset (André) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-2-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-13sched_ext: Use SCX_CALL_OP_TASK in task_tick_scxChuyi Zhou
Now when we use scx_bpf_task_cgroup() in ops.tick() to get the cgroup of the current task, the following error will occur: scx_foo[3795244] triggered exit kind 1024: runtime error (called on a task not being operated on) The reason is that we are using SCX_CALL_OP() instead of SCX_CALL_OP_TASK() when calling ops.tick(), which triggers the error during the subsequent scx_kf_allowed_on_arg_tasks() check. SCX_CALL_OP_TASK() was first introduced in commit 36454023f50b ("sched_ext: Track tasks that are subjects of the in-flight SCX operation") to ensure task's rq lock is held when accessing task's sched_group. Since ops.tick() is marked as SCX_KF_TERMINAL and task_tick_scx() is protected by the rq lock, we can use SCX_CALL_OP_TASK() to avoid the above issue. Similarly, the same changes should be made for ops.disable() and ops.exit_task(), as they are also protected by task_rq_lock() and it's safe to access the task's task_group. Fixes: 36454023f50b ("sched_ext: Track tasks that are subjects of the in-flight SCX operation") Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13Reapply "net: skb: introduce and use a single page frag cache"Jakub Kicinski
This reverts commit 011b0335903832facca86cd8ed05d7d8d94c9c76. Sabrina reports that the revert may trigger warnings due to intervening changes, especially the ability to rise MAX_SKB_FRAGS. Let's drop it and revisit once that part is also ironed out. Fixes: 011b03359038 ("Revert "net: skb: introduce and use a single page frag cache"") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/6bf54579233038bc0e76056c5ea459872ce362ab.1739375933.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h.Chuyi Zhou
Now BPF only supports bpf_list_push_{front,back}_impl kfunc, not bpf_list_ push_{front,back}. This patch fix this issue. Without this patch, if we use bpf_list kfunc in scx, the BPF verifier would complain: libbpf: extern (func ksym) 'bpf_list_push_back': not found in kernel or module BTFs libbpf: failed to load object 'scx_foo' libbpf: failed to load BPF skeleton 'scx_foo': -EINVAL With this patch, the bpf list kfunc will work as expected. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Fixes: 2a52ca7c98960 ("sched_ext: Add scx_simple and scx_example_qmap example schedulers") Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13drm: panel: jd9365da-h3: fix reset signal polarityHugo Villeneuve
In jadard_prepare() a reset pulse is generated with the following statements (delays ommited for clarity): gpiod_set_value(jadard->reset, 1); --> Deassert reset gpiod_set_value(jadard->reset, 0); --> Assert reset for 10ms gpiod_set_value(jadard->reset, 1); --> Deassert reset However, specifying second argument of "0" to gpiod_set_value() means to deassert the GPIO, and "1" means to assert it. If the reset signal is defined as GPIO_ACTIVE_LOW in the DTS, the above statements will incorrectly generate the reset pulse (inverted) and leave it asserted (LOW) at the end of jadard_prepare(). Fix reset behavior by inverting gpiod_set_value() second argument in jadard_prepare(). Also modify second argument to devm_gpiod_get() in jadard_dsi_probe() to assert the reset when probing. Do not modify it in jadard_unprepare() as it is already properly asserted with "1", which seems to be the intended behavior. Fixes: 6b818c533dd8 ("drm: panel: Add Jadard JD9365DA-H3 DSI panel") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240927135306.857617-1-hugo@hugovil.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240927135306.857617-1-hugo@hugovil.com
2025-02-13sched_ext: selftests: Fix grammar in tests descriptionDevaansh Kumar
Fixed grammar for a few tests of sched_ext. Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13Merge tag 'loongarch-fixes-6.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix bugs about idle, kernel_page_present(), IP checksum and KVM, plus some trival cleanups" * tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Set host with kernel mode when switch to VM mode LoongArch: KVM: Remove duplicated cache attribute setting LoongArch: KVM: Fix typo issue about GCFG feature detection LoongArch: csum: Fix OoB access in IP checksum code for negative lengths LoongArch: Remove the deprecated notifier hook mechanism LoongArch: Use str_yes_no() helper function for /proc/cpuinfo LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE LoongArch: Fix idle VS timer enqueue
2025-02-13Merge tag 'platform-drivers-x86-v6.14-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - thinkpad_acpi: - Fix registration of tpacpi platform driver - Support fan speed in ticks per revolution (Thinkpad X120e) - Support V9 DYTC profiles (new Thinkpad AMD platforms) - int3472: Handle GPIO "enable" vs "reset" variation (ov7251) * tag 'platform-drivers-x86-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: thinkpad_acpi: Fix registration of tpacpi platform driver platform/x86: int3472: Call "reset" GPIO "enable" for INT347E platform/x86: int3472: Use correct type for "polarity", call it gpio_flags platform/x86: thinkpad_acpi: Support for V9 DYTC platform profiles platform/x86: thinkpad_acpi: Fix invalid fan speed on ThinkPad X120e
2025-02-13s390/qeth: move netif_napi_add_tx() and napi_enable() from under BHAlexandra Winter
Like other drivers qeth is calling local_bh_enable() after napi_schedule() to kick-start softirqs [0]. Since netif_napi_add_tx() and napi_enable() now take the netdev_lock() mutex [1], move them out from under the BH protection. Same solution as in commit a60558644e20 ("wifi: mt76: move napi_enable() from under BH") Fixes: 1b23cdbd2bbc ("net: protect netdev->napi_list with netdev_lock()") Link: https://lore.kernel.org/netdev/20240612181900.4d9d18d0@kernel.org/ [0] Link: https://lore.kernel.org/netdev/20250115035319.559603-1-kuba@kernel.org/ [1] Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250212163659.2287292-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw()Wentao Liang
Add a check for the return value of mlxsw_sp_port_get_stats_raw() in __mlxsw_sp_port_get_stats(). If mlxsw_sp_port_get_stats_raw() returns an error, exit the function to prevent further processing with potentially invalid data. Fixes: 614d509aa1e7 ("mlxsw: Move ethtool_ops to spectrum_ethtool.c") Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250212152311.1332-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13ipv6: mcast: add RCU protection to mld_newpack()Eric Dumazet
mld_newpack() can be called without RTNL or RCU being held. Note that we no longer can use sock_alloc_send_skb() because ipv6.igmp_sk uses GFP_KERNEL allocations which can sleep. Instead use alloc_skb() and charge the net->ipv6.igmp_sk socket under RCU protection. Fixes: b8ad0cbc58f7 ("[NETNS][IPV6] mcast - handle several network namespace") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250212141021.1663666-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13team: better TEAM_OPTION_TYPE_STRING validationEric Dumazet
syzbot reported following splat [1] Make sure user-provided data contains one nul byte. [1] BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:633 [inline] BUG: KMSAN: uninit-value in string+0x3ec/0x5f0 lib/vsprintf.c:714 string_nocheck lib/vsprintf.c:633 [inline] string+0x3ec/0x5f0 lib/vsprintf.c:714 vsnprintf+0xa5d/0x1960 lib/vsprintf.c:2843 __request_module+0x252/0x9f0 kernel/module/kmod.c:149 team_mode_get drivers/net/team/team_core.c:480 [inline] team_change_mode drivers/net/team/team_core.c:607 [inline] team_mode_option_set+0x437/0x970 drivers/net/team/team_core.c:1401 team_option_set drivers/net/team/team_core.c:375 [inline] team_nl_options_set_doit+0x1339/0x1f90 drivers/net/team/team_core.c:2662 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x1214/0x12c0 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2543 genl_rcv+0x40/0x60 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1348 netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:718 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:733 ____sys_sendmsg+0x877/0xb60 net/socket.c:2573 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2627 __sys_sendmsg net/socket.c:2659 [inline] __do_sys_sendmsg net/socket.c:2664 [inline] __se_sys_sendmsg net/socket.c:2662 [inline] __x64_sys_sendmsg+0x212/0x3c0 net/socket.c:2662 x64_sys_call+0x2ed6/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Reported-by: syzbot+1fcd957a82e3a1baa94d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1fcd957a82e3a1baa94d Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250212134928.1541609-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13Bluetooth: L2CAP: Fix corrupted list in hci_chan_delLuiz Augusto von Dentz
This fixes the following trace by reworking the locking of l2cap_conn so instead of only locking when changing the chan_l list this promotes chan_lock to a general lock of l2cap_conn so whenever it is being held it would prevents the likes of l2cap_conn_del to run: list_del corruption, ffff888021297e00->prev is LIST_POISON2 (dead000000000122) ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:61! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5896 Comm: syz-executor213 Not tainted 6.14.0-rc1-next-20250204-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 RIP: 0010:__list_del_entry_valid_or_report+0x12c/0x190 lib/list_debug.c:59 Code: 8c 4c 89 fe 48 89 da e8 32 8c 37 fc 90 0f 0b 48 89 df e8 27 9f 14 fd 48 c7 c7 a0 c0 60 8c 4c 89 fe 48 89 da e8 15 8c 37 fc 90 <0f> 0b 4c 89 e7 e8 0a 9f 14 fd 42 80 3c 2b 00 74 08 4c 89 e7 e8 cb RSP: 0018:ffffc90003f6f998 EFLAGS: 00010246 RAX: 000000000000004e RBX: dead000000000122 RCX: 01454d423f7fbf00 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff819f077c R09: 1ffff920007eded0 R10: dffffc0000000000 R11: fffff520007eded1 R12: dead000000000122 R13: dffffc0000000000 R14: ffff8880352248d8 R15: ffff888021297e00 FS: 00007f7ace6686c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7aceeeb1d0 CR3: 000000003527c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __list_del_entry_valid include/linux/list.h:124 [inline] __list_del_entry include/linux/list.h:215 [inline] list_del_rcu include/linux/rculist.h:168 [inline] hci_chan_del+0x70/0x1b0 net/bluetooth/hci_conn.c:2858 l2cap_conn_free net/bluetooth/l2cap_core.c:1816 [inline] kref_put include/linux/kref.h:65 [inline] l2cap_conn_put+0x70/0xe0 net/bluetooth/l2cap_core.c:1830 l2cap_sock_shutdown+0xa8a/0x1020 net/bluetooth/l2cap_sock.c:1377 l2cap_sock_release+0x79/0x1d0 net/bluetooth/l2cap_sock.c:1416 __sock_release net/socket.c:642 [inline] sock_close+0xbc/0x240 net/socket.c:1393 __fput+0x3e9/0x9f0 fs/file_table.c:448 task_work_run+0x24f/0x310 kernel/task_work.c:227 ptrace_notify+0x2d2/0x380 kernel/signal.c:2522 ptrace_report_syscall include/linux/ptrace.h:415 [inline] ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline] syscall_exit_work+0xc7/0x1d0 kernel/entry/common.c:173 syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline] syscall_exit_to_user_mode+0x24a/0x340 kernel/entry/common.c:218 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f7aceeaf449 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f7ace668218 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: fffffffffffffffc RBX: 00007f7acef39328 RCX: 00007f7aceeaf449 RDX: 000000000000000e RSI: 0000000020000100 RDI: 0000000000000004 RBP: 00007f7acef39320 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 R13: 0000000000000004 R14: 00007f7ace668670 R15: 000000000000000b </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:__list_del_entry_valid_or_report+0x12c/0x190 lib/list_debug.c:59 Code: 8c 4c 89 fe 48 89 da e8 32 8c 37 fc 90 0f 0b 48 89 df e8 27 9f 14 fd 48 c7 c7 a0 c0 60 8c 4c 89 fe 48 89 da e8 15 8c 37 fc 90 <0f> 0b 4c 89 e7 e8 0a 9f 14 fd 42 80 3c 2b 00 74 08 4c 89 e7 e8 cb RSP: 0018:ffffc90003f6f998 EFLAGS: 00010246 RAX: 000000000000004e RBX: dead000000000122 RCX: 01454d423f7fbf00 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff819f077c R09: 1ffff920007eded0 R10: dffffc0000000000 R11: fffff520007eded1 R12: dead000000000122 R13: dffffc0000000000 R14: ffff8880352248d8 R15: ffff888021297e00 FS: 00007f7ace6686c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7acef05b08 CR3: 000000003527c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Reported-by: syzbot+10bd8fe6741eedd2be2e@syzkaller.appspotmail.com Tested-by: syzbot+10bd8fe6741eedd2be2e@syzkaller.appspotmail.com Fixes: b4f82f9ed43a ("Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
2025-02-13Bluetooth: btintel_pcie: Fix a potential race conditionKiran K
On HCI_OP_RESET command, firmware raises alive interrupt. Driver needs to wait for this before sending other command. This patch fixes the potential miss of alive interrupt due to which HCI_OP_RESET can timeout. Expected flow: If tx command is HCI_OP_RESET, 1. set data->gp0_received = false 2. send HCI_OP_RESET 3. wait for alive interrupt Actual flow having potential race: If tx command is HCI_OP_RESET, 1. send HCI_OP_RESET 1a. Firmware raises alive interrupt here and in ISR data->gp0_received is set to true 2. set data->gp0_received = false 3. wait for alive interrupt Signed-off-by: Kiran K <kiran.k@intel.com> Fixes: 05c200c8f029 ("Bluetooth: btintel_pcie: Add handshake between driver and firmware") Reported-by: Bjorn Helgaas <helgaas@kernel.org> Closes: https://patchwork.kernel.org/project/bluetooth/patch/20241001104451.626964-1-kiran.k@intel.com/ Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-02-13Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmdLuiz Augusto von Dentz
After the hci sync command releases l2cap_conn, the hci receive data work queue references the released l2cap_conn when sending to the upper layer. Add hci dev lock to the hci receive data work queue to synchronize the two. [1] BUG: KASAN: slab-use-after-free in l2cap_send_cmd+0x187/0x8d0 net/bluetooth/l2cap_core.c:954 Read of size 8 at addr ffff8880271a4000 by task kworker/u9:2/5837 CPU: 0 UID: 0 PID: 5837 Comm: kworker/u9:2 Not tainted 6.13.0-rc5-syzkaller-00163-gab75170520d4 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: hci1 hci_rx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x169/0x550 mm/kasan/report.c:489 kasan_report+0x143/0x180 mm/kasan/report.c:602 l2cap_build_cmd net/bluetooth/l2cap_core.c:2964 [inline] l2cap_send_cmd+0x187/0x8d0 net/bluetooth/l2cap_core.c:954 l2cap_sig_send_rej net/bluetooth/l2cap_core.c:5502 [inline] l2cap_sig_channel net/bluetooth/l2cap_core.c:5538 [inline] l2cap_recv_frame+0x221f/0x10db0 net/bluetooth/l2cap_core.c:6817 hci_acldata_packet net/bluetooth/hci_core.c:3797 [inline] hci_rx_work+0x508/0xdb0 net/bluetooth/hci_core.c:4040 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 5837: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4329 kmalloc_noprof include/linux/slab.h:901 [inline] kzalloc_noprof include/linux/slab.h:1037 [inline] l2cap_conn_add+0xa9/0x8e0 net/bluetooth/l2cap_core.c:6860 l2cap_connect_cfm+0x115/0x1090 net/bluetooth/l2cap_core.c:7239 hci_connect_cfm include/net/bluetooth/hci_core.h:2057 [inline] hci_remote_features_evt+0x68e/0xac0 net/bluetooth/hci_event.c:3726 hci_event_func net/bluetooth/hci_event.c:7473 [inline] hci_event_packet+0xac2/0x1540 net/bluetooth/hci_event.c:7525 hci_rx_work+0x3f3/0xdb0 net/bluetooth/hci_core.c:4035 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 54: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2353 [inline] slab_free mm/slub.c:4613 [inline] kfree+0x196/0x430 mm/slub.c:4761 l2cap_connect_cfm+0xcc/0x1090 net/bluetooth/l2cap_core.c:7235 hci_connect_cfm include/net/bluetooth/hci_core.h:2057 [inline] hci_conn_failed+0x287/0x400 net/bluetooth/hci_conn.c:1266 hci_abort_conn_sync+0x56c/0x11f0 net/bluetooth/hci_sync.c:5603 hci_cmd_sync_work+0x22b/0x400 net/bluetooth/hci_sync.c:332 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Reported-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=31c2f641b850a348a734 Tested-by: syzbot+31c2f641b850a348a734@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-02-13rust/kernel: Add faux device bindingsLyude Paul
This introduces a module for working with faux devices in rust, along with adding sample code to show how the API is used. Unlike other types of devices, we don't provide any hooks for device probe/removal - since these are optional for the faux API and are unnecessary in rust. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Maíra Canal <mairacanal@riseup.net> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/2025021026-exert-accent-b4c6@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-13driver core: add a faux bus for use when a simple device/bus is neededGreg Kroah-Hartman
Many drivers abuse the platform driver/bus system as it provides a simple way to create and bind a device to a driver-specific set of probe/release functions. Instead of doing that, and wasting all of the memory associated with a platform device, here is a "faux" bus that can be used instead. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/2025021026-atlantic-gibberish-3f0c@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-13MAINTAINERS: Use my kernel.org address for I2C ACPI workMika Westerberg
Switch to use my kernel.org address for I2C ACPI work. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-02-13block: cleanup and fix batch completion adding conditionsJens Axboe
The conditions for whether or not a request is allowed adding to a completion batch are a bit hard to read, and they also have a few issues. One is that ioerror may indeed be a random value on passthrough, and it's being checked unconditionally of whether or not the given request is a passthrough request or not. Rewrite the conditions to be separate for easier reading, and only check ioerror for non-passthrough requests. This fixes an issue with bio unmapping on passthrough, where it fails getting added to a batch. This both leads to suboptimal performance, and may trigger a potential schedule-under-atomic condition for polled passthrough IO. Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()") Link: https://lore.kernel.org/r/20575f0a-656e-4bb3-9d82-dec6c7e3a35c@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-13drm: bridge: ti-sn65dsi83: Add error recovery mechanismHerve Codina
In some cases observed during ESD tests, the TI SN65DSI83 cannot recover from errors by itself. A full restart of the bridge is needed in those cases to have the bridge output LVDS signals again. Also, during tests, cases were observed where reading the status of the bridge was not even possible. Indeed, in those cases, the bridge stops to acknowledge I2C transactions. Only a full reset of the bridge (power off/on) brings back the bridge to a functional state. The TI SN65DSI83 has some error detection capabilities. Introduce an error recovery mechanism based on this detection. The errors detected are signaled through an interrupt. On system where this interrupt is not available, the driver uses a polling monitoring fallback to check for errors. When an error is present or when reading the bridge status leads to an I2C failure, the recovery process is launched. Restarting the bridge needs to redo the initialization sequence. This initialization sequence has to be done with the DSI data lanes driven in LP11 state. In order to do that, the recovery process resets the whole output path (i.e the path from the encoder to the connector) where the bridge is located. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250210132620.42263-5-herve.codina@bootlin.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-13drm/vc4: hdmi: Use drm_atomic_helper_reset_crtc()Herve Codina
The current code uses a the reset_pipe() local function to reset the CRTC outputs. drm_atomic_helper_reset_crtc() has been introduced recently and it performs exact same operations. In order to avoid code duplication, use the new helper instead of the local function. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250210132620.42263-4-herve.codina@bootlin.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-13drm/atomic-helper: Introduce drm_atomic_helper_reset_crtc()Herve Codina
drm_atomic_helper_reset_crtc() allows to reset the CRTC active outputs. This resets all active components available between the CRTC and connectors. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250210132620.42263-3-herve.codina@bootlin.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-13dt-bindings: display: bridge: sn65dsi83: Add interruptHerve Codina
Both the TI SN65DSI83 and SN65DSI84 bridges have an IRQ pin to signal errors using interrupt. This interrupt is not documented in the binding. Add the missing interrupts property. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250210132620.42263-2-herve.codina@bootlin.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-02-13Merge patch series "netfs: Miscellaneous fixes"Christian Brauner
David Howells <dhowells@redhat.com> says: Here are some miscellaneous fixes and changes for netfslib: (1) Fix a number of read-retry hangs, including: (a) Incorrect getting/putting of references on subreqs as we retry them. (b) Failure to track whether a last old subrequest in a retried set is superfluous. (c) Inconsistency in the usage of wait queues used for subrequests (ie. using clear_and_wake_up_bit() whilst waiting on a private waitqueue). (Note that waitqueue consistency also needs looking at for netfs_io_request structs.) (2) Add stats counters for retries and publish in /proc/fs/netfs/stats. This is not a fix per se, but is useful in debugging and shouldn't otherwise change the operation of the code. (3) Fix the ordering of queuing subrequests with respect to setting the request flag that says we've now queued them all. * patches from https://lore.kernel.org/r/20250212222402.3618494-1-dhowells@redhat.com: netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queued netfs: Add retry stat counters netfs: Fix a number of read-retry hangs Link: https://lore.kernel.org/r/20250212222402.3618494-1-dhowells@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13netfs: Fix setting NETFS_RREQ_ALL_QUEUED to be after all subreqs queuedDavid Howells
Due to the code that queues a subreq on the active subrequest list getting moved to netfs_issue_read(), the NETFS_RREQ_ALL_QUEUED flag may now get set before the list-add actually happens. This is not a problem if the collection worker happens after the list-add, but it's a race - and, for 9P, where the read from the server is synchronous and done in the submitting thread, this is a lot more likely. The result is that, if the timing is wrong, a ref gets leaked because the collector thinks that all the subreqs have completed (because it can't see the last one yet) and clears NETFS_RREQ_IN_PROGRESS - at which point, the collection worker no longer goes into the collector. This can be provoked with AFS by injecting an msleep() right before the final subreq is queued. Fix this by splitting the queuing part out of netfs_issue_read() into a new function, netfs_queue_read(), and calling it separately. The setting of NETFS_RREQ_ALL_QUEUED is then done by netfs_queue_read() whilst it is holding the spinlock (that's probably unnecessary, but shouldn't hurt). It might be better to set a flag on the final subreq, but this could be a problem if an error occurs and we can't queue it. Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Closes: https://lore.kernel.org/r/a7x33d4dnMdGTtRivptq6S1i8btK70SNBP2XyX_xwDAhLvgQoPox6FVBOkifq4eBinfFfbZlIkMZBe3QarlWTxoEtHZwJCZbNKtaqrR7PvI=@pm.me/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250212222402.3618494-4-dhowells@redhat.com Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Steve French <stfrench@microsoft.com> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13netfs: Add retry stat countersDavid Howells
Add stat counters to count the number of request and subrequest retries and display them in /proc/fs/netfs/stats. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250212222402.3618494-3-dhowells@redhat.com cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13netfs: Fix a number of read-retry hangsDavid Howells
Fix a number of hangs in the netfslib read-retry code, including: (1) netfs_reissue_read() doubles up the getting of references on subrequests, thereby leaking the subrequest and causing inode eviction to wait indefinitely. This can lead to the kernel reporting a hang in the filesystem's evict_inode(). Fix this by removing the get from netfs_reissue_read() and adding one to netfs_retry_read_subrequests() to deal with the one place that didn't double up. (2) The loop in netfs_retry_read_subrequests() that retries a sequence of failed subrequests doesn't record whether or not it retried the one that the "subreq" pointer points to when it leaves the loop. It may not if renegotiation/repreparation of the subrequests means that fewer subrequests are needed to span the cumulative range of the sequence. Because it doesn't record this, the piece of code that discards now-superfluous subrequests doesn't know whether it should discard the one "subreq" points to - and so it doesn't. Fix this by noting whether the last subreq it examines is superfluous and if it is, then getting rid of it and all subsequent subrequests. If that one one wasn't superfluous, then we would have tried to go round the previous loop again and so there can be no further unretried subrequests in the sequence. (3) netfs_retry_read_subrequests() gets yet an extra ref on any additional subrequests it has to get because it ran out of ones it could reuse to to renegotiation/repreparation shrinking the subrequests. Fix this by removing that extra ref. (4) In netfs_retry_reads(), it was using wait_on_bit() to wait for NETFS_SREQ_IN_PROGRESS to be cleared on all subrequests in the sequence - but netfs_read_subreq_terminated() is now using a wait queue on the request instead and so this wait will never finish. Fix this by waiting on the wait queue instead. To make this work, a new flag, NETFS_RREQ_RETRYING, is now set around the wait loop to tell the wake-up code to wake up the wait queue rather than requeuing the request's work item. Note that this flag replaces the NETFS_RREQ_NEED_RETRY flag which is no longer used. (5) Whilst not strictly anything to do with the hang, netfs_retry_read_subrequests() was also doubly incrementing the subreq_counter and re-setting the debug index, leaving a gap in the trace. This is also fixed. One of these hangs was observed with 9p and with cifs. Others were forced by manual code injection into fs/afs/file.c. Firstly, afs_prepare_read() was created to provide an changing pattern of maximum subrequest sizes: static int afs_prepare_read(struct netfs_io_subrequest *subreq) { struct netfs_io_request *rreq = subreq->rreq; if (!S_ISREG(subreq->rreq->inode->i_mode)) return 0; if (subreq->retry_count < 20) rreq->io_streams[0].sreq_max_len = umax(200, 2222 - subreq->retry_count * 40); else rreq->io_streams[0].sreq_max_len = 3333; return 0; } and pointed to by afs_req_ops. Then the following: struct netfs_io_subrequest *subreq = op->fetch.subreq; if (subreq->error == 0 && S_ISREG(subreq->rreq->inode->i_mode) && subreq->retry_count < 20) { subreq->transferred = subreq->already_done; __clear_bit(NETFS_SREQ_HIT_EOF, &subreq->flags); __set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); afs_fetch_data_notify(op); return; } was inserted into afs_fetch_data_success() at the beginning and struct netfs_io_subrequest given an extra field, "already_done" that was set to the value in "subreq->transferred" by netfs_reissue_read(). When reading a 4K file, the subrequests would get gradually smaller, a new subrequest would be allocated around the 3rd retry and then eventually be rendered superfluous when the 20th retry was hit and the limit on the first subrequest was eased. Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250212222402.3618494-2-dhowells@redhat.com Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Steve French <stfrench@microsoft.com> cc: Ihor Solodrai <ihor.solodrai@pm.me> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13drm/vkms: Fix use after free and double free on init errorJosé Expósito
If the driver initialization fails, the vkms_exit() function might access an uninitialized or freed default_config pointer and it might double free it. Fix both possible errors by initializing default_config only when the driver initialization succeeded. Reported-by: Louis Chauvet <louis.chauvet@bootlin.com> Closes: https://lore.kernel.org/all/Z5uDHcCmAwiTsGte@louis-chauvet-laptop/ Fixes: 2df7af93fdad ("drm/vkms: Add vkms_config type") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmremann@suse.de> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250212084912.3196-1-jose.exposito89@gmail.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-02-13drm/i915/psr: Write PSR2_MAN_TRK_CTL on DSB commit as wellJouni Högander
Add PSR2_MAN_TRK_CTL writing into DSB commit in intel_atomic_dsb_finish. Taking PSR lock over DSB commit is not needed because PSR2_MAN_TRK_CTL is now written only by DSB. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-8-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Allow writing PSR2_MAN_TRK_CTL using DSBJouni Högander
Allow writing PSR2_MAN_TRK_CTL using DSB by using intel_de_write_dsb. Do not check intel_dp->psr.lock being held when using DSB. This assertion doesn't make sense as in case of using DSB the actual write happens later and we are not taking intel_dp->psr.lock mutex over dsb commit. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-7-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Use SFF_CTL on invalidate/flush for LunarLake onwardsJouni Högander
In LunarLake we have SFF_CTL register which contains SFF bit ored with respective SFF bit in PSR2_MAN_TRK_CTL register. Use this register instead of the bit in PSR2_MAN_TRK_CTL on frontbuffer tracking callbacks. This helps us avoiding taking psr mutex when performing atomic commit. We don't need to set the CFF bit as selective update configuration in PSR2_MAN_TRL_CTL is not overwritten anymore. I.e. we have valid configuration in PSR2_MAN_TRK_CTL and in plane SEL_FETCH_* registers when SFF bit gets cleared by the HW in case something triggers "frame change" event after SFF bit is cleared. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-6-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Add register definitions for SFF_CTL and CFF_CTL registersJouni Högander
Add register definitions for SFF_CTL and CFF_CTL registers. Name them as LNL_SFF_CTL and LNL_CFF_CTL. v2: use _MMIO_TRANS instead of _MMIO_TRANS2 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-5-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Split setting sff and cff bits away from intel_psr_force_updateJouni Högander
This is a clean-up and a preparation for adding own SFF and CFF registers for LunarLake onwards. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-4-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Rename psr_force_hw_tracking_exit as intel_psr_force_updateJouni Högander
psr_force_hw_tracking_exit is misleading name as it is used for PSR1, PSR2 HW tracking and PSR2 selective fetch. Due to this rename it as intel_psr_force_update. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-3-jouni.hogander@intel.com
2025-02-13drm/i915/psr: Use PSR2_MAN_TRK_CTL CFF bit only to send full updateJouni Högander
We are preparing for a change where only frontbuffer flush will use single full frame bit of a new register (SFF_CTL) available on LunarLake onwards. It shouldn't be necessary to have SFF bit set if CFF bit is set in PSR2_MAN_TRK_CTL -> removing setting it on all platforms as there is not reason to have it different on older platforms. v2: commit message improved Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213064804.2077127-2-jouni.hogander@intel.com
2025-02-13Merge tag 'usb-serial-6.14-rc3' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.14-rc3 Here are some new modem device ids and a couple of related cleanups of the device id table. All have been in linux-next with no reported issues. * tag 'usb-serial-6.14-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: drop MeiG Smart defines USB: serial: option: fix Telit Cinterion FN990A name USB: serial: option: add Telit Cinterion FN990B compositions USB: serial: option: add MeiG Smart SLM828
2025-02-13PCI: Avoid FLR for Mediatek MT7922 WiFiBjorn Helgaas
The Mediatek MT7922 WiFi device advertises FLR support, but it apparently does not work, and all subsequent config reads return ~0: pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint pciback 0000:01:00.0: not ready 65535ms after FLR; giving up After an FLR, pci_dev_wait() waits for the device to become ready. Prior to d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS"), it polls PCI_COMMAND until it is something other that PCI_POSSIBLE_ERROR (~0). If it times out, pci_dev_wait() returns -ENOTTY and __pci_reset_function_locked() tries the next available reset method. Typically this is Secondary Bus Reset, which does work, so the MT7922 is eventually usable. After d591f6804e7e, if Configuration Request Retry Status Software Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it is something other than the special 0x0001 Vendor ID that indicates a completion with RRS status. When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001, i.e., the config read was completed with RRS, or a valid Vendor ID. On the MT7922, it seems that all config reads after FLR return ~0 indefinitely. When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's a valid Vendor ID and the device is now ready, so it returns with success. After pci_dev_wait() returns success, we restore config space and continue. Since the MT7922 is not actually ready after the FLR, the restore fails and the device is unusable. We considered changing pci_dev_wait() to continue polling if a PCI_VENDOR_ID read returns either 0x0001 or 0xffff. This "works" as it did before d591f6804e7e, although we have to wait for the timeout and then fall back to SBR. But it doesn't work for SR-IOV VFs, which *always* return 0xffff as the Vendor ID. Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely. This will cause fallback to another reset method, such as SBR. Link: https://lore.kernel.org/r/20250212193516.88741-1-helgaas@kernel.org Fixes: d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS") Link: https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149 Link: https://lore.kernel.org/r/Z4pHll_6GX7OUBzQ@mail-itl Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Cc: stable@vger.kernel.org
2025-02-13firmware: arm_scmi: imx: Correct tx size of scmi_imx_misc_ctrl_setPeng Fan
'struct scmi_imx_misc_ctrl_set_in' has a zero length array in the end, The sizeof will not count 'value[]', and hence Tx size will be smaller than actual size for Tx,and SCMI firmware will flag this as protocol error. Fix this by enlarge the Tx size with 'num * sizeof(__le32)' to count in the size of data. Fixes: 61c9f03e22fc ("firmware: arm_scmi: Add initial support for i.MX MISC protocol") Reviewed-by: Jacky Bai <ping.bai@nxp.com> Tested-by: Shengjiu Wang <shengjiu.wang@nxp.com> Acked-by: Jason Liu <jason.hui.liu@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Message-Id: <20250123063441.392555-1-peng.fan@oss.nxp.com> (sudeep.holla: Commit rewording and replace hardcoded sizeof(__le32) value) Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2025-02-13genirq: Remove unused CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGSAnup Patel
CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS is not used anymore, hence remove it. Fixes: f94a18249b7f ("genirq: Remove IRQ_MOVE_PCNTXT and related code") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250209041655.331470-7-apatel@ventanamicro.com
2025-02-13Xen/swiotlb: mark xen_swiotlb_fixup() __initJan Beulich
It's sole user (pci_xen_swiotlb_init()) is __init, too. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-ID: <e1198286-99ec-41c1-b5ad-e04e285836c9@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-13x86/xen: allow larger contiguous memory regions in PV guestsJuergen Gross
Today a PV guest (including dom0) can create 2MB contiguous memory regions for DMA buffers at max. This has led to problems at least with the megaraid_sas driver, which wants to allocate a 2.3MB DMA buffer. The limiting factor is the frame array used to do the hypercall for making the memory contiguous, which has 512 entries and is just a static array in mmu_pv.c. In order to not waste memory for non-PV guests, put the initial frame array into .init.data section and dynamically allocate an array from the .init_after_bootmem hook of PV guests. In case a contiguous memory area larger than the initially supported 2MB is requested, allocate a larger buffer for the frame list. Note that such an allocation is tried only after memory management has been initialized properly, which is tested via a flag being set in the .init_after_bootmem hook. Fixes: 9f40ec84a797 ("xen/swiotlb: add alignment check for dma buffers") Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Alan Robinson <Alan.Robinson@fujitsu.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-13xen/swiotlb: relax alignment requirementsJuergen Gross
When mapping a buffer for DMA via .map_page or .map_sg DMA operations, there is no need to check the machine frames to be aligned according to the mapped areas size. All what is needed in these cases is that the buffer is contiguous at machine level. So carve out the alignment check from range_straddles_page_boundary() and move it to a helper called by xen_swiotlb_alloc_coherent() and xen_swiotlb_free_coherent() directly. Fixes: 9f40ec84a797 ("xen/swiotlb: add alignment check for dma buffers") Reported-by: Jan Vejvalka <jan.vejvalka@lfmotol.cuni.cz> Tested-by: Jan Vejvalka <jan.vejvalka@lfmotol.cuni.cz> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-13arm64: rust: clean Rust 1.85.0 warning using softfloat targetMiguel Ojeda
Starting with Rust 1.85.0 (to be released 2025-02-20), `rustc` warns [1] about disabling neon in the aarch64 hardfloat target: warning: target feature `neon` cannot be toggled with `-Ctarget-feature`: unsound on hard-float targets because it changes float ABI | = note: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release! = note: for more information, see issue #116344 <https://github.com/rust-lang/rust/issues/116344> Thus, instead, use the softfloat target instead. While trying it out, I found that the kernel sanitizers were not enabled for that built-in target [2]. Upstream Rust agreed to backport the enablement for the current beta so that it is ready for the 1.85.0 release [3] -- thanks! However, that still means that before Rust 1.85.0, we cannot switch since sanitizers could be in use. Thus conditionally do so. Cc: stable@vger.kernel.org # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Matthew Maurer <mmaurer@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Ralf Jung <post@ralfj.de> Cc: Jubilee Young <workingjubilee@gmail.com> Link: https://github.com/rust-lang/rust/pull/133417 [1] Link: https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/arm64.20neon.20.60-Ctarget-feature.60.20warning/near/495358442 [2] Link: https://github.com/rust-lang/rust/pull/135905 [3] Link: https://github.com/rust-lang/rust/issues/116344 Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Trevor Gross <tmgross@umich.edu> Tested-by: Matthew Maurer <mmaurer@google.com> Reviewed-by: Ralf Jung <post@ralfj.de> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250210163732.281786-1-ojeda@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13MIPS: fix mips_get_syscall_arg() for o32Dmitry V. Levin
This makes ptrace/get_syscall_info selftest pass on mips o32 and mips64 o32 by fixing the following two test assertions: 1. get_syscall_info test assertion on mips o32: # get_syscall_info.c:218:get_syscall_info:Expected exp_args[5] (3134521044) == info.entry.args[4] (4911432) # get_syscall_info.c:219:get_syscall_info:wait #1: entry stop mismatch 2. get_syscall_info test assertion on mips64 o32: # get_syscall_info.c:209:get_syscall_info:Expected exp_args[2] (3134324433) == info.entry.args[1] (18446744072548908753) # get_syscall_info.c:210:get_syscall_info:wait #1: entry stop mismatch The first assertion happens due to mips_get_syscall_arg() trying to access another task's context but failing to do it properly because get_user() it calls just peeks at the current task's context. It usually does not crash because the default user stack always gets assigned the same VMA, but it is pure luck which mips_get_syscall_arg() wouldn't have if e.g. the stack was switched (via setcontext(3) or however) or a non-default process's thread peeked at, and in any case irrelevant data is obtained just as observed with the test case. mips_get_syscall_arg() ought to be using access_remote_vm() instead to retrieve the other task's stack contents, but given that the data has been already obtained and saved in `struct pt_regs' it would be an overkill. The first assertion is fixed for mips o32 by using struct pt_regs.args instead of get_user() to obtain syscall arguments. This approach works due to this piece in arch/mips/kernel/scall32-o32.S: /* * Ok, copy the args from the luser stack to the kernel stack. */ .set push .set noreorder .set nomacro load_a4: user_lw(t5, 16(t0)) # argument #5 from usp load_a5: user_lw(t6, 20(t0)) # argument #6 from usp load_a6: user_lw(t7, 24(t0)) # argument #7 from usp load_a7: user_lw(t8, 28(t0)) # argument #8 from usp loads_done: sw t5, PT_ARG4(sp) # argument #5 to ksp sw t6, PT_ARG5(sp) # argument #6 to ksp sw t7, PT_ARG6(sp) # argument #7 to ksp sw t8, PT_ARG7(sp) # argument #8 to ksp .set pop .section __ex_table,"a" PTR_WD load_a4, bad_stack_a4 PTR_WD load_a5, bad_stack_a5 PTR_WD load_a6, bad_stack_a6 PTR_WD load_a7, bad_stack_a7 .previous arch/mips/kernel/scall64-o32.S has analogous code for mips64 o32 that allows fixing the issue by obtaining syscall arguments from struct pt_regs.regs[4..11] instead of the erroneous use of get_user(). The second assertion is fixed by truncating 64-bit values to 32-bit syscall arguments. Fixes: c0ff3c53d4f9 ("MIPS: Enable HAVE_ARCH_TRACEHOOK.") Signed-off-by: Dmitry V. Levin <ldv@strace.io> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-02-13MIPS: Export syscall stack arguments properly for remote useMaciej W. Rozycki
We have several places across the kernel where we want to access another task's syscall arguments, such as ptrace(2), seccomp(2), etc., by making a call to syscall_get_arguments(). This works for register arguments right away by accessing the task's `regs' member of `struct pt_regs', however for stack arguments seen with 32-bit/o32 kernels things are more complicated. Technically they ought to be obtained from the user stack with calls to an access_remote_vm(), but we have an easier way available already. So as to be able to access syscall stack arguments as regular function arguments following the MIPS calling convention we copy them over from the user stack to the kernel stack in arch/mips/kernel/scall32-o32.S, in handle_sys(), to the current stack frame's outgoing argument space at the top of the stack, which is where the handler called expects to see its incoming arguments. This area is also pointed at by the `pt_regs' pointer obtained by task_pt_regs(). Make the o32 stack argument space a proper member of `struct pt_regs' then, by renaming the existing member from `pad0' to `args' and using generated offsets to access the space. No functional change though. With the change in place the o32 kernel stack frame layout at the entry to a syscall handler invoked by handle_sys() is therefore as follows: $sp + 68 -> | ... | <- pt_regs.regs[9] +---------------------+ $sp + 64 -> | $t0 | <- pt_regs.regs[8] +---------------------+ $sp + 60 -> | $a3/argument #4 | <- pt_regs.regs[7] +---------------------+ $sp + 56 -> | $a2/argument #3 | <- pt_regs.regs[6] +---------------------+ $sp + 52 -> | $a1/argument #2 | <- pt_regs.regs[5] +---------------------+ $sp + 48 -> | $a0/argument #1 | <- pt_regs.regs[4] +---------------------+ $sp + 44 -> | $v1 | <- pt_regs.regs[3] +---------------------+ $sp + 40 -> | $v0 | <- pt_regs.regs[2] +---------------------+ $sp + 36 -> | $at | <- pt_regs.regs[1] +---------------------+ $sp + 32 -> | $zero | <- pt_regs.regs[0] +---------------------+ $sp + 28 -> | stack argument #8 | <- pt_regs.args[7] +---------------------+ $sp + 24 -> | stack argument #7 | <- pt_regs.args[6] +---------------------+ $sp + 20 -> | stack argument #6 | <- pt_regs.args[5] +---------------------+ $sp + 16 -> | stack argument #5 | <- pt_regs.args[4] +---------------------+ $sp + 12 -> | psABI space for $a3 | <- pt_regs.args[3] +---------------------+ $sp + 8 -> | psABI space for $a2 | <- pt_regs.args[2] +---------------------+ $sp + 4 -> | psABI space for $a1 | <- pt_regs.args[1] +---------------------+ $sp + 0 -> | psABI space for $a0 | <- pt_regs.args[0] +---------------------+ holding user data received and with the first 4 frame slots reserved by the psABI for the compiler to spill the incoming arguments from $a0-$a3 registers (which it sometimes does according to its needs) and the next 4 frame slots designated by the psABI for any stack function arguments that follow. This data is also available for other tasks to peek/poke at as reqired and where permitted. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-02-13ASoC: imx-audmix: remove cpu_mclk which is from cpu dai deviceShengjiu Wang
When defer probe happens, there may be below error: platform 59820000.sai: Resources present before probing The cpu_mclk clock is from the cpu dai device, if it is not released, then the cpu dai device probe will fail for the second time. The cpu_mclk is used to get rate for rate constraint, rate constraint may be specific for each platform, which is not necessary for machine driver, so remove it. Fixes: b86ef5367761 ("ASoC: fsl: Add Audio Mixer machine driver") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20250213070518.547375-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-13arm64: Add missing registrations of hwcapsMark Brown
Commit 819935464cb2 ("arm64/hwcap: Describe 2024 dpISA extensions to userspace") added definitions for HWCAP_FPRCVT, HWCAP_F8MM8 and HWCAP_F8MM4 but did not include the crucial registration in arm64_elf_hwcaps. Add it. Fixes: 819935464cb2 ("arm64/hwcap: Describe 2024 dpISA extensions to userspace") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250212-arm64-fix-2024-dpisa-v2-1-67a1c11d6001@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13ACPI: GTDT: Relax sanity checking on Platform Timers array countOliver Upton
Perhaps unsurprisingly there are some platforms where the GTDT isn't quite right and the Platforms Timer array overflows the length of the overall table. While the recently-added sanity checking isn't wrong, it makes it impossible to boot the kernel on offending platforms. Try to hobble along and limit the Platform Timer count to the bounds of the table. Cc: Marc Zyngier <maz@kernel.org> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Zheng Zengkai <zhengzengkai@huawei.com> Cc: stable@vger.kernel.org Fixes: 263e22d6bd1f ("ACPI: GTDT: Tighten the check for the array of platform timer structures") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20250128001749.3132656-1-oliver.upton@linux.dev Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13arm64: amu: Delay allocating cpumask for AMU FIE supportBeata Michalska
For the time being, the amu_fie_cpus cpumask is being exclusively used by the AMU-related internals of FIE support and is guaranteed to be valid on every access currently made. Still the mask is not being invalidated on one of the error handling code paths, which leaves a soft spot with theoretical risk of UAF for CPUMASK_OFFSTACK cases. To make things sound, delay allocating said cpumask (for CPUMASK_OFFSTACK) avoiding otherwise nasty sanitising case failing to register the cpufreq policy notifications. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com> Reviewed-by: Sumit Gupta <sumitg@nvidia.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20250131155842.3839098-1-beata.michalska@arm.com Signed-off-by: Will Deacon <will@kernel.org>