linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-03-04	net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device	Maxime Chevallier
	ethnl_req_get_phydev() is used to lookup a phy_device, in the case an ethtool netlink command targets a specific phydev within a netdev's topology. It takes as a parameter a const struct nlattr *header that's used for error handling : if (!phydev) { NL_SET_ERR_MSG_ATTR(extack, header, "no phy matching phyindex"); return ERR_PTR(-ENODEV); } In the notify path after a ->set operation however, there's no request attributes available. The typical callsite for the above function looks like: phydev = ethnl_req_get_phydev(req_base, tb[ETHTOOL_A_XXX_HEADER], info->extack); So, when tb is NULL (such as in the ethnl notify path), we have a nice crash. It turns out that there's only the PLCA command that is in that case, as the other phydev-specific commands don't have a notification. This commit fixes the crash by passing the cmd index and the nlattr array separately, allowing NULL-checking it directly inside the helper. Fixes: c15e065b46dc ("net: ethtool: Allow passing a phy index for some commands") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reported-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20250301141114.97204-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	ppp: Fix KMSAN uninit-value warning with bpf	Jiayuan Chen
	Syzbot caught an "KMSAN: uninit-value" warning [1], which is caused by the ppp driver not initializing a 2-byte header when using socket filter. The following code can generate a PPP filter BPF program: ''' struct bpf_program fp; pcap_t *handle; handle = pcap_open_dead(DLT_PPP_PPPD, 65535); pcap_compile(handle, &fp, "ip and outbound", 0, 0); bpf_dump(&fp, 1); ''' Its output is: ''' (000) ldh [2] (001) jeq #0x21 jt 2 jf 5 (002) ldb [0] (003) jeq #0x1 jt 4 jf 5 (004) ret #65535 (005) ret #0 ''' Wen can find similar code at the following link: https://github.com/ppp-project/ppp/blob/master/pppd/options.c#L1680 The maintainer of this code repository is also the original maintainer of the ppp driver. As you can see the BPF program skips 2 bytes of data and then reads the 'Protocol' field to determine if it's an IP packet. Then it read the first byte of the first 2 bytes to determine the direction. The issue is that only the first byte indicating direction is initialized in current ppp driver code while the second byte is not initialized. For normal BPF programs generated by libpcap, uninitialized data won't be used, so it's not a problem. However, for carefully crafted BPF programs, such as those generated by syzkaller [2], which start reading from offset 0, the uninitialized data will be used and caught by KMSAN. [1] https://syzkaller.appspot.com/bug?extid=853242d9c9917165d791 [2] https://syzkaller.appspot.com/text?tag=ReproC&x=11994913980000 Cc: Paul Mackerras <paulus@samba.org> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+853242d9c9917165d791@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/000000000000dea025060d6bc3bc@google.com/ Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250228141408.393864-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	Merge branch 'fixes-for-ipa-v4-7'	Jakub Kicinski
	Luca Weiss says: ==================== Fixes for IPA v4.7 During bringup of IPA v4.7 unfortunately some bits were missed, and it couldn't be tested much back then due to missing features in tqftpserv which caused the modem to not enable correctly. Especially the last commit is important since it makes mobile data actually functional on SoCs with IPA v4.7 like SM6350 - used on the Fairphone 4. Before that, you'd get an IP address on the interface but then e.g. ping never got any response back. ==================== Link: https://patch.msgid.link/20250227-ipa-v4-7-fixes-v1-0-a88dd8249d8a@fairphone.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	net: ipa: Enable checksum for IPA_ENDPOINT_AP_MODEM_{RX,TX} for v4.7	Luca Weiss
	Enable the checksum option for these two endpoints in order to allow mobile data to actually work. Without this, no packets seem to make it through the IPA. Fixes: b310de784bac ("net: ipa: add IPA v4.7 support") Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Alex Elder <elder@riscstar.com> Link: https://patch.msgid.link/20250227-ipa-v4-7-fixes-v1-3-a88dd8249d8a@fairphone.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	net: ipa: Fix QSB data for v4.7	Luca Weiss
	As per downstream reference, max_writes should be 12 and max_reads should be 13. Fixes: b310de784bac ("net: ipa: add IPA v4.7 support") Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Alex Elder <elder@riscstar.com> Link: https://patch.msgid.link/20250227-ipa-v4-7-fixes-v1-2-a88dd8249d8a@fairphone.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	net: ipa: Fix v4.7 resource group names	Luca Weiss
	In the downstream IPA driver there's only one group defined for source and destination, and the destination group doesn't have a _DPL suffix. Fixes: b310de784bac ("net: ipa: add IPA v4.7 support") Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Reviewed-by: Alex Elder <elder@riscstar.com> Link: https://patch.msgid.link/20250227-ipa-v4-7-fixes-v1-1-a88dd8249d8a@fairphone.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	HID: Intel-thc-hid: Intel-quickspi: Correct device state after S4	Even Xu
	During S4 retore flow, quickspi device was resetted by driver and state was changed to RESETTED. It is needed to be change to ENABLED state after S4 re-initialization finished, otherwise, device will run in wrong state and HID input data will be dropped. Signed-off-by: Even Xu <even.xu@intel.com> Fixes: 6912aaf3fd24 ("HID: intel-thc-hid: intel-quickspi: Add PM implementation") Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: intel-thc-hid: Fix spelling mistake "intput" -> "input"	Colin Ian King
	There is a spelling mistake in a dev_err_once message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Even Xu <even.xu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: hid-steam: Fix use-after-free when detaching device	Vicki Pfau
	When a hid-steam device is removed it must clean up the client_hdev used for intercepting hidraw access. This can lead to scheduling deferred work to reattach the input device. Though the cleanup cancels the deferred work, this was done before the client_hdev itself is cleaned up, so it gets rescheduled. This patch fixes the ordering to make sure the deferred work is properly canceled. Reported-by: syzbot+0154da2d403396b2bd59@syzkaller.appspotmail.com Fixes: 79504249d7e2 ("HID: hid-steam: Move hidraw input (un)registering to work") Signed-off-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: debug: Fix spelling mistake "Messanger" -> "Messenger"	Colin Ian King
	There is a spelling mistake in a literal string. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: appleir: Fix potential NULL dereference at raw event handle	Daniil Dulov
	Syzkaller reports a NULL pointer dereference issue in input_event(). BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:68 [inline] BUG: KASAN: null-ptr-deref in _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline] BUG: KASAN: null-ptr-deref in is_event_supported drivers/input/input.c:67 [inline] BUG: KASAN: null-ptr-deref in input_event+0x42/0xa0 drivers/input/input.c:395 Read of size 8 at addr 0000000000000028 by task syz-executor199/2949 CPU: 0 UID: 0 PID: 2949 Comm: syz-executor199 Not tainted 6.13.0-rc4-syzkaller-00076-gf097a36ef88d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 kasan_report+0xd9/0x110 mm/kasan/report.c:602 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189 instrument_atomic_read include/linux/instrumented.h:68 [inline] _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline] is_event_supported drivers/input/input.c:67 [inline] input_event+0x42/0xa0 drivers/input/input.c:395 input_report_key include/linux/input.h:439 [inline] key_down drivers/hid/hid-appleir.c:159 [inline] appleir_raw_event+0x3e5/0x5e0 drivers/hid/hid-appleir.c:232 __hid_input_report.constprop.0+0x312/0x440 drivers/hid/hid-core.c:2111 hid_ctrl+0x49f/0x550 drivers/hid/usbhid/hid-core.c:484 __usb_hcd_giveback_urb+0x389/0x6e0 drivers/usb/core/hcd.c:1650 usb_hcd_giveback_urb+0x396/0x450 drivers/usb/core/hcd.c:1734 dummy_timer+0x17f7/0x3960 drivers/usb/gadget/udc/dummy_hcd.c:1993 __run_hrtimer kernel/time/hrtimer.c:1739 [inline] __hrtimer_run_queues+0x20a/0xae0 kernel/time/hrtimer.c:1803 hrtimer_run_softirq+0x17d/0x350 kernel/time/hrtimer.c:1820 handle_softirqs+0x206/0x8d0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0xfa/0x160 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0x90/0xb0 arch/x86/kernel/apic/apic.c:1049 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 __mod_timer+0x8f6/0xdc0 kernel/time/timer.c:1185 add_timer+0x62/0x90 kernel/time/timer.c:1295 schedule_timeout+0x11f/0x280 kernel/time/sleep_timeout.c:98 usbhid_wait_io+0x1c7/0x380 drivers/hid/usbhid/hid-core.c:645 usbhid_init_reports+0x19f/0x390 drivers/hid/usbhid/hid-core.c:784 hiddev_ioctl+0x1133/0x15b0 drivers/hid/usbhid/hiddev.c:794 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl fs/ioctl.c:892 [inline] __x64_sys_ioctl+0x190/0x200 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> This happens due to the malformed report items sent by the emulated device which results in a report, that has no fields, being added to the report list. Due to this appleir_input_configured() is never called, hidinput_connect() fails which results in the HID_CLAIMED_INPUT flag is not being set. However, it does not make appleir_probe() fail and lets the event callback to be called without the associated input device. Thus, add a check for the HID_CLAIMED_INPUT flag and leave the event hook early if the driver didn't claim any input_dev for some reason. Moreover, some other hid drivers accessing input_dev in their event callbacks do have similar checks, too. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 9a4a5574ce42 ("HID: appleir: add support for Apple ir devices") Cc: stable@vger.kernel.org Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: apple: disable Fn key handling on the Omoton KB066	Alex Henrie
	Remove the fixup to make the Omoton KB066's F6 key F6 when not holding Fn. That was really just a hack to allow typing F6 in fnmode>0, and it didn't fix any of the other F keys that were likewise untypable in fnmode>0. Instead, because the Omoton's Fn key is entirely internal to the keyboard, completely disable Fn key translation when an Omoton is detected, which will prevent the hid-apple driver from interfering with the keyboard's built-in Fn key handling. All of the F keys, including F6, are then typable when Fn is held. The Omoton KB066 and the Apple A1255 both have HID product code 05ac:022c. The self-reported name of every original A1255 when they left the factory was "Apple Wireless Keyboard". By default, Mac OS changes the name to "<username>'s keyboard" when pairing with the keyboard, but Mac OS allows the user to set the internal name of Apple keyboards to anything they like. The Omoton KB066's name, on the other hand, is not configurable: It is always "Bluetooth Keyboard". Because that name is so generic that a user might conceivably use the same name for a real Apple keyboard, detect Omoton keyboards based on both having that exact name and having HID product code 022c. Fixes: 819083cb6eed ("HID: apple: fix up the F6 key on the Omoton KB066 keyboard") Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	HID: i2c-hid: improve i2c_hid_get_report error message	Wentao Guan
	We have two places to print "failed to set a report to ...", use "get a report from" instead of "set a report to", it makes people who knows less about the module to know where the error happened. Before: i2c_hid_acpi i2c-FTSC1000:00: failed to set a report to device: -11 After: i2c_hid_acpi i2c-FTSC1000:00: failed to get a report from device: -11 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-03-04	platform/x86/amd: pmf: Add balanced-performance to hidden choices	Mario Limonciello
	Acer's WMI driver uses balanced-performance but AMD-PMF doesn't. In case a machine binds with both drivers let amd-pmf use balanced-performance as well. Fixes: 688834743d67 ("ACPI: platform_profile: Allow multiple handlers") Suggested-by: Antheas Kapenekakis <lkml@antheas.dev> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Antheas Kapenekakis <lkml@antheas.dev> Tested-by: Derek J. Clark <derekjohn.clark@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250228170155.2623386-4-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-03-04	platform/x86/amd: pmf: Add 'quiet' to hidden choices	Mario Limonciello
	When amd-pmf and asus-wmi are both bound no low power option shows up in sysfs. Add a hidden choice for amd-pmf to support 'quiet' mode to let both bind. Fixes: 688834743d67 ("ACPI: platform_profile: Allow multiple handlers") Suggested-by: Antheas Kapenekakis <lkml@antheas.dev> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Antheas Kapenekakis <lkml@antheas.dev> Tested-by: Derek J. Clark <derekjohn.clark@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250228170155.2623386-3-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-03-04	ACPI: platform_profile: Add support for hidden choices	Mario Limonciello
	When two drivers don't support all the same profiles the legacy interface only exports the common profiles. This causes problems for cases where one driver uses low-power but another uses quiet because the result is that neither is exported to sysfs. To allow two drivers to disagree, add support for "hidden choices". Hidden choices are platform profiles that a driver supports to be compatible with the platform profile of another driver. Fixes: 688834743d67 ("ACPI: platform_profile: Allow multiple handlers") Reported-by: Antheas Kapenekakis <lkml@antheas.dev> Closes: https://lore.kernel.org/platform-driver-x86/e64b771e-3255-42ad-9257-5b8fc6c24ac9@gmx.de/T/#mc068042dd29df36c16c8af92664860fc4763974b Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Antheas Kapenekakis <lkml@antheas.dev> Tested-by: Derek J. Clark <derekjohn.clark@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250228170155.2623386-2-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-03-04	drm/xe: Fix GT "for each engine" workarounds	Tvrtko Ursulin
	Any rules using engine matching are currently broken due RTP processing happening too in early init, before the list of hardware engines has been initialised. Fix this by moving workaround processing to later in the driver probe sequence, to just before the processed list is used for the first time. Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649 should be present while we see, before: GT Workarounds 14011059788 14015795083 And with the patch: GT Workarounds 14011060649 14011059788 14015795083 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v6.11+ Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 25d434cef791e03cf40680f5441b576c639bfa84) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-04	drm/xe/userptr: properly setup pfn_flags_mask	Matthew Auld
	Currently we just leave it uninitialised, which at first looks harmless, however we also don't zero out the pfn array, and with pfn_flags_mask the idea is to be able set individual flags for a given range of pfn or completely ignore them, outside of default_flags. So here we end up with pfn[i] & pfn_flags_mask, and if both are uninitialised we might get back an unexpected flags value, like asking for read only with default_flags, but getting back write on top, leading to potentially bogus behaviour. To fix this ensure we zero the pfn_flags_mask, such that hmm only considers the default_flags and not also the initial pfn[i] value. v2 (Thomas): - Prefer proper initializer. Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226174748.294285-2-matthew.auld@intel.com (cherry picked from commit dd8c01e42f4c5c1eaf02f003d7d588ba6706aa71) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-04	drm/i915/mst: update max stream count to match number of pipes	Jani Nikula
	We create the stream encoders and attach connectors for each pipe we have. As the number of pipes has increased, we've failed to update the topology manager maximum number of payloads to match that. Bump up the max stream count to match number of pipes, enabling the fourth stream on platforms that support four pipes. Cc: stable@vger.kernel.org Cc: Imre Deak <imre.deak@intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226135626.1956012-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 15bccbfb78d63a2a621b30caff8b9424160c6c89) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-04	drm/xe: Remove double pageflip	Maarten Lankhorst
	This is already handled below in the code by fixup_initial_plane_config. Fixes: a8153627520a ("drm/i915: Try to relocate the BIOS fb to the start of ggtt") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210083111.230484-3-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> (cherry picked from commit 2218704997979fbf11765281ef752f07c5cf25bb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05	docs: Kconfig: fix defconfig description	Satoru Takeuchi
	Commit 2a86f6612164 ("kbuild: use KBUILD_DEFCONFIG as the fallback for DEFCONFIG_LIST") removed arch/$ARCH/defconfig; however, the document has not been updated to reflect this change yet. Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-05	kbuild: hdrcheck: fix cross build with clang	Arnd Bergmann
	The headercheck tries to call clang with a mix of compiler arguments that don't include the target architecture. When building e.g. x86 headers on arm64, this produces a warning like clang: warning: unknown platform, assuming -mfloat-abi=soft Add in the KBUILD_CPPFLAGS, which contain the target, in order to make it build properly. See also 1b71c2fb04e7 ("kbuild: userprogs: fix bitsize and target detection on clang"). Reviewed-by: Nathan Chancellor <nathan@kernel.org> Fixes: feb843a469fb ("kbuild: add $(CLANG_FLAGS) to KBUILD_CPPFLAGS") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-04	Merge tag 'devicetree-fixes-for-6.14-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fix from Rob Herring: - Revert reserved-memory 'alignment' property to use '#address-cells' instead of '#size-cells'. What's in use trumps the spec. * tag 'devicetree-fixes-for-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: Revert "of: reserved-memory: Fix using wrong number of cells to get property 'alignment'"
2025-03-05	kbuild: userprogs: use correct lld when linking through clang	Thomas Weißschuh
	The userprog infrastructure links objects files through $(CC). Either explicitly by manually calling $(CC) on multiple object files or implicitly by directly compiling a source file to an executable. The documentation at Documentation/kbuild/llvm.rst indicates that ld.lld would be used for linking if LLVM=1 is specified. However clang instead will use either a globally installed cross linker from $PATH called ${target}-ld or fall back to the system linker, which probably does not support crosslinking. For the normal kernel build this is not an issue because the linker is always executed directly, without the compiler being involved. Explicitly pass --ld-path to clang so $(LD) is respected. As clang 13.0.1 is required to build the kernel, this option is available. Fixes: 7f3a59db274c ("kbuild: add infrastructure to build userspace programs") Cc: stable@vger.kernel.org # needs wrapping in $(cc-option) for < 6.9 Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-04	fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex	Linus Torvalds
	pipe_readable(), pipe_writable(), and pipe_poll() can read "pipe->head" and "pipe->tail" outside of "pipe->mutex" critical section. When the head and the tail are read individually in that order, there is a window for interruption between the two reads in which both the head and the tail can be updated by concurrent readers and writers. One of the problematic scenarios observed with hackbench running multiple groups on a large server on a particular pipe inode is as follows: pipe->head = 36 pipe->tail = 36 hackbench-118762 [057] ..... 1029.550548: pipe_write: wakes up: pipe not full hackbench-118762 [057] ..... 1029.550548: pipe_write: head: 36 -> 37 [tail: 36] hackbench-118762 [057] ..... 1029.550548: pipe_write: wake up next reader 118740 hackbench-118762 [057] ..... 1029.550548: pipe_write: wake up next writer 118768 hackbench-118768 [206] ..... 1029.55055X: pipe_write: writer wakes up hackbench-118768 [206] ..... 1029.55055X: pipe_write: head = READ_ONCE(pipe->head) [37] ... CPU 206 interrupted (exact wakeup was not traced but 118768 did read head at 37 in traces) hackbench-118740 [057] ..... 1029.550558: pipe_read: reader wakes up: pipe is not empty hackbench-118740 [057] ..... 1029.550558: pipe_read: tail: 36 -> 37 [head = 37] hackbench-118740 [057] ..... 1029.550559: pipe_read: pipe is empty; wakeup writer 118768 hackbench-118740 [057] ..... 1029.550559: pipe_read: sleeps hackbench-118766 [185] ..... 1029.550592: pipe_write: New writer comes in hackbench-118766 [185] ..... 1029.550592: pipe_write: head: 37 -> 38 [tail: 37] hackbench-118766 [185] ..... 1029.550592: pipe_write: wakes up reader 118766 hackbench-118740 [185] ..... 1029.550598: pipe_read: reader wakes up; pipe not empty hackbench-118740 [185] ..... 1029.550599: pipe_read: tail: 37 -> 38 [head: 38] hackbench-118740 [185] ..... 1029.550599: pipe_read: pipe is empty hackbench-118740 [185] ..... 1029.550599: pipe_read: reader sleeps; wakeup writer 118768 ... CPU 206 switches back to writer hackbench-118768 [206] ..... 1029.550601: pipe_write: tail = READ_ONCE(pipe->tail) [38] hackbench-118768 [206] ..... 1029.550601: pipe_write: pipe_full()? (u32)(37 - 38) >= 16? Yes hackbench-118768 [206] ..... 1029.550601: pipe_write: writer goes back to sleep [ Tasks 118740 and 118768 can then indefinitely wait on each other. ] The unsigned arithmetic in pipe_occupancy() wraps around when "pipe->tail > pipe->head" leading to pipe_full() returning true despite the pipe being empty. The case of genuine wraparound of "pipe->head" is handled since pipe buffer has data allowing readers to make progress until the pipe->tail wraps too after which the reader will wakeup a sleeping writer, however, mistaking the pipe to be full when it is in fact empty can lead to readers and writers waiting on each other indefinitely. This issue became more problematic and surfaced as a hang in hackbench after the optimization in commit aaec5a95d596 ("pipe_read: don't wake up the writer if the pipe is still full") significantly reduced the number of spurious wakeups of writers that had previously helped mask the issue. To avoid missing any updates between the reads of "pipe->head" and "pipe->write", unionize the two with a single unsigned long "pipe->head_tail" member that can be loaded atomically. Using "pipe->head_tail" to read the head and the tail ensures the lockless checks do not miss any updates to the head or the tail and since those two are only updated under "pipe->mutex", it ensures that the head is always ahead of, or equal to the tail resulting in correct calculations. [ prateek: commit log, testing on x86 platforms. ] Reported-and-debugged-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Closes: https://lore.kernel.org/lkml/e813814e-7094-4673-bc69-731af065a0eb@amd.com/ Reported-by: Alexey Gladkov <legion@kernel.org> Closes: https://lore.kernel.org/all/Z8Wn0nTvevLRG_4m@example.org/ Fixes: 8cefc107ca54 ("pipe: Use head and tail pointers for the ring, not cursor and length") Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-04	KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM	Xiaoyao Li
	Fix a goof where KVM sets CPUID.0x80000022.EAX to CPUID.0x80000022.EBX instead of zeroing both when PERFMON_V2 isn't supported by KVM. In practice, barring a buggy CPU (or vCPU model when running nested) only the !enable_pmu case is affected, as KVM always supports PERFMON_V2 if it's available in hardware, i.e. CPUID.0x80000022.EBX will be '0' if PERFMON_V2 is unsupported. For the !enable_pmu case, the bug is relatively benign as KVM will refuse to enable PMU capabilities, but a VMM that reflects KVM's supported CPUID into the guest could inadvertently induce #GPs in the guest due to advertising support for MSRs that KVM refuses to emulate. Fixes: 94cdeebd8211 ("KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022") Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20250304082314.472202-3-xiaoyao.li@intel.com [sean: massage shortlog and changelog, tag for stable] Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-04	Merge tag 'wireless-2025-03-04' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== bugfixes for 6.14: * regressions from this cycle: - mac80211: fix sparse warning for monitor - nl80211: disable multi-link reconfiguration (needs fixing) * older issues: - cfg80211: reject badly combined cooked monitor, fix regulatory hint validity checks - mac80211: handle TXQ flush w/o driver per-sta flush, fix debugfs for monitor, fix element inheritance - iwlwifi: fix rfkill, dead firmware handling, rate API version, free A-MSDU handling, avoid large allocations, fix string format - brcmfmac: fix power handling on some boards * tag 'wireless-2025-03-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: nl80211: disable multi-link reconfiguration wifi: cfg80211: regulatory: improve invalid hints checking wifi: brcmfmac: keep power during suspend if board requires it wifi: mac80211: Fix sparse warning for monitor_sdata wifi: mac80211: fix vendor-specific inheritance wifi: mac80211: fix MLE non-inheritance parsing wifi: iwlwifi: Fix A-MSDU TSO preparation wifi: iwlwifi: Free pages allocated when failing to build A-MSDU wifi: iwlwifi: limit printed string from FW file wifi: iwlwifi: mvm: use the right version of the rate API wifi: iwlwifi: mvm: don't try to talk to a dead firmware wifi: iwlwifi: mvm: don't dump the firmware state upon RFKILL while suspend wifi: iwlwifi: mvm: clean up ROC on failure wifi: iwlwifi: fw: avoid using an uninitialized variable wifi: iwlwifi: fw: allocate chained SG tables for dump wifi: mac80211: remove debugfs dir for virtual monitor wifi: mac80211: Cleanup sta TXQs on flush wifi: nl80211: reject cooked mode if it is set along with other flags ==================== Link: https://patch.msgid.link/20250304124435.126272-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04	s390/ftrace: Fix return address recovery of traced function	Sumanth Korikkar
	When fgraph is enabled the traced function return address is replaced with trampoline return_to_handler(). The original return address of the traced function is saved in per task return stack along with a stack pointer for reliable stack unwinding via function_graph_enter_regs(). During stack unwinding e.g. for livepatching, ftrace_graph_ret_addr() identifies the original return address of the traced function with the saved stack pointer. With a recent change, the stack pointers passed to ftrace_graph_ret_addr() and function_graph_enter_regs() do not match anymore, and therefore the original return address is not found. Pass the correct stack pointer to function_graph_enter_regs() to fix this. Fixes: 7495e179b478 ("s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04	selftests/vDSO: Fix GNU hash table entry size for s390x	Thomas Weißschuh
	Commit 14be4e6f3522 ("selftests: vDSO: fix ELF hash table entry size for s390x") changed the type of the ELF hash table entries to 64bit on s390x. However the GNU hash tables entries are always 32bit. The "bucket" pointer is shared between both hash algorithms. On s390, this caused the GNU hash algorithm to access its 32-bit entries as if they were 64-bit, triggering compiler warnings (assignment between "Elf64_Xword " and "Elf64_Word ") and runtime crashes. Introduce a new dedicated "gnu_bucket" pointer which is used by the GNU hash. Fixes: e0746bde6f82 ("selftests/vDSO: support DT_GNU_HASH") Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250217-selftests-vdso-s390-gnu-hash-v2-1-f6c2532ffe2a@linutronix.de Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04	s390/traps: Fix test_monitor_call() inline assembly	Heiko Carstens
	The test_monitor_call() inline assembly uses the xgr instruction, which also modifies the condition code, to clear a register. However the clobber list of the inline assembly does not specify that the condition code is modified, which may lead to incorrect code generation. Use the lhi instruction instead to clear the register without that the condition code is modified. Furthermore this limits clearing to the lower 32 bits of val, since its type is int. Fixes: 17248ea03674 ("s390: fix __EMIT_BUG() macro") Cc: stable@vger.kernel.org Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04	ALSA: hda: realtek: fix incorrect IS_REACHABLE() usage	Arnd Bergmann
	The alternative path leads to a build error after a recent change: sound/pci/hda/patch_realtek.c: In function 'alc233_fixup_lenovo_low_en_micmute_led': include/linux/stddef.h:9:14: error: called object is not a function or function pointer 9 \| #define NULL ((void *)0) \| ^ sound/pci/hda/patch_realtek.c:5041:49: note: in expansion of macro 'NULL' 5041 \| #define alc233_fixup_lenovo_line2_mic_hotkey NULL \| ^~~~ sound/pci/hda/patch_realtek.c:5063:9: note: in expansion of macro 'alc233_fixup_lenovo_line2_mic_hotkey' 5063 \| alc233_fixup_lenovo_line2_mic_hotkey(codec, fix, action); Using IS_REACHABLE() is somewhat questionable here anyway since it leads to the input code not working when the HDA driver is builtin but input is in a loadable module. Replace this with a hard compile-time dependency on CONFIG_INPUT. In practice this won't chance much other than solve the compiler error because it is rare to require sound output but no input support. Fixes: f603b159231b ("ALSA: hda/realtek - add supported Mic Mute LED for Lenovo platform") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250304142620.582191-1-arnd@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-04	usb: xhci: Fix host controllers "dying" after suspend and resume	Michal Pecio
	A recent cleanup went a bit too far and dropped clearing the cycle bit of link TRBs, so it stays different from the rest of the ring half of the time. Then a race occurs: if the xHC reaches such link TRB before more commands are queued, the link's cycle bit unintentionally matches the xHC's cycle so it follows the link and waits for further commands. If more commands are queued before the xHC gets there, inc_enq() flips the bit so the xHC later sees a mismatch and stops executing commands. This function is called before suspend and 50% of times after resuming the xHC is doomed to get stuck sooner or later. Then some Stop Endpoint command fails to complete in 5 seconds and this shows up xhci_hcd 0000:00:10.0: xHCI host not responding to stop endpoint command xhci_hcd 0000:00:10.0: xHCI host controller not responding, assume dead xhci_hcd 0000:00:10.0: HC died; cleaning up followed by loss of all USB decives on the affected bus. That's if you are lucky, because if Set Deq gets stuck instead, the failure is silent. Likely responsible for kernel bug 219824. I found this while searching for possible causes of that regression and reproduced it locally before hearing back from the reporter. To repro, simply wait for link cycle to become set (debugfs), then suspend, resume and wait. To accelerate the failure I used a script which repeatedly starts and stops a UVC camera. Some HCs get fully reinitialized on resume and they are not affected. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219824 Fixes: 36b972d4b7ce ("usb: xhci: improve xhci_clear_command_ring()") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250304113147.3322584-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-04	net: hns3: make sure ptp clock is unregister and freed if ↵	Peiyang Wang
	hclge_ptp_get_cycle returns an error During the initialization of ptp, hclge_ptp_get_cycle might return an error and returned directly without unregister clock and free it. To avoid that, call hclge_ptp_destroy_clock to unregist and free clock if hclge_ptp_get_cycle failed. Fixes: 8373cd38a888 ("net: hns3: change the method of obtaining default ptp cycle") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250228105258.1243461-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04	wifi: nl80211: disable multi-link reconfiguration	Johannes Berg
	Both the APIs in cfg80211 and the implementation in mac80211 aren't really ready yet, we have a large number of fixes. In addition, it's not possible right now to discover support for this feature from userspace. Disable it for now, there's no rush. Link: https://patch.msgid.link/20250303110538.fbeef42a5687.Iab122c22137e5675ebd99f5c031e30c0e5c7af2e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-04	net: dsa: rtl8366rb: don't prompt users for LED control	Jakub Kicinski
	Make NET_DSA_REALTEK_RTL8366RB_LEDS a hidden symbol. It seems very unlikely user would want to intentionally disable it. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250228004534.3428681-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04	be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink	Nikolay Aleksandrov
	Partially revert commit b71724147e73 ("be2net: replace polling with sleeping in the FW completion path") w.r.t mcc mutex it introduces and the use of usleep_range. The be2net be_ndo_bridge_getlink() callback is called with rcu_read_lock, so this code has been broken for a long time. Both the mutex_lock and the usleep_range can cause the issue Ian Kumlien reported[1]. The call path is: be_ndo_bridge_getlink -> be_cmd_get_hsw_config -> be_mcc_notify_wait -> be_mcc_wait_compl -> usleep_range() [1] https://lore.kernel.org/netdev/CAA85sZveppNgEVa_FD+qhOMtG_AavK9_mFiU+jWrMtXmwqefGA@mail.gmail.com/ Tested-by: Ian Kumlien <ian.kumlien@gmail.com> Fixes: b71724147e73 ("be2net: replace polling with sleeping in the FW completion path") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250227164129.1201164-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04	x86/cpu: Properly parse CPUID leaf 0x2 TLB descriptor 0x63	Ahmed S. Darwish
	CPUID leaf 0x2's one-byte TLB descriptors report the number of entries for specific TLB types, among other properties. Typically, each emitted descriptor implies the same number of entries for its respective TLB type(s). An emitted 0x63 descriptor is an exception: it implies 4 data TLB entries for 1GB pages and 32 data TLB entries for 2MB or 4MB pages. For the TLB descriptors parsing code, the entry count for 1GB pages is encoded at the intel_tlb_table[] mapping, but the 2MB/4MB entry count is totally ignored. Update leaf 0x2's parsing logic 0x2 to account for 32 data TLB entries for 2MB/4MB pages implied by the 0x63 descriptor. Fixes: e0ba94f14f74 ("x86/tlb_info: get last level TLB entry number of CPU") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250304085152.51092-4-darwi@linutronix.de
2025-03-04	x86/cpu: Validate CPUID leaf 0x2 EDX output	Ahmed S. Darwish
	CPUID leaf 0x2 emits one-byte descriptors in its four output registers EAX, EBX, ECX, and EDX. For these descriptors to be valid, the most significant bit (MSB) of each register must be clear. Leaf 0x2 parsing at intel.c only validated the MSBs of EAX, EBX, and ECX, but left EDX unchecked. Validate EDX's most-significant bit as well. Fixes: e0ba94f14f74 ("x86/tlb_info: get last level TLB entry number of CPU") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250304085152.51092-3-darwi@linutronix.de
2025-03-04	x86/cacheinfo: Validate CPUID leaf 0x2 EDX output	Ahmed S. Darwish
	CPUID leaf 0x2 emits one-byte descriptors in its four output registers EAX, EBX, ECX, and EDX. For these descriptors to be valid, the most significant bit (MSB) of each register must be clear. The historical Git commit: 019361a20f016 ("- pre6: Intel: start to add Pentium IV specific stuff (128-byte cacheline etc)...") introduced leaf 0x2 output parsing. It only validated the MSBs of EAX, EBX, and ECX, but left EDX unchecked. Validate EDX's most-significant bit. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250304085152.51092-2-darwi@linutronix.de
2025-03-04	mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq	Uladzislau Rezki (Sony)
	Currently kvfree_rcu() APIs use a system workqueue which is "system_unbound_wq" to driver RCU machinery to reclaim a memory. Recently, it has been noted that the following kernel warning can be observed: <snip> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_scan_work is flushing !WQ_MEM_RECLAIM events_unbound:kfree_rcu_work WARNING: CPU: 21 PID: 330 at kernel/workqueue.c:3719 check_flush_dependency+0x112/0x120 Modules linked in: intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) ... CPU: 21 UID: 0 PID: 330 Comm: kworker/u144:6 Tainted: G E 6.13.2-0_g925d379822da #1 Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM20 02/01/2023 Workqueue: nvme-wq nvme_scan_work RIP: 0010:check_flush_dependency+0x112/0x120 Code: 05 9a 40 14 02 01 48 81 c6 c0 00 00 00 48 8b 50 18 48 81 c7 c0 00 00 00 48 89 f9 48 ... RSP: 0018:ffffc90000df7bd8 EFLAGS: 00010082 RAX: 000000000000006a RBX: ffffffff81622390 RCX: 0000000000000027 RDX: 00000000fffeffff RSI: 000000000057ffa8 RDI: ffff88907f960c88 RBP: 0000000000000000 R08: ffffffff83068e50 R09: 000000000002fffd R10: 0000000000000004 R11: 0000000000000000 R12: ffff8881001a4400 R13: 0000000000000000 R14: ffff88907f420fb8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88907f940000(0000) knlGS:0000000000000000 CR2: 00007f60c3001000 CR3: 000000107d010005 CR4: 00000000007726f0 PKRU: 55555554 Call Trace: <TASK> ? __warn+0xa4/0x140 ? check_flush_dependency+0x112/0x120 ? report_bug+0xe1/0x140 ? check_flush_dependency+0x112/0x120 ? handle_bug+0x5e/0x90 ? exc_invalid_op+0x16/0x40 ? asm_exc_invalid_op+0x16/0x20 ? timer_recalc_next_expiry+0x190/0x190 ? check_flush_dependency+0x112/0x120 ? check_flush_dependency+0x112/0x120 __flush_work.llvm.1643880146586177030+0x174/0x2c0 flush_rcu_work+0x28/0x30 kvfree_rcu_barrier+0x12f/0x160 kmem_cache_destroy+0x18/0x120 bioset_exit+0x10c/0x150 disk_release.llvm.6740012984264378178+0x61/0xd0 device_release+0x4f/0x90 kobject_put+0x95/0x180 nvme_put_ns+0x23/0xc0 nvme_remove_invalid_namespaces+0xb3/0xd0 nvme_scan_work+0x342/0x490 process_scheduled_works+0x1a2/0x370 worker_thread+0x2ff/0x390 ? pwq_release_workfn+0x1e0/0x1e0 kthread+0xb1/0xe0 ? __kthread_parkme+0x70/0x70 ret_from_fork+0x30/0x40 ? __kthread_parkme+0x70/0x70 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- <snip> To address this switch to use of independent WQ_MEM_RECLAIM workqueue, so the rules are not violated from workqueue framework point of view. Apart of that, since kvfree_rcu() does reclaim memory it is worth to go with WQ_MEM_RECLAIM type of wq because it is designed for this purpose. Fixes: 6c6c47b063b5 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()"), Reported-by: Keith Busch <kbusch@kernel.org> Closes: https://lore.kernel.org/all/Z7iqJtCjHKfo8Kho@kbusch-mbp/ Cc: stable@vger.kernel.org Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04	usb: dwc3: Set SUSPENDENABLE soon after phy init	Thinh Nguyen
	After phy initialization, some phy operations can only be executed while in lower P states. Ensure GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are set soon after initialization to avoid blocking phy ops. Previously the SUSPENDENABLE bits are only set after the controller initialization, which may not happen right away if there's no gadget driver or xhci driver bound. Revise this to clear SUSPENDENABLE bits only when there's mode switching (change in GCTL.PRTCAPDIR). Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Cc: stable <stable@kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/633aef0afee7d56d2316f7cc3e1b2a6d518a8cc9.1738280911.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-03	KVM: arm64: nv: Fail KVM init if asking for NV without GICv3	Marc Zyngier
	Although there is nothing in NV that is fundamentally incompatible with the lack of GICv3, there is no HW implementation without one, at least on the virtual side (yes, even fruits have some form of vGICv3). We therefore make the decision to require GICv3, which will only affect models such as QEMU. Booting with a GICv2 or something even more exotic while asking for NV will result in KVM being disabled. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-17-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Allow userland to set VGIC maintenance IRQ	Andre Przywara
	The VGIC maintenance IRQ signals various conditions about the LRs, when the GIC's virtualization extension is used. So far we didn't need it, but nested virtualization needs to know about this interrupt, so add a userland interface to setup the IRQ number. The architecture mandates that it must be a PPI, on top of that this code only exports a per-device option, so the PPI is the same on all VCPUs. Signed-off-by: Andre Przywara <andre.przywara@arm.com> [added some bits of documentation] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-16-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup	Marc Zyngier
	Popular HW that is able to use NV also has a broken vgic implementation that requires trapping. On such HW, propagate the host trap bits into the guest's shadow ICH_HCR_EL2 register, making sure we don't allow an L2 guest to bring the system down. This involves a bit of tweaking so that the emulation code correctly poicks up the shadow state as needed, and to only partially sync ICH_HCR_EL2 back with the guest state to capture EOIcount. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-15-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Propagate used_lrs between L1 and L0 contexts	Marc Zyngier
	We have so far made sure that L1 and L0 vgic contexts were totally independent. There is however one spot of bother with this approach, and that's in the GICv3 emulation code required by our fruity friends. The issue is that the emulation code needs to know how many LRs are in flight. And while it is easy to reach the L0 version through the vcpu pointer, doing so for the L1 is much more complicated, as these structures are private to the nested code. We could simply expose that structure and pick one or the other depending on the context, but this seems extra complexity for not much benefit. Instead, just propagate the number of used LRs from the nested code into the L0 context, and be done with it. Should this become a burden, it can be easily rectified. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-14-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Request vPE doorbell upon nested ERET to L2	Oliver Upton
	Running an L2 guest with GICv4 enabled goes absolutely nowhere, and gets into a vicious cycle of nested ERET followed by nested exception entry into the L1. When KVM does a put on a runnable vCPU, it marks the vPE as nonresident but does not request a doorbell IRQ. Behind the scenes in the ITS driver's view of the vCPU, its_vpe::pending_last gets set to true to indicate that context is still runnable. This comes to a head when doing the nested ERET into L2. The vPE doesn't get scheduled on the redistributor as it is exclusively part of the L1's VGIC context. kvm_vgic_vcpu_pending_irq() returns true because the vPE appears runnable, and KVM does a nested exception entry into the L1 before L2 ever gets off the ground. This issue can be papered over by requesting a doorbell IRQ when descheduling a vPE as part of a nested ERET. KVM needs this anyway to kick the vCPU out of the L2 when an IRQ becomes pending for the L1. Link: https://lore.kernel.org/r/20240823212703.3576061-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-13-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting	Jintack Lim
	Forward exceptions due to WFI or WFE instructions to the virtual EL2 if they are not coming from the virtual EL2 and virtual HCR_EL2.TWx is set. Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-12-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Add Maintenance Interrupt emulation	Marc Zyngier
	Emulating the vGIC means emulating the dreaded Maintenance Interrupt. This is a two-pronged problem: - while running L2, getting an MI translates into an MI injected in the L1 based on the state of the HW. - while running L1, we must accurately reflect the state of the MI line, based on the in-memory state. The MI INTID is added to the distributor, as expected on any virtualisation-capable implementation, and further patches will allow its configuration. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-11-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Handle L2->L1 transition on interrupt injection	Marc Zyngier
	An interrupt being delivered to L1 while running L2 must result in the correct exception being delivered to L1. This means that if, on entry to L2, we found ourselves with pending interrupts in the L1 distributor, we need to take immediate action. This is done by posting a request which will prevent the entry in L2, and deliver an IRQ exception to L1, forcing the switch. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-10-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Nested GICv3 emulation	Marc Zyngier
	When entering a nested VM, we set up the hypervisor control interface based on what the guest hypervisor has set. Especially, we investigate each list register written by the guest hypervisor whether HW bit is set. If so, we translate hw irq number from the guest's point of view to the real hardware irq number if there is a mapping. Co-developed-by: Jintack Lim <jintack@cs.columbia.edu> Signed-off-by: Jintack Lim <jintack@cs.columbia.edu> [Christoffer: Redesigned execution flow around vcpu load/put] Co-developed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> [maz: Rewritten to support GICv3 instead of GICv2, NV2 support] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-9-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>