linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-03-21	ALSA: timer: Don't take register_mutex with copy_from/to_user()	Takashi Iwai
	The infamous mmap_lock taken in copy_from/to_user() can be often problematic when it's called inside another mutex, as they might lead to deadlocks. In the case of ALSA timer code, the bad pattern is with guard(mutex)(&register_mutex) that covers copy_from/to_user() -- which was mistakenly introduced at converting to guard(), and it had been carefully worked around in the past. This patch fixes those pieces simply by moving copy_from/to_user() out of the register mutex lock again. Fixes: 3923de04c817 ("ALSA: pcm: oss: Use guard() for setup") Reported-by: syzbot+2b96f44164236dda0f3b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67dd86c8.050a0220.25ae54.0059.GAE@google.com Link: https://patch.msgid.link/20250321172653.14310-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-21	MAINTAINERS: update bridge entry	Nikolay Aleksandrov
	Roopa has decided to withdraw as a bridge maintainer and Ido has agreed to step up and co-maintain the bridge with me. He has been very helpful in bridge patch reviews and has contributed a lot to the bridge over the years. Add an entry for Roopa to CREDITS and also add bridge's headers to its MAINTAINERS entry. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314100631.40999-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: mctp: Remove unnecessary cast in mctp_cb	Herbert Xu
	The void * cast in mctp_cb is unnecessary as it's already been done at the start of the function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/Z9PwOQeBSYlgZlHq@gondor.apana.org.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	Merge branch 'net-phy-remove-calls-to-devm_hwmon_sanitize_name'	Paolo Abeni
	Heiner Kallweit says: ==================== net: phy: remove calls to devm_hwmon_sanitize_name Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. ==================== Link: https://patch.msgid.link/198f3cd0-6c39-4783-afe7-95576a4b8539@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: phy: marvell-88q2xxx: remove call to devm_hwmon_sanitize_name	Heiner Kallweit
	Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/59c485e4-983c-42f6-9114-916703a62e3f@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: phy: mxl-gpy: remove call to devm_hwmon_sanitize_name	Heiner Kallweit
	Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/e34c4802-20ce-4556-a47c-812e602e8526@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: phy: tja11xx: remove call to devm_hwmon_sanitize_name	Heiner Kallweit
	Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Note that neither priv->hwmon_name nor priv->hwmon_dev are used outside tja11xx_hwmon_register. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/4452cb7e-1a2f-4213-b49f-9de196be9204@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: phy: realtek: remove call to devm_hwmon_sanitize_name	Heiner Kallweit
	Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/6e8d26f4-8d0a-4c83-aec3-378847a377eb@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	net: remove sb1000 cable modem driver	Arnd Bergmann
	This one is hilariously outdated, it provided a faster downlink over TV cable for users of analog modems in the 1990s, through an ISA card. The web page for the userspace tools has been broken for 25 years, and the driver has only ever seen mechanical updates. Link: http://web.archive.org/web/20000611165545/http://home.adelphia.net:80/~siglercm/sb1000.html Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250312085236.2531870-1-arnd@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21	platform/x86/amd/pmf: convert timeouts to secs_to_jiffies()	Easwar Hariharan
	Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced secs_to_jiffies(). As the value here is a multiple of 1000, use secs_to_jiffies() instead of msecs_to_jiffies() to avoid the multiplication This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with the following Coccinelle rules: @depends on patch@ expression E; @@ -msecs_to_jiffies +secs_to_jiffies (E - * $ 1000 \\| MSEC_PER_SEC $ ) Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250225-converge-secs-to-jiffies-part-two-v3-14-a43967e36c88@linux.microsoft.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-21	platform/x86: thinkpad_acpi: convert timeouts to secs_to_jiffies()	Easwar Hariharan
	Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced secs_to_jiffies(). As the value here is a multiple of 1000, use secs_to_jiffies() instead of msecs_to_jiffies() to avoid the multiplication This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with the following Coccinelle rules: @depends on patch@ expression E; @@ -msecs_to_jiffies +secs_to_jiffies (E - * $ 1000 \\| MSEC_PER_SEC $ ) Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250225-converge-secs-to-jiffies-part-two-v3-15-a43967e36c88@linux.microsoft.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-21	Merge tag 'perf-urgent-2025-03-21' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf events fixes from Ingo Molnar: "Two fixes: an RAPL PMU driver error handling fix, and an AMD IBS software filter fix" * tag 'perf-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/rapl: Fix error handling in init_rapl_pmus() perf/x86: Check data address for IBS software filter
2025-03-21	Merge tag 'sched-urgent-2025-03-21' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Revert a scheduler performance optimization that regressed other workloads" * tag 'sched-urgent-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "sched/core: Reduce cost of sched_move_task when config autogroup"
2025-03-21	irqdomain: platform/x86: Switch to irq_domain_create_linear()	Jiri Slaby (SUSE)
	irq_domain_add_linear() is going away as being obsolete now. Switch to the preferred irq_domain_create_linear(). That differs in the first parameter: It takes more generic struct fwnode_handle instead of struct device_node. The first parameter is NULL here so nothing else needs to be done. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20250319092951.37667-32-jirislaby@kernel.org [ij: Removed unnecessary details from the commit message.] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-21	ASoC: SDCA: Correct handling of selected mode DisCo property	Charles Keepax
	mipi-sdca-ge-selectedmode-controls-affected is actually required by the specification so the code should return an error if it is missing. Reported-by: Maciej Strozek <mstrozek@opensource.cirrus.com> Fixes: 13fe7497af19 ("ASoC: SDCA: Add support for GE Entity properties") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250321135324.380237-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-21	ASoC: amd: yc: update quirk data for new Lenovo model	Syed Saba kareem
	Update Quirk data for new Lenovo model 83J2 for YC platform. Signed-off-by: Syed Saba kareem <syed.sabakareem@amd.com> Link: https://patch.msgid.link/20250321122507.190193-1-syed.sabakareem@amd.com Reported-by: Reiner <Reiner.Proels@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219887 Tested-by: Reiner <Reiner.Proels@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-21	Merge tag 'i2c-host-fixes-6.14-rc8' of ↵	Wolfram Sang
	git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-host-fixes for v6.14-rc8 amd-mp2: fix double free of irq.
2025-03-21	dt-bindings: hwmon: Drop stray blank line in the header	Krzysztof Kozlowski
	There should be no blank line between the YAML opening header and the schema '$id'. Reported-by: Florin Leotescu (OSS) <florin.leotescu@oss.nxp.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250321080212.18013-1-krzysztof.kozlowski@linaro.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-03-21	hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register	Huisong Li
	When load this mode, we can see the following log: "power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info()." So replace hwmon_device_register with hwmon_device_register_with_info. These attributes, 'power_accuracy', 'power_cap_hyst', 'power_average_min' and 'power_average_max', should have been placed in hwmon_chip_info as power data type. But these attributes are displayed as string format on the following case: a) power1_accuracy --> display like '90.0%' b) power1_cap_hyst --> display 'unknown' when its value is 0xFFFFFFFF c) power1_average_min/max --> display 'unknown' when its value is negative. To avoid any changes in the display of these sysfs interfaces, we can't modifiy the type of these attributes in hwmon core and have to put them to extra_groups. Please note that the path of these sysfs interfaces are modified accordingly if use hwmon_device_register_with_info(): old: all sysfs interfaces are under acpi device, namely, /sys/class/hwmon/hwmonX/device/ now: all sysfs interfaces are under hwmon device, namely, /sys/class/hwmon/hwmonX/ The new ABI does not guarantee that the underlying path remains the same. But we have to accept this change so as to replace the deprecated API. Fortunately, some userspace application, like libsensors, would scan the two path and handles this automatically. So we can accept this change so as to drop the deprecated message. Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://lore.kernel.org/r/20250319020638.59925-1-lihuisong@huawei.com [groeck: Fixed some multi-line alignment issues; reverted to 32-bit arithmetic in power1_accuracy_show() fixed bad return code from power_meter_is_visible()] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-03-21	Merge tag 'asoc-fix-v6.14-rc7' of ↵	Takashi Iwai
	https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.14 A couple of small fixes, plus some new device IDs - nothing super urgent here.
2025-03-21	ALSA: hda/realtek: fix micmute LEDs on HP Laptops with ALC3247	Chris Chiu
	More HP EliteBook with Realtek HDA codec ALC3247 with combined CS35L56 Amplifiers need quirk ALC236_FIXUP_HP_GPIO_LED to fix the micmute LED. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Reviewed-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250321104914.544233-2-chris.chiu@canonical.com
2025-03-21	ALSA: hda/realtek: fix micmute LEDs on HP Laptops with ALC3315	Chris Chiu
	More HP laptops with Realtek HDA codec ALC3315 with combined CS35L56 Amplifiers need quirk ALC285_FIXUP_HP_GPIO_LED to fix the micmute LED. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Reviewed-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250321104914.544233-1-chris.chiu@canonical.com
2025-03-21	arm: defconfig: drop RT_GROUP_SCHED=y from bcm2835/tegra/omap2plus	Celeste Liu
	Commit 673ce00c5d6c ("ARM: omap2plus_defconfig: Add support for distros with systemd") said it's because of recommendation from systemd. But systemd changed their recommendation later.[1] For cgroup v1, if turned on, and there's any cgroup in the "cpu" hierarchy it needs an RT budget assigned, otherwise the processes in it will not be able to get RT at all. The problem with RT group scheduling is that it requires the budget assigned but there's no way we could assign a default budget, since the values to assign are both upper and lower time limits, are absolute, and need to be sum up to < 1 for each individal cgroup. That means we cannot really come up with values that would work by default in the general case.[2] For cgroup v2, it's almost unusable as well. If it turned on, the cpu controller can only be enabled when all RT processes are in the root cgroup. But it will lose the benefits of cgroup v2 if all RT process were placed in the same cgroup. Red Hat, Gentoo, Arch Linux and Debian all disable it. systemd also doesn't support it. [1]: https://github.com/systemd/systemd/commit/f4e74be1856b3ac058acbf1be321c31d5299f69f [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1229700 Tested-by: Stefan Wahren <wahrenst@gmx.net> Acked-by: Kevin Hilman <khilman@baylibre.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Celeste Liu <uwu@coelacanthus.name> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-21	netfilter: fib: avoid lookup if socket is available	Florian Westphal
	In case the fib match is used from the input hook we can avoid the fib lookup if early demux assigned a socket for us: check that the input interface matches sk-cached one. Rework the existing 'lo bypass' logic to first check sk, then for loopback interface type to elide the fib lookup. This speeds up fib matching a little, before: 93.08 GBit/s (no rules at all) 75.1 GBit/s ("fib saddr . iif oif missing drop" in prerouting) 75.62 GBit/s ("fib saddr . iif oif missing drop" in input) After: 92.48 GBit/s (no rules at all) 75.62 GBit/s (fib rule in prerouting) 90.37 GBit/s (fib rule in input). Numbers for the 'no rules' and 'prerouting' are expected to closely match in-between runs, the 3rd/input test case exercises the the 'avoid lookup if cached ifindex in sk matches' case. Test used iperf3 via veth interface, lo can't be used due to existing loopback test. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-03-21	mmc: core: Remove redundant null check	Avri Altman
	This change removes a redundant null check found by Smatch. Fixes: 403a0293f1c2 ("mmc: core: Add open-ended Ext memory addressing") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-mmc/345be6cd-f2f3-472e-a897-ca4b7c4cf826@stanley.mountain/ Signed-off-by: Avri Altman <avri.altman@sandisk.com> Link: https://lore.kernel.org/r/20250319203642.778016-1-avri.altman@sandisk.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-03-21	PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag	Roger Pau Monne
	Setting pci_msi_ignore_mask inhibits the toggling of the mask bit for both MSI and MSI-X entries globally, regardless of the IRQ chip they are using. Only Xen sets the pci_msi_ignore_mask when routing physical interrupts over event channels, to prevent PCI code from attempting to toggle the maskbit, as it's Xen that controls the bit. However, the pci_msi_ignore_mask being global will affect devices that use MSI interrupts but are not routing those interrupts over event channels (not using the Xen pIRQ chip). One example is devices behind a VMD PCI bridge. In that scenario the VMD bridge configures MSI(-X) using the normal IRQ chip (the pIRQ one in the Xen case), and devices behind the bridge configure the MSI entries using indexes into the VMD bridge MSI table. The VMD bridge then demultiplexes such interrupts and delivers to the destination device(s). Having pci_msi_ignore_mask set in that scenario prevents (un)masking of MSI entries for devices behind the VMD bridge. Move the signaling of no entry masking into the MSI domain flags, as that allows setting it on a per-domain basis. Set it for the Xen MSI domain that uses the pIRQ chip, while leaving it unset for the rest of the cases. Remove pci_msi_ignore_mask at once, since it was only used by Xen code, and with Xen dropping usage the variable is unneeded. This fixes using devices behind a VMD bridge on Xen PV hardware domains. Albeit Devices behind a VMD bridge are not known to Xen, that doesn't mean Linux cannot use them. By inhibiting the usage of VMD_FEAT_CAN_BYPASS_MSI_REMAP and the removal of the pci_msi_ignore_mask bodge devices behind a VMD bridge do work fine when use from a Linux Xen hardware domain. That's the whole point of the series. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Juergen Gross <jgross@suse.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-ID: <20250219092059.90850-4-roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-03-21	PCI: vmd: Disable MSI remapping bypass under Xen	Roger Pau Monne
	MSI remapping bypass (directly configuring MSI entries for devices on the VMD bus) won't work under Xen, as Xen is not aware of devices in such bus, and hence cannot configure the entries using the pIRQ interface in the PV case, and in the PVH case traps won't be setup for MSI entries for such devices. Until Xen is aware of devices in the VMD bus prevent the VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any kind of Xen guest. The MSI remapping bypass is an optional feature of VMD bridges, and hence when running under Xen it will be masked and devices will be forced to redirect its interrupts from the VMD bridge. That mode of operation must always be supported by VMD bridges and works when Xen is not aware of devices behind the VMD bridge. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-ID: <20250219092059.90850-3-roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-03-21	zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work ↵	Ingo Molnar
	around compiler segfault Due to pending percpu improvements in -next, GCC9 and GCC10 are crashing during the build with: lib/zstd/compress/huf_compress.c:1033:1: internal compiler error: Segmentation fault 1033 \| { \| ^ Please submit a full bug report, with preprocessed source if appropriate. See <file:///usr/share/doc/gcc-9/README.Bugs> for instructions. The DYNAMIC_BMI2 feature is a known-challenging feature of the ZSTD library, with an existing GCC quirk turning it off for GCC versions below 4.8. Increase the DYNAMIC_BMI2 version cutoff to GCC 11.0 - GCC 10.5 is the last version known to crash. Reported-by: Michael Kelley <mhklinux@outlook.com> Debugged-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: https://lore.kernel.org/r/SN6PR02MB415723FBCD79365E8D72CA5FD4D82@SN6PR02MB4157.namprd02.prod.outlook.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-21	x86/asm: Make asm export of __ref_stack_chk_guard unconditional	Ard Biesheuvel
	Clang does not tolerate the use of non-TLS symbols for the per-CPU stack protector very well, and to work around this limitation, the symbol passed via the -mstack-protector-guard-symbol= option is never defined in C code, but only in the linker script, and it is exported from an assembly file. This is necessary because Clang will fail to generate the correct %GS based references in a compilation unit that includes a non-TLS definition of the guard symbol being used to store the stack cookie. This problem is only triggered by symbol definitions, not by declarations, but nonetheless, the declaration in <asm/asm-prototypes.h> is conditional on __GENKSYMS__ being #define'd, so that only genksyms will observe it, but for ordinary compilation, it will be invisible. This is causing problems with the genksyms alternative gendwarfksyms, which does not #define __GENKSYMS__, does not observe the symbol declaration, and therefore lacks the information it needs to version it. Adding the #define creates problems in other places, so that is not a straight-forward solution. So take the easy way out, and drop the conditional on __GENKSYMS__, as this is not really needed to begin with. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250320213238.4451-2-ardb@kernel.org
2025-03-21	xen/pci: Do not register devices with segments >= 0x10000	Roger Pau Monne
	The current hypercall interface for doing PCI device operations always uses a segment field that has a 16 bit width. However on Linux there are buses like VMD that hook up devices into the PCI hierarchy at segment >= 0x10000, after the maximum possible segment enumerated in ACPI. Attempting to register or manage those devices with Xen would result in errors at best, or overlaps with existing devices living on the truncated equivalent segment values. Note also that the VMD segment numbers are arbitrarily assigned by the OS, and hence there would need to be some negotiation between Xen and the OS to agree on how to enumerate VMD segments and devices behind them. Skip notifying Xen about those devices. Given how VMD bridges can multiplex interrupts on behalf of devices behind them there's no need for Xen to be aware of such devices for them to be usable by Linux. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Juergen Gross <jgross@suse.com> Message-ID: <20250219092059.90850-2-roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-03-21	ext4: fix OOB read when checking dotdot dir	Acs, Jakub
	Mounting a corrupted filesystem with directory which contains '.' dir entry with rec_len == block size results in out-of-bounds read (later on, when the corrupted directory is removed). ext4_empty_dir() assumes every ext4 directory contains at least '.' and '..' as directory entries in the first data block. It first loads the '.' dir entry, performs sanity checks by calling ext4_check_dir_entry() and then uses its rec_len member to compute the location of '..' dir entry (in ext4_next_entry). It assumes the '..' dir entry fits into the same data block. If the rec_len of '.' is precisely one block (4KB), it slips through the sanity checks (it is considered the last directory entry in the data block) and leaves "struct ext4_dir_entry_2 *de" point exactly past the memory slot allocated to the data block. The following call to ext4_check_dir_entry() on new value of de then dereferences this pointer which results in out-of-bounds mem access. Fix this by extending __ext4_check_dir_entry() to check for '.' dir entries that reach the end of data block. Make sure to ignore the phony dir entries for checksum (by checking name_len for non-zero). Note: This is reported by KASAN as use-after-free in case another structure was recently freed from the slot past the bound, but it is really an OOB read. This issue was found by syzkaller tool. Call Trace: [ 38.594108] BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x67e/0x710 [ 38.594649] Read of size 2 at addr ffff88802b41a004 by task syz-executor/5375 [ 38.595158] [ 38.595288] CPU: 0 UID: 0 PID: 5375 Comm: syz-executor Not tainted 6.14.0-rc7 #1 [ 38.595298] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 38.595304] Call Trace: [ 38.595308] <TASK> [ 38.595311] dump_stack_lvl+0xa7/0xd0 [ 38.595325] print_address_description.constprop.0+0x2c/0x3f0 [ 38.595339] ? __ext4_check_dir_entry+0x67e/0x710 [ 38.595349] print_report+0xaa/0x250 [ 38.595359] ? __ext4_check_dir_entry+0x67e/0x710 [ 38.595368] ? kasan_addr_to_slab+0x9/0x90 [ 38.595378] kasan_report+0xab/0xe0 [ 38.595389] ? __ext4_check_dir_entry+0x67e/0x710 [ 38.595400] __ext4_check_dir_entry+0x67e/0x710 [ 38.595410] ext4_empty_dir+0x465/0x990 [ 38.595421] ? __pfx_ext4_empty_dir+0x10/0x10 [ 38.595432] ext4_rmdir.part.0+0x29a/0xd10 [ 38.595441] ? __dquot_initialize+0x2a7/0xbf0 [ 38.595455] ? __pfx_ext4_rmdir.part.0+0x10/0x10 [ 38.595464] ? __pfx___dquot_initialize+0x10/0x10 [ 38.595478] ? down_write+0xdb/0x140 [ 38.595487] ? __pfx_down_write+0x10/0x10 [ 38.595497] ext4_rmdir+0xee/0x140 [ 38.595506] vfs_rmdir+0x209/0x670 [ 38.595517] ? lookup_one_qstr_excl+0x3b/0x190 [ 38.595529] do_rmdir+0x363/0x3c0 [ 38.595537] ? __pfx_do_rmdir+0x10/0x10 [ 38.595544] ? strncpy_from_user+0x1ff/0x2e0 [ 38.595561] __x64_sys_unlinkat+0xf0/0x130 [ 38.595570] do_syscall_64+0x5b/0x180 [ 38.595583] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: ac27a0ec112a0 ("[PATCH] ext4: initial copy of files from ext3") Signed-off-by: Jakub Acs <acsjakub@amazon.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: linux-ext4@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Mahmoud Adam <mngyadam@amazon.com> Cc: stable@vger.kernel.org Cc: security@kernel.org Link: https://patch.msgid.link/b3ae36a6794c4a01944c7d70b403db5b@amazon.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: on a remount, only log the ro or r/w state when it has changed	Nicolas Bretz
	A user complained that a message such as: EXT4-fs (nvme0n1p3): re-mounted UUID ro. Quota mode: none. implied that the file system was previously mounted read/write and was now remounted read-only, when it could have been some other mount state that had changed by the "mount -o remount" operation. Fix this by only logging "ro"or "r/w" when it has changed. https://bugzilla.kernel.org/show_bug.cgi?id=219132 Signed-off-by: Nicolas Bretz <bretznic@gmail.com> Link: https://patch.msgid.link/20250319171011.8372-1-bretznic@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: correct the error handle in ext4_fallocate()	Zhang Yi
	The error out label of file_modified() should be out_inode_lock in ext4_fallocate(). Fixes: 2890e5e0f49e ("ext4: move out common parts into ext4_fallocate()") Reported-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250319023557.2785018-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: Make sb update interval tunable	Ojaswin Mujoo
	Currently, outside error paths, we auto commit the super block after 1 hour has passed and 16MB worth of updates have been written since last commit. This is a policy decision so make this tunable while keeping the defaults same. This is useful if user wants to tweak the superblock behavior or for debugging the codepath by allowing to trigger it more frequently. We can now tweak the super block update using sb_update_sec and sb_update_kb files in /sys/fs/ext4/<dev>/ Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/950fb8c9b2905620e16f02a3b9eeea5a5b6cb87e.1742279837.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: avoid journaling sb update on error if journal is destroying	Ojaswin Mujoo
	Presently we always BUG_ON if trying to start a transaction on a journal marked with JBD2_UNMOUNT, since this should never happen. However, while ltp running stress tests, it was observed that in case of some error handling paths, it is possible for update_super_work to start a transaction after the journal is destroyed eg: (umount) ext4_kill_sb kill_block_super generic_shutdown_super sync_filesystem /* commits all txns / evict_inodes / might start a new txn / ext4_put_super flush_work(&sbi->s_sb_upd_work) / flush the workqueue / jbd2_journal_destroy journal_kill_thread journal->j_flags \|= JBD2_UNMOUNT; jbd2_journal_commit_transaction jbd2_journal_get_descriptor_buffer jbd2_journal_bmap ext4_journal_bmap ext4_map_blocks ... ext4_inode_error ext4_handle_error schedule_work(&sbi->s_sb_upd_work) / work queue kicks in */ update_super_work jbd2_journal_start start_this_handle BUG_ON(journal->j_flags & JBD2_UNMOUNT) Hence, introduce a new mount flag to indicate journal is destroying and only do a journaled (and deferred) update of sb if this flag is not set. Otherwise, just fallback to an un-journaled commit. Further, in the journal destroy path, we have the following sequence: 1. Set mount flag indicating journal is destroying 2. force a commit and wait for it 3. flush pending sb updates This sequence is important as it ensures that, after this point, there is no sb update that might be journaled so it is safe to update the sb outside the journal. (To avoid race discussed in 2d01ddc86606) Also, we don't need a similar check in ext4_grp_locked_error since it is only called from mballoc and AFAICT it would be always valid to schedule work here. Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") Reported-by: Mahesh Kumar <maheshkumar657g@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/9613c465d6ff00cd315602f99283d5f24018c3f7.1742279837.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: define ext4_journal_destroy wrapper	Ojaswin Mujoo
	Define an ext4 wrapper over jbd2_journal_destroy to make sure we have consistent behavior during journal destruction. This will also come useful in the next patch where we add some ext4 specific logic in the destroy path. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/c3ba78c5c419757e6d5f2d8ebb4a8ce9d21da86a.1742279837.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	ext4: hash: simplify kzalloc(n * 1, ...) to kzalloc(n, ...)	Ethan Carter Edwards
	sizeof(char) evaluates to 1. Remove the churn. Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://patch.msgid.link/20250316-ext4-hash-kcalloc-v2-1-2a99e93ec6e0@ethancedwards.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-21	jbd2: add a missing data flush during file and fs synchronization	Zhang Yi
	When the filesystem performs file or filesystem synchronization (e.g., ext4_sync_file()), it queries the journal to determine whether to flush the file device through jbd2_trans_will_send_data_barrier(). If the target transaction has not started committing, it assumes that the journal will submit the flush command, allowing the filesystem to bypass a redundant flush command. However, this assumption is not always valid. If the journal is not located on the filesystem device, the journal commit thread will not submit the flush command unless the variable ->t_need_data_flush is set to 1. Consequently, the flush may be missed, and data may be lost following a power failure or system crash, even if the synchronization appears to succeed. Unfortunately, we cannot determine with certainty whether the target transaction will flush to the filesystem device before it commits. However, if it has not started committing, it must be the running transaction. Therefore, fix it by always set its t_need_data_flush to 1, ensuring that the committing thread will flush the filesystem device. Fixes: bbd2be369107 ("jbd2: Add function jbd2_trans_will_send_data_barrier()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241206111327.4171337-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-20	Merge tag 'drm-fixes-2025-03-21' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Just the usual spread of a bunch for amdgpu, and small changes to others. scheduler: - fix fence reference leak xe: - Fix for an error if exporting a dma-buf multiple time amdgpu: - Fix video caps limits on several asics - SMU 14.x fixes - GC 12 fixes - eDP fixes - DMUB fix amdkfd: - GC 12 trap handler fix - GC 7/8 queue validation fix radeon: - VCE IB parsing fix v3d: - fix job error handling bugs qaic: - fix two integer overflows host1x: - fix NULL domain handling" * tag 'drm-fixes-2025-03-21' of https://gitlab.freedesktop.org/drm/kernel: (21 commits) drm/xe: Fix exporting xe buffers multiple times gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2 drm/amd/display: Fix incorrect fw_state address in dmub_srv drm/amd/display: Use HW lock mgr for PSR1 when only one eDP drm/amd/display: Fix message for support_edp0_on_dp1 drm/amdkfd: Fix user queue validation on Gfx7/8 drm/amdgpu: Restore uncached behaviour on GFX12 drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini() drm/amdkfd: Fix instruction hazard in gfx12 trap handler drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2 drm/amd/pm: add unique_id for gfx12 drm/amdgpu: Remove JPEG from vega and carrizo video caps drm/amdgpu: Fix JPEG video caps max size for navi1x and raven drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size drm/radeon: fix uninitialized size issue in radeon_vce_cs_parse() accel/qaic: Fix integer overflow in qaic_validate_req() accel/qaic: Fix possible data corruption in BOs > 2G drm/v3d: Set job pointer to NULL when the job's fence has an error drm/v3d: Don't run jobs that have errors flagged in its fence ...
2025-03-20	Merge tag 'v6.14-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds
	Pull smb client fix from Steve French: "smb3 client reconnect fix" * tag 'v6.14-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: smb: client: don't retry IO on failed negprotos with soft mounts
2025-03-20	io_uring: enable toggle of iowait usage when waiting on CQEs	Jens Axboe
	By default, io_uring marks a waiting task as being in iowait, if it's sleeping waiting on events and there are pending requests. This isn't necessarily always useful, and may be confusing on non-storage setups where iowait isn't expected. It can also cause extra power usage, by preventing the CPU from entering lower sleep states. This adds a new enter flag, IORING_ENTER_NO_IOWAIT. If set, then io_uring will not account the sleeping task as being in iowait. If the kernel supports this feature, then it will be marked by having the IORING_FEAT_NO_IOWAIT feature flag set. As the kernel currently does not support separating the iowait accounting and CPU frequency boosting, the IORING_ENTER_NO_IOWAIT controls both of these at the same time. In the future, if those do end up being split, then it'd be possible to control them separately. However, it seems more likely that the kernel will decouple iowait and CPU frequency boosting anyway. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20	selftests: ublk: fix write cache implementation	Ming Lei
	For loop target, write cache isn't enabled, and each write isn't be marked as DSYNC too. Fix it by enabling write cache, meantime fix FLUSH implementation by not taking LBA range into account, and there isn't such info for FLUSH command. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250321004758.152572-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21	Merge tag 'amd-drm-fixes-6.14-2025-03-20' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.14-2025-03-20: amdgpu: - Fix video caps limits on several asics - SMU 14.x fixes - GC 12 fixes - eDP fixes - DMUB fix amdkfd: - GC 12 trap handler fix - GC 7/8 queue validation fix radeon: - VCE IB parsing fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250320210800.1358992-1-alexander.deucher@amd.com
2025-03-21	Merge tag 'drm-xe-fixes-2025-03-20' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix for an error if exporting a dma-buf multiple time (Tomasz) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z9xalLaCWsNbh0P0@fedora
2025-03-21	Merge tag 'drm-misc-fixes-2025-03-20' of ↵	Dave Airlie
	ssh://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A sched fence reference leak fix, two fence fixes for v3d, two overflow fixes for quaic, and a iommu handling fix for host1x. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250320-valiant-outstanding-nightingale-e9acae@houat
2025-03-20	Merge tag 'nvme-6.15-2025-03-20' of git://git.infradead.org/nvme into ↵	Jens Axboe
	for-6.15/block Pull NVMe updates from Keith: "nvme updates for Linux 6.15 - Secure concatenation for TCP transport (Hannes) - Multipath sysfs visibility (Nilay) - Various cleanups (Qasim, Baruch, Wang, Chen, Mike, Damien, Li) - Correct use of 64-bit BARs for pci-epf target (Niklas) - Socket fix for selinux when used in containers (Peijie)" * tag 'nvme-6.15-2025-03-20' of git://git.infradead.org/nvme: (22 commits) nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi) nvme-tcp: fix selinux denied when calling sock_sendmsg nvmet: pci-epf: Always configure BAR0 as 64-bit nvmet: Remove duplicate uuid_copy nvme: zns: Simplify nvme_zone_parse_entry() nvmet: pci-epf: Remove redundant 'flush_workqueue()' calls nvmet-fc: Remove unused functions nvme-pci: remove stale comment nvme-fc: Utilise min3() to simplify queue count calculation nvme-multipath: Add visibility for queue-depth io-policy nvme-multipath: Add visibility for numa io-policy nvme-multipath: Add visibility for round-robin io-policy nvmet: add tls_concat and tls_key debugfs entries nvmet-tcp: support secure channel concatenation nvmet: Add 'sq' argument to alloc_ctrl_args nvme-fabrics: reset admin connection for secure concatenation nvme-tcp: request secure channel concatenation nvme-keyring: add nvme_tls_psk_refresh() nvme: add nvme_auth_derive_tls_psk() nvme: add nvme_auth_generate_digest() ...
2025-03-20	Merge tag 'dma-mapping-6.14-2025-03-21' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping fix from Marek Szyprowski: - fix missing clear bdr in check_ram_in_range_map() (Baochen Qiang) * tag 'dma-mapping-6.14-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-mapping: fix missing clear bdr in check_ram_in_range_map()
2025-03-20	nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi)	Li Haoran
	This patch replaces max(a, min(b, c)) by clamp(val, lo, hi) in the nvme driver. The clamp() macro explicitly expresses the intent of constraining a value within bounds, improving code readability. Signed-off-by: Li Haoran <li.haoran7@zte.com.cn> Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20	nvme-tcp: fix selinux denied when calling sock_sendmsg	Peijie Shao
	In a SELinux enabled kernel, socket_create() initializes the security label of the socket using the security label of the calling process, this typically works well. However, in a containerized environment like Kubernetes, problem arises when a privileged container(domain spc_t) connects to an NVMe target and mounts the NVMe as persistent storage for unprivileged containers(domain container_t). This is because the container_t domain cannot access resources labeled with spc_t, resulting in socket_sendmsg returning -EACCES. The solution is to use socket_create_kern() instead of socket_create(), which labels the socket context to kernel_t. Access control will then be handled by the VFS layer rather than the socket itself. Signed-off-by: Peijie Shao <shaopeijie@cestc.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20	nvmet: pci-epf: Always configure BAR0 as 64-bit	Niklas Cassel
	NVMe PCIe Transport Specification 1.1, section 2.1.10, claims that the BAR0 type is Implementation Specific. However, in NVMe 1.1, the type is required to be 64-bit. Thus, to make our PCI EPF work on as many host systems as possible, always configure the BAR0 type to be 64-bit. In the rare case that the underlying PCI EPC does not support configuring BAR0 as 64-bit, the call to pci_epc_set_bar() will fail, and we will return a failure back to the user. This should not be a problem, as most PCI EPCs support configuring a BAR as 64-bit (and those EPCs with .only_64bit set to true in epc_features only support configuring the BAR as 64-bit). Tested-by: Damien Le Moal <dlemoal@kernel.org> Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>