Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of cleanups for the error handling in the Freescale drivers"
* tag 'spi-fix-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fsl-spi: Remove redundant probe error message
spi: fsl-qspi: Fix double cleanup in probe error path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:
- Fix missing error checks during controller probe in the sata_sx4
driver (Wentao)
- Fix missing error checks during controller probe in the pata_pxa
driver (Henry)
* tag 'ata-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: sata_sx4: Add error handling in pdc20621_i2c_read()
ata: pata_pxa: Fix potential NULL pointer dereference in pxa_ata_probe()
|
|
Pull more block fixes from Jens Axboe:
"Apparently my internal clock was off, or perhaps it was just wishful
thinking, but I sent out block fixes yesterday as my brain assumed it
was Friday. Subsequently, that missed the NVMe fixes that should go
into this weeks release as well. Hence, here's a followup with those,
and another simple fix.
- NVMe pull request via Christoph:
- nvmet fc/fcloop refcounting fixes (Daniel Wagner)
- fix missed namespace/ANA scans (Hannes Reinecke)
- fix a use after free in the new TCP netns support (Kuniyuki
Iwashima)
- fix a NULL instead of false review in multipath (Uday Shankar)
- Use strscpy() for null_blk disk name copy"
* tag 'block-6.15-20250411' of git://git.kernel.dk/linux:
null_blk: Use strscpy() instead of strscpy_pad() in null_add_dev()
nvmet-fc: put ref when assoc->del_work is already scheduled
nvmet-fc: take tgtport reference only once
nvmet-fc: update tgtport ref per assoc
nvmet-fc: inline nvmet_fc_free_hostport
nvmet-fc: inline nvmet_fc_delete_assoc
nvmet-fcloop: add ref counting to lport
nvmet-fcloop: replace kref with refcount
nvmet-fcloop: swap list_add_tail arguments
nvme-tcp: fix use-after-free of netns by kernel TCP socket.
nvme: multipath: fix return value of nvme_available_path
nvme: re-read ANA log page after ns scan completes
nvme: requeue namespace scan on missed AENs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- Fix two crashes, one in core code and a NULL-ptr dereference in the
Mediatek IOMMU driver
- Dma_ops cleanup fix for core code
- Two fixes for Intel VT-d driver:
- Fix posted MSI issue when users change cpu affinity
- Remove invalid set_dma_ops() call in the iommu driver
- Warning fix for Tegra IOMMU driver
- Suspend/Resume fix for Exynos IOMMU driver
- Probe failure fix for Renesas IOMMU driver
- Cosmetic fix
* tag 'iommu-fixes-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/tegra241-cmdqv: Fix warnings due to dmam_free_coherent()
iommu: remove unneeded semicolon
iommu/mediatek: Fix NULL pointer deference in mtk_iommu_device_group
iommu/exynos: Fix suspend/resume with IDENTITY domain
iommu/ipmmu-vmsa: Register in a sensible order
iommu: Clear iommu-dma ops on cleanup
iommu/vt-d: Remove an unnecessary call set_dma_ops()
iommu/vt-d: Wire up irq_ack() to irq_move_irq() for posted MSIs
iommu: Fix crash in report_iommu_fault()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a recent regression in the ACPI button driver, add quirks
related to EC wakeups from suspend-to-idle and fix coding mistakes
related to the usage of sizeof() in the PPTT parser code:
Summary:
- Add suspend-to-idle EC wakeup quirks for Lenovo Go S (Mario
Limonciello)
- Prevent ACPI button from sending spurions KEY_POWER events to user
space in some cases after a recent update (Mario Limonciello)
- Compute the size of a structure instead of the size of a pointer in
two places in the PPTT parser code (Jean-Marc Eurin)"
* tag 'acpi-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls
ACPI: EC: Set ec_no_wakeup for Lenovo Go S
ACPI: button: Only send `KEY_POWER` for `ACPI_BUTTON_NOTIFY_STATUS`
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
"Note that besides two bug fixes this includes three commits for IBM
z17, which was announced this week.
- Add IBM z17 bits:
- Setup elf_platform for new machine types
- Allow to compile the kernel with z17 optimizations
- Add new performance counters
- Fix mismatch between indicator bits and queue indexes in virtio CCW code
- Fix double free in pmu setup error path"
* tag 's390-6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpumf: Fix double free on error in cpumf_pmu_event_init()
s390/cpumf: Update CPU Measurement facility extended counter set support
s390: Allow to compile with z17 optimizations
s390: Add z17 elf platform
s390/virtio_ccw: Don't allocate/assign airqs for non-existing queues
|
|
Commit 44605365f935 ("iwlwifi: mld: fix building with CONFIG_PM_SLEEP
disabled") sought to fix build breakage, but inadvertently introduced
a new issue:
iwl_mld_mac80211_start() no longer calls iwl_mld_start_fw() after having
called iwl_mld_stop_fw() in the error path of iwl_mld_no_wowlan_resume().
Fix it.
Fixes: 44605365f935 ("iwlwifi: mld: fix building with CONFIG_PM_SLEEP disabled")
Reported-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Closes: https://lore.kernel.org/r/MW5PR11MB58106D6BC6403845C330C7AAA3A22@MW5PR11MB5810.namprd11.prod.outlook.com/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/d3ba1006a1b72ceb58c593fa62b9bd7c73e2e4ed.1744366815.git.lukas@wunner.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The reset handshake attempts to kill the firmware, and it'll go
into a pretty much dead state once we do that. However, if it
times out, then we'll attempt to dump the firmware to be able
to see why it didn't respond. During this dump, we cannot treat
it as if it was still running, since we just tried to kill it,
otherwise dumping will attempt to send a DBGC stop command. As
this command will time out, we'll go into a reset loop.
For now, fix this by setting the trans->state to say firmware
isn't running before doing the reset handshake. In the longer
term, we should clean up the way this state is handled.
It's not entirely clear but it seems likely that this issue was
introduced by my rework of the error handling, prior to that it
would've been synchronous at that point and (I think) not have
attempted to reset since it was already doing down.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219967
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219968
Fixes: 7391b2a4f7db ("wifi: iwlwifi: rework firmware error handling")
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250411104054.63aa4f56894d.Ife70cfe997db03f0d07fdef2b164695739a05a63@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Merge updates of the ACPI EC and button drivers for 6.15-rc2:
- Add suspend-to-idle EC wakeup quirks for Lenovo Go S (Mario
Limonciello).
- Prevent ACPI button from sending spurions KEY_POWER events to user
space in some cases after a recent update (Mario Limonciello).
* acpi-ec:
ACPI: EC: Set ec_no_wakeup for Lenovo Go S
* acpi-button:
ACPI: button: Only send `KEY_POWER` for `ACPI_BUTTON_NOTIFY_STATUS`
|
|
blk_mq_alloc_disk() already zero-initializes the destination buffer,
making strscpy() sufficient for safely copying the disk's name. The
additional NUL-padding performed by strscpy_pad() is unnecessary.
If the destination buffer has a fixed length, strscpy() automatically
determines its size using sizeof() when the argument is omitted. This
makes the explicit size argument unnecessary.
The source string is also NUL-terminated and meets the __must_be_cstr()
requirement of strscpy().
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250410154727.883207-1-thorsten.blum@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Two WARNINGs are observed when SMMU driver rolls back upon failure:
arm-smmu-v3.9.auto: Failed to register iommu
arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
------------[ cut here ]------------
WARNING: CPU: 5 PID: 1 at kernel/dma/mapping.c:74 dmam_free_coherent+0xc0/0xd8
Call trace:
dmam_free_coherent+0xc0/0xd8 (P)
tegra241_vintf_free_lvcmdq+0x74/0x188
tegra241_cmdqv_remove_vintf+0x60/0x148
tegra241_cmdqv_remove+0x48/0xc8
arm_smmu_impl_remove+0x28/0x60
devm_action_release+0x1c/0x40
------------[ cut here ]------------
128 pages are still in use!
WARNING: CPU: 16 PID: 1 at mm/page_alloc.c:6902 free_contig_range+0x18c/0x1c8
Call trace:
free_contig_range+0x18c/0x1c8 (P)
cma_release+0x154/0x2f0
dma_free_contiguous+0x38/0xa0
dma_direct_free+0x10c/0x248
dma_free_attrs+0x100/0x290
dmam_free_coherent+0x78/0xd8
tegra241_vintf_free_lvcmdq+0x74/0x160
tegra241_cmdqv_remove+0x98/0x198
arm_smmu_impl_remove+0x28/0x60
devm_action_release+0x1c/0x40
This is because the LVCMDQ queue memory are managed by devres, while that
dmam_free_coherent() is called in the context of devm_action_release().
Jason pointed out that "arm_smmu_impl_probe() has mis-ordered the devres
callbacks if ops->device_remove() is going to be manually freeing things
that probe allocated":
https://lore.kernel.org/linux-iommu/20250407174408.GB1722458@nvidia.com/
In fact, tegra241_cmdqv_init_structures() only allocates memory resources
which means any failure that it generates would be similar to -ENOMEM, so
there is no point in having that "falling back to standard SMMU" routine,
as the standard SMMU would likely fail to allocate memory too.
Remove the unwind part in tegra241_cmdqv_init_structures(), and return a
proper error code to ask SMMU driver to call tegra241_cmdqv_remove() via
impl_ops->device_remove(). Then, drop tegra241_vintf_free_lvcmdq() since
devres will take care of that.
Fixes: 483e0bd8883a ("iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250407201908.172225-1-nicolinc@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
cocci warnings:
drivers/iommu/dma-iommu.c:1788:2-3: Unneeded semicolon
so remove unneeded semicolon to fix cocci warnings.
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/tencent_73EEE47E6ECCF538229C9B9E6A0272DA2B05@qq.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently, mtk_iommu calls during probe iommu_device_register before
the hw_list from driver data is initialized. Since iommu probing issue
fix, it leads to NULL pointer dereference in mtk_iommu_device_group when
hw_list is accessed with list_first_entry (not null safe).
So, change the call order to ensure iommu_device_register is called
after the driver data are initialized.
Fixes: 9e3a2a643653 ("iommu/mediatek: Adapt sharing and non-sharing pgtable case")
Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org> # MT8183 Juniper, MT8186 Tentacruel
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Link: https://lore.kernel.org/r/20250403-fix-mtk-iommu-error-v2-1-fe8b18f8b0a8@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe
path") changed the sequence of probing the SYSMMU controller devices and
calls to arm_iommu_attach_device(), what results in resuming SYSMMU
controller earlier, when it is still set to IDENTITY mapping. Such change
revealed the bug in IDENTITY handling in the exynos-iommu driver. When
SYSMMU controller is set to IDENTITY mapping, data->domain is NULL, so
adjust checks in suspend & resume callbacks to handle this case
correctly.
Fixes: b3d14960e629 ("iommu/exynos: Implement an IDENTITY domain")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20250401202731.2810474-1-m.szyprowski@samsung.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
On some platforms it has been observed that STT limits are not being
applied properly causing poor performance as power limits are set too low.
STT limits that are sent to the platform are supposed to be in Q8.8
format. Convert them before sending.
Reported-by: Yijun Shen <Yijun.Shen@dell.com>
Fixes: 7c45534afa443 ("platform/x86/amd/pmf: Add support for PMF Policy Binary")
Cc: stable@vger.kernel.org
Tested-by: Yijun Shen <Yijun_Shen@Dell.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250407181915.1482450-1-superm1@kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
A warning is seen when running the latest kernel on a BlueField SOC:
[251.512704] ------------[ cut here ]------------
[251.512711] invalid sysfs_emit: buf:0000000003aa32ae
[251.512720] WARNING: CPU: 1 PID: 705264 at fs/sysfs/file.c:767 sysfs_emit+0xac/0xc8
The warning is triggered because the mlxbf-bootctl driver invokes
"sysfs_emit()" with a buffer pointer that is not aligned to the
start of the page. The driver should instead use "sysfs_emit_at()"
to support non-zero offsets into the destination buffer.
Fixes: 9886f575de5a ("platform/mellanox: mlxbf-bootctl: use sysfs_emit() instead of sprintf()")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20250407132558.2418719-1-davthompson@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The Vexia EDU ATLA 10 tablet comes in 2 different versions with
significantly different mainboards. The only outward difference is that
the charging barrel on one is marked 5V and the other is marked 9V.
Both are x86 ACPI tablets which ships with Android x86 as factory OS.
with a DSDT which contains a bunch of I2C devices which are not actually
there, causing various resource conflicts. Enumeration of these is skipped
through the acpi_quirk_skip_i2c_client_enumeration().
Extend the existing support for the 9V version by adding support for
manually instantiating the I2C devices which are actually present on
the 5V version by adding the necessary device info to
the x86-android-tablets module.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250407092017.273124-2-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The Vexia EDU ATLA 10 tablet comes in 2 different versions with
significantly different mainboards. The only outward difference is that
the charging barrel on one is marked 5V and the other is marked 9V.
Both need to be handled by the x86-android-tablets code. Add 9v to
the symbols for the existing support for the 9V Vexia EDU ATLA 10 tablet
symbols to prepare for adding support for the 5V version.
All this patch does is s/vexia_edu_atla10_info/vexia_edu_atla10_9v_info/.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250407092017.273124-1-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The value returned by acpi_evaluate_integer() is not checked,
but the result is not always successful, so it is necessary to
add a check of the returned value.
If the result remains negative during three iterations of the loop,
then the uninitialized variable 'val' will be used in the clamp_val()
macro, so it must be initialized with the current value of the 'curr'
variable.
In this case, the algorithm should be less noisy.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: b23910c2194e ("asus-laptop: Pegatron Lucid accelerometer")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Link: https://lore.kernel.org/r/20250403122603.18172-1-arefev@swemel.ru
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
IPMMU registers almost-initialised instances, but misses assigning the
drvdata to make them fully functional, so initial calls back into
ipmmu_probe_device() are likely to fail unnecessarily. Reorder this to
work as it should, also pruning the long-out-of-date comment and adding
the missing sysfs cleanup on error for good measure.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/53be6667544de65a15415b699e38a9a965692e45.1742481687.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
If iommu_device_register() encounters an error, it can end up tearing
down already-configured groups and default domains, however this
currently still leaves devices hooked up to iommu-dma (and even
historically the behaviour in this area was at best inconsistent across
architectures/drivers...) Although in the case that an IOMMU is present
whose driver has failed to probe, users cannot necessarily expect DMA to
work anyway, it's still arguable that we should do our best to put
things back as if the IOMMU driver was never there at all, and certainly
the potential for crashing in iommu-dma itself is undesirable. Make sure
we clean up the dev->dma_iommu flag along with everything else.
Reported-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Closes: https://lore.kernel.org/all/CAGXv+5HJpTYmQ2h-GD7GjyeYT7bL9EBCvu0mz5LgpzJZtzfW0w@mail.gmail.com/
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/e788aa927f6d827dd4ea1ed608fada79f2bab030.1744284228.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Do not touch per-device DMA ops when the driver has been converted to use
the dma-iommu API.
Fixes: c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Link: https://lore.kernel.org/r/20250403165605.278541-1-ptesarik@suse.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Set the posted MSI irq_chip's irq_ack() hook to irq_move_irq() instead of
a dummy/empty callback so that posted MSIs process pending changes to the
IRQ's SMP affinity. Failure to honor a pending set-affinity results in
userspace being unable to change the effective affinity of the IRQ, as
IRQD_SETAFFINITY_PENDING is never cleared and so irq_set_affinity_locked()
always defers moving the IRQ.
The issue is most easily reproducible by setting /proc/irq/xx/smp_affinity
multiple times in quick succession, as only the first update is likely to
be handled in process context.
Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Cc: Robert Lippert <rlippert@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Wentao Yang <wentaoyang@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250321194249.1217961-1-seanjc@google.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The following crash is observed while handling an IOMMU fault with a
recent kernel:
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle page fault for address: ffff8c708299f700
PGD 19ee01067 P4D 19ee01067 PUD 101c10063 PMD 80000001028001e3
Oops: Oops: 0011 [#1] SMP NOPTI
CPU: 4 UID: 0 PID: 139 Comm: irq/25-AMD-Vi Not tainted 6.15.0-rc1+ #20 PREEMPT(lazy)
Hardware name: LENOVO 21D0/LNVNB161216, BIOS J6CN50WW 09/27/2024
RIP: 0010:0xffff8c708299f700
Call Trace:
<TASK>
? report_iommu_fault+0x78/0xd3
? amd_iommu_report_page_fault+0x91/0x150
? amd_iommu_int_thread+0x77/0x180
? __pfx_irq_thread_fn+0x10/0x10
? irq_thread_fn+0x23/0x60
? irq_thread+0xf9/0x1e0
? __pfx_irq_thread_dtor+0x10/0x10
? __pfx_irq_thread+0x10/0x10
? kthread+0xfc/0x240
? __pfx_kthread+0x10/0x10
? ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
? ret_from_fork_asm+0x1a/0x30
</TASK>
report_iommu_fault() checks for an installed handler comparing the
corresponding field to NULL. It can (and could before) be called for a
domain with a different cookie type - IOMMU_COOKIE_DMA_IOVA, specifically.
Cookie is represented as a union so we may end up with a garbage value
treated there if this happens for a domain with another cookie type.
Formerly there were two exclusive cookie types in the union.
IOMMU_DOMAIN_SVA has a dedicated iommu_report_device_fault().
Call the fault handler only if the passed domain has a required cookie
type.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 6aa63a4ec947 ("iommu: Sort out domain user data")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250408213342.285955-1-pchelkin@ispras.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Pull drm fixes from Dave Airlie:
"Weekly fixes, as expected it has a bit more in it than probably usual
for rc2. amdgpu/xe/i915 lead the way with fixes all over for a bunch
of other drivers. Nothing major stands out from what I can see.
tests:
- Clean up struct drm_display_mode in various places
i915:
- Fix scanline offset for LNL+ and BMG+
- Fix GVT unterminated-string-initialization build warning
- Fix DP rate limit when sink doesn't support TPS4
- Handle GDDR + ECC memory type detection
- Fix VRR parameter change check
- Fix fence not released on early probe errors
- Disable render power gating during live selftests
xe:
- Add another BMG PCI ID
- Fix UAFs on migration paths
- Fix shift-out-of-bounds access on TLB invalidation
- Ensure ccs_mode is correctly set on gt reset
- Extend some HW workarounds to Xe3
- Fix PM runtime get/put on sysfs files
- Fix u64 division on 32b
- Fix flickering due to missing L3 invalidations
- Fix missing error code return
amdgpu:
- MES FW version caching fixes
- Only use GTT as a fallback if we already have a backing store
- dma_buf fix
- IP discovery fix
- Replay and PSR with VRR fix
- DC FP fixes
- eDP fixes
- KIQ TLB invalidate fix
- Enable dmem groups support
- Allow pinning VRAM dma bufs if imports can do P2P
- Workload profile fixes
- Prevent possible division by 0 in fan handling
amdkfd:
- Queue reset fixes
imagination:
- Fix overflow
- Fix use-after-free
ivpu:
- Fix suspend/resume
nouveau:
- Do not deref dangling pointer
rockchip:
- Set DP/HDMI registers correctly
udmabuf:
- Fix overflow
virtgpu:
- Set reservation lock on dma-buf import
- Fix error handling in prepare_fb"
* tag 'drm-fixes-2025-04-11-1' of https://gitlab.freedesktop.org/drm/kernel: (58 commits)
drm/rockchip: dw_hdmi_qp: Fix io init for dw_hdmi_qp_rockchip_resume
drm/rockchip: vop2: Fix interface enable/mux setting of DP1 on rk3588
drm/amdgpu/mes12: optimize MES pipe FW version fetching
drm/amd/pm/smu11: Prevent division by zero
drm/amdgpu: cancel gfx idle work in device suspend for s0ix
drm/amd/display: pause the workload setting in dm
drm/amdgpu/pm/swsmu: implement pause workload profile
drm/amdgpu/pm: add workload profile pause helper
drm/i915/huc: Fix fence not released on early probe errors
drm/i915/vrr: Add vrr.vsync_{start, end} in vrr_params_changed
drm/tests: probe-helper: Fix drm_display_mode memory leak
drm/tests: modes: Fix drm_display_mode memory leak
drm/tests: modes: Fix drm_display_mode memory leak
drm/tests: cmdline: Fix drm_display_mode memory leak
drm/tests: modeset: Fix drm_display_mode memory leak
drm/tests: modeset: Fix drm_display_mode memory leak
drm/tests: helpers: Create kunit helper to destroy a drm_display_mode
drm/xe: Restore EIO errno return when GuC PC start fails
drm/xe: Invalidate L3 read-only cachelines for geometry streams too
drm/xe: avoid plain 64-bit division
...
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Add another BMG PCI ID
- Fix UAFs on migration paths
- Fix shift-out-of-bounds access on TLB invalidation
- Ensure ccs_mode is correctly set on gt reset
- Extend some HW workarounds to Xe3
- Fix PM runtime get/put on sysfs files
- Fix u64 division on 32b
- Fix flickering due to missing L3 invalidations
- Fix missing error code return
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/unq5j26aejbrjz5nuvmdtcgupyix5bacpoahod4bdohlvwrney@kekimsi5ossx
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
imagination:
- Fix overflow
- Fix use-after-free
ivpu:
- Fix suspend/resume
nouveau:
- Do not deref dangling pointer
rockchip:
- Set DP/HDMI registers correctly
tests:
- Clean up struct drm_display_mode in various places
udmabuf:
- Fix overflow
virtgpu:
- Set reservation lock on dma-buf import
- Fix error handling in prepare_fb
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250410122414.GA32202@2a02-2454-fd5e-fd00-d686-8907-6053-f8d8.dyn6.pyur.net
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc irqchip fixes from Ingo Molnar:
- Fix NULL pointer dereference crashes due to missing .chip_flags setup
in the sg2042-msi and irq-bcm2712-mip irqchip drivers
- Remove the davinci aintc irqchip driver's leftover header too
* tag 'irq-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm2712-mip: Set EOI/ACK flags in msi_parent_ops
irqchip/sg2042-msi: Add missing chip flags
irqchip/davinci: Remove leftover header
|
|
Remove nq variable from bnxt_re_create_srq() and bnxt_re_destroy_srq()
as it generates the following compilation warnings:
>> drivers/infiniband/hw/bnxt_re/ib_verbs.c:1777:24: warning: variable
'nq' set but not used [-Wunused-but-set-variable]
1777 | struct bnxt_qplib_nq *nq = NULL;
| ^
drivers/infiniband/hw/bnxt_re/ib_verbs.c:1828:24: warning: variable
'nq' set but not used [-Wunused-but-set-variable]
1828 | struct bnxt_qplib_nq *nq = NULL;
| ^
2 warnings generated.
Fixes: 6b395d31146a ("RDMA/bnxt_re: Fix budget handling of notification queue")
Link: https://patch.msgid.link/r/8a4343e217d7d1c0a5a786b785c4ac57cb72a2a0.1744288299.git.leonro@nvidia.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504091055.CzgXnk4C-lkp@intel.com/
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This prints an error message if the FW Dump trigger command fails. This
scenario is mainly observed in legacy chipsets 8987 and 8997 and also
IW416, where this feature is unavailable due to memory constraints.
Fixes: 998e447f443f ("Bluetooth: btnxpuart: Add support for HCI coredump feature")
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This reverts the change baudrate logic in nxp_shutdown.
Earlier, when the driver was removed, it restored the controller
baudrate to fw_init_baudrate, so that on re-loading the driver, things
work fine.
However, if the driver was removed while hci0 interface is down, the
change baudrate vendor command could not be sent by the driver. When the
driver was re-loaded, host and controller baudrate would be mismatched
and hci initialization would fail. The only way to recover would be to
reboot the system.
This issue was fixed by moving the restore baudrate logic from
nxp_serdev_remove() to nxp_shutdown().
This fix however caused another issue with the command "hciconfig hci0
reset", which makes hci0 DOWN and UP immediately.
Running "bluetoothctl power off" and "bluetoothctl power on" in a tight
loop works fine.
To maintain support for "hciconfig reset" command, the above mentioned fix
is reverted.
Fixes: 6fca6781d19d ("Bluetooth: btnxpuart: Move vendor specific initialization to .post_init")
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The QCA_WCN3950_SOC_ID_S should be using qca/cmnv13s.bin, rather than
qca/cmnv13u.bin file. Correct the variant suffix to be used for this SoC
ID.
Fixes: d5712c511cb3 ("Bluetooth: qca: add WCN3950 support")
Reported-by: Wojciech Slenska <wsl@trackunit.com>
Closes: https://github.com/qualcomm-linux/meta-qcom/pull/817#discussion_r2022866431
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The btrtl_initialize() function checks that rtl_load_file() either
had an error or it loaded a zero length file. However, if it loaded
a zero length file then the error code is not set correctly. It
results in an error pointer vs NULL bug, followed by a NULL pointer
dereference. This was detected by Smatch:
drivers/bluetooth/btrtl.c:592 btrtl_initialize() warn: passing zero to 'ERR_PTR'
Fixes: 26503ad25de8 ("Bluetooth: btrtl: split the device initialization into smaller parts")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- core: hold instance lock during NETDEV_CHANGE
- rtnetlink: fix bad unlock balance in do_setlink()
- ipv6:
- fix null-ptr-deref in addrconf_add_ifaddr()
- align behavior across nexthops during path selection
Previous releases - regressions:
- sctp: prevent transport UaF in sendmsg
- mptcp: only inc MPJoinAckHMacFailure for HMAC failures
Previous releases - always broken:
- sched:
- make ->qlen_notify() idempotent
- ensure sufficient space when sending filter netlink notifications
- sch_sfq: really don't allow 1 packet limit
- netfilter: fix incorrect avx2 match of 5th field octet
- tls: explicitly disallow disconnect
- eth: octeontx2-pf: fix VF root node parent queue priority"
* tag 'net-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
ethtool: cmis_cdb: Fix incorrect read / write length extension
selftests: netfilter: add test case for recent mismatch bug
nft_set_pipapo: fix incorrect avx2 match of 5th field octet
net: ppp: Add bound checking for skb data on ppp_sync_txmung
net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod.
ipv6: Align behavior across nexthops during path selection
net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
net: phy: move phy_link_change() prior to mdio_bus_phy_may_suspend()
selftests/tc-testing: sfq: check that a derived limit of 1 is rejected
net_sched: sch_sfq: move the limit validation
net_sched: sch_sfq: use a temporary work area for validating configuration
net: libwx: handle page_pool_dev_alloc_pages error
selftests: mptcp: validate MPJoin HMacFailure counters
mptcp: only inc MPJoinAckHMacFailure for HMAC failures
rtnetlink: Fix bad unlock balance in do_setlink().
net: ethtool: Don't call .cleanup_data when prepare_data fails
tc: Ensure we have enough buffer space when sending filter netlink notifications
net: libwx: Fix the wrong Rx descriptor field
octeontx2-pf: qos: fix VF root node parent queue index
selftests: tls: check that disconnect does nothing
...
|
|
Pull NVMe updates from Christoph:
"nvme updates for Linux 6.15
- nvmet fc/fcloop refcounting fixes (Daniel Wagner)
- fix missed namespace/ANA scans (Hannes Reinecke)
- fix a use after free in the new TCP netns support (Kuniyuki Iwashima)
- fix a NULL instead of false review in multipath (Uday Shankar)"
* tag 'nvme-6.15-2025-04-10' of git://git.infradead.org/nvme:
nvmet-fc: put ref when assoc->del_work is already scheduled
nvmet-fc: take tgtport reference only once
nvmet-fc: update tgtport ref per assoc
nvmet-fc: inline nvmet_fc_free_hostport
nvmet-fc: inline nvmet_fc_delete_assoc
nvmet-fcloop: add ref counting to lport
nvmet-fcloop: replace kref with refcount
nvmet-fcloop: swap list_add_tail arguments
nvme-tcp: fix use-after-free of netns by kernel TCP socket.
nvme: multipath: fix return value of nvme_available_path
nvme: re-read ANA log page after ns scan completes
nvme: requeue namespace scan on missed AENs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A simple fix adding the module description of the Xenbus frontend
module
- A fix correcting the xen-acpi-processor Kconfig dependency for PVH
Dom0 support
- A fix for the Xen balloon driver when running as Xen Dom0 in PVH mode
- A fix for PVH Dom0 in order to avoid problems with CPU idle and
frequency drivers conflicting with Xen
* tag 'for-linus-6.15a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: disable CPU idle and frequency drivers for PVH dom0
x86/xen: fix balloon target initialization for PVH dom0
xen: Change xen-acpi-processor dom0 dependency
xenbus: add module description
|
|
Pull block fixes from Jens Axboe:
- Add a missing ublk selftest script, from test additions added last
week
- Two fixes for ublk error recovery and reissue
- Cleanup of ublk argument passing
* tag 'block-6.15-20250410' of git://git.kernel.dk/linux:
ublk: pass ublksrv_ctrl_cmd * instead of io_uring_cmd *
ublk: don't fail request for recovery & reissue in case of ubq->canceling
ublk: fix handling recovery & reissue in ublk_abort_queue()
selftests: ublk: fix test_stripe_04
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix resource handling in gpio-tegra186
- fix wakeup source leaks in gpio-mpc8xxx and gpio-zynq
- fix minor issues with some GPIO OF quirks
- deprecate GPIOD_FLAGS_BIT_NONEXCLUSIVE and devm_gpiod_unhinge()
symbols and add a TODO task to track replacing them with a better
solution
* tag 'gpio-fixes-for-v6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: of: Move Atmel HSMCI quirk up out of the regulator comment
gpiolib: of: Fix the choice for Ingenic NAND quirk
gpio: zynq: Fix wakeup source leaks on device unbind
gpio: mpc8xxx: Fix wakeup source leaks on device unbind
gpio: TODO: track the removal of regulator-related workarounds
MAINTAINERS: add more keywords for the GPIO subsystem entry
gpio: deprecate devm_gpiod_unhinge()
gpio: deprecate the GPIOD_FLAGS_BIT_NONEXCLUSIVE flag
gpio: tegra186: fix resource handling in ACPI probe path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"Two important fixes: the build of the SPI NAND layer with old GCC
versions as well as the fix of the Qpic Makefile which was wrong in
the first place.
There are also two smaller fixes about a missing error and status
check"
* tag 'mtd/fixes-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: spinand: Fix build with gcc < 7.5
mtd: rawnand: Add status chack in r852_ready()
mtd: inftlcore: Add error check for inftl_read_oob()
mtd: nand: Drop explicit test for built-in CONFIG_SPI_QPIC_SNAND
|
|
Call cs_dsp_mock_xm_header_get_fw_version() to get the firmware version
from the dummy XM header data in cs_dsp_bin_err_test_common_init().
Make the same change to cs_dsp_bin_test_common_init() and remove the
cs_dsp_mock_xm_header_get_fw_version_from_regmap() function.
The code in cs_dsp_test_bin.c was correctly calling
cs_dsp_mock_xm_header_get_fw_version_from_regmap() to fetch the fw version
from a dummy header it wrote to XM registers. However in
cs_dsp_test_bin_error.c the test doesn't stuff a dummy header into XM, it
populates it the normal way using a wmfw file. It should have called
cs_dsp_mock_xm_header_get_fw_version() to get the data from its blob
buffer, but was calling cs_dsp_mock_xm_header_get_fw_version_from_regmap().
As nothing had been written to the registers this returned the value of
uninitialized data.
The only other use of cs_dsp_mock_xm_header_get_fw_version_from_regmap()
was cs_dsp_test_bin.c, but it doesn't need to use it. It already has a
blob buffer containing the dummy XM header so it can use
cs_dsp_mock_xm_header_get_fw_version() to read from that.
Fixes: cd8c058499b6 ("firmware: cs_dsp: Add KUnit testing of bin error cases")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20250410132129.1312541-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
An error message is already emitted by the driver core function
call_driver_probe() when the driver probe fails. Therefore, this
redundant probe error message is removed.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20250410-spi-v1-2-56e867cc19cf@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Commit 40369bfe717e ("spi: fsl-qspi: use devm function instead of driver
remove") introduced managed cleanup via fsl_qspi_cleanup(), but
incorrectly retain manual cleanup in two scenarios:
- On devm_add_action_or_reset() failure, the function automatically call
fsl_qspi_cleanup(). However, the current code still jumps to
err_destroy_mutex, repeating cleanup.
- After the fsl_qspi_cleanup() action is added successfully, there is no
need to manually perform the cleanup in the subsequent error path.
However, the current code still jumps to err_destroy_mutex on spi
controller failure, repeating cleanup.
Skip redundant manual cleanup calls to fix these issues.
Cc: stable@vger.kernel.org
Fixes: 40369bfe717e ("spi: fsl-qspi: use devm function instead of driver remove")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Link: https://patch.msgid.link/20250410-spi-v1-1-56e867cc19cf@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Ensure we have enough data in linear buffer from skb before accessing
initial bytes. This prevents potential out-of-bounds accesses
when processing short packets.
When ppp_sync_txmung receives an incoming package with an empty
payload:
(remote) gef➤ p *(struct pppoe_hdr *) (skb->head + skb->network_header)
$18 = {
type = 0x1,
ver = 0x1,
code = 0x0,
sid = 0x2,
length = 0x0,
tag = 0xffff8880371cdb96
}
from the skb struct (trimmed)
tail = 0x16,
end = 0x140,
head = 0xffff88803346f400 "4",
data = 0xffff88803346f416 ":\377",
truesize = 0x380,
len = 0x0,
data_len = 0x0,
mac_len = 0xe,
hdr_len = 0x0,
it is not safe to access data[2].
Reported-by: syzbot+29fc8991b0ecb186cf40@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=29fc8991b0ecb186cf40
Tested-by: syzbot+29fc8991b0ecb186cf40@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Arnaud Lecomte <contact@arnaud-lcm.com>
Link: https://patch.msgid.link/20250408-bound-checking-ppp_txmung-v2-1-94bb6e1b92d0@arnaud-lcm.com
[pabeni@redhat.com: fixed subj typo]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use cfg->ctrl_ops->io_init callback make it work for all platform.
And it's also gets rid of code duplication
Fixes: 3f60dbd40d3f ("drm/rockchip: dw_hdmi_qp: Add platform ctrl callback")
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250317102757.565679-1-andyshrk@163.com
|
|
This is a copy-paste error, which affects DP1 usage.
Fixes: 328e6885996c ("drm/rockchip: vop2: Add platform specific callback")
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250312064218.524143-1-andyshrk@163.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.15-2025-04-09:
amdgpu:
- MES FW version caching fixes
- Only use GTT as a fallback if we already have a backing store
- dma_buf fix
- IP discovery fix
- Replay and PSR with VRR fix
- DC FP fixes
- eDP fixes
- KIQ TLB invalidate fix
- Enable dmem groups support
- Allow pinning VRAM dma bufs if imports can do P2P
- Workload profile fixes
- Prevent possible division by 0 in fan handling
amdkfd:
- Queue reset fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250409165238.1180153-1-alexander.deucher@amd.com
|
|
phylink-controlled PHY
DSA has 2 kinds of drivers:
1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
2. Those who don't: all others. The above methods should be optional.
For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
These seem good candidates for setting mac_managed_pm = true because
that is essentially its definition [1], but that does not seem to be the
biggest problem for now, and is not what this change focuses on.
Talking strictly about the 2nd category of DSA drivers here (which
do not have MAC managed PM, meaning that for their attached PHYs,
mdio_bus_phy_suspend() and mdio_bus_phy_resume() should run in full),
I have noticed that the following warning from mdio_bus_phy_resume() is
triggered:
WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
phydev->state != PHY_UP);
because the PHY state machine is running.
It's running as a result of a previous dsa_user_open() -> ... ->
phylink_start() -> phy_start() having been initiated by the user.
The previous mdio_bus_phy_suspend() was supposed to have called
phy_stop_machine(), but it didn't. So this is why the PHY is in state
PHY_NOLINK by the time mdio_bus_phy_resume() runs.
mdio_bus_phy_suspend() did not call phy_stop_machine() because for
phylink, the phydev->adjust_link function pointer is NULL. This seems a
technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
machine restart on resume"). That commit was written before phylink
existed, and was intended to avoid crashing with consumer drivers which
don't use the PHY state machine - phylink always does, when using a PHY.
But phylink itself has historically not been developed with
suspend/resume in mind, and apparently not tested too much in that
scenario, allowing this bug to exist unnoticed for so long. Plus, prior
to the WARN_ON(), it would have likely been invisible.
This issue is not in fact restricted to type 2 DSA drivers (according to
the above ad-hoc classification), but can be extrapolated to any MAC
driver with phylink and MDIO-bus-managed PHY PM ops. DSA is just where
the issue was reported. Assuming mac_managed_pm is set correctly, a
quick search indicates the following other drivers might be affected:
$ grep -Zlr PHYLINK_NETDEV drivers/ | xargs -0 grep -L mac_managed_pm
drivers/net/ethernet/atheros/ag71xx.c
drivers/net/ethernet/microchip/sparx5/sparx5_main.c
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
drivers/net/ethernet/freescale/ucc_geth.c
drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
drivers/net/ethernet/marvell/mvneta.c
drivers/net/ethernet/marvell/prestera/prestera_main.c
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/altera/altera_tse_main.c
drivers/net/ethernet/wangxun/txgbe/txgbe_phy.c
drivers/net/ethernet/meta/fbnic/fbnic_phylink.c
drivers/net/ethernet/tehuti/tn40_phy.c
drivers/net/ethernet/mscc/ocelot_net.c
Make the existing conditions dependent on the PHY device having a
phydev->phy_link_change() implementation equal to the default
phy_link_change() provided by phylib. Otherwise, we implicitly know that
the phydev has the phylink-provided phylink_phy_change() callback, and
when phylink is used, the PHY state machine always needs to be stopped/
started on the suspend/resume path. The code is structured as such that
if phydev->phy_link_change() is absent, it is a matter of time until the
kernel will crash - no need to further complicate the test.
Thus, for the situation where the PM is not managed by the MAC, we will
make the MDIO bus PM ops treat identically the phylink-controlled PHYs
with the phylib-controlled PHYs where an adjust_link() callback is
supplied. In both cases, the MDIO bus PM ops should stop and restart the
PHY state machine.
[1] https://lore.kernel.org/netdev/Z-1tiW9zjcoFkhwc@shell.armlinux.org.uk/
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Reported-by: Wei Fang <wei.fang@nxp.com>
Tested-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250407094042.2155633-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In an upcoming change, mdio_bus_phy_may_suspend() will need to
distinguish a phylib-based PHY client from a phylink PHY client.
For that, it will need to compare the phydev->phy_link_change() function
pointer with the eponymous phy_link_change() provided by phylib.
To avoid forward function declarations, the default PHY link state
change method should be moved upwards. There is no functional change
associated with this patch, it is only to reduce the noise from a real
bug fix.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250407093900.2155112-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzkaller triggered an oversized kvmalloc() warning.
Silence it by adding __GFP_NOWARN.
syzkaller log:
WARNING: CPU: 7 PID: 518 at mm/util.c:665 __kvmalloc_node_noprof+0x175/0x180
CPU: 7 UID: 0 PID: 518 Comm: c_repro Not tainted 6.11.0-rc6+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:__kvmalloc_node_noprof+0x175/0x180
RSP: 0018:ffffc90001e67c10 EFLAGS: 00010246
RAX: 0000000000000100 RBX: 0000000000000400 RCX: ffffffff8149d46b
RDX: 0000000000000000 RSI: ffff8881030fae80 RDI: 0000000000000002
RBP: 000000712c800000 R08: 0000000000000100 R09: 0000000000000000
R10: ffffc90001e67c10 R11: 0030ae0601000000 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000000
FS: 00007fde79159740(0000) GS:ffff88813bdc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000180 CR3: 0000000105eb4005 CR4: 00000000003706b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ib_umem_odp_get+0x1f6/0x390
mlx5_ib_reg_user_mr+0x1e8/0x450
ib_uverbs_reg_mr+0x28b/0x440
ib_uverbs_write+0x7d3/0xa30
vfs_write+0x1ac/0x6c0
ksys_write+0x134/0x170
? __sanitizer_cov_trace_pc+0x1c/0x50
do_syscall_64+0x50/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 37824952dc8f ("RDMA/odp: Use kvcalloc for the dma_list and page_list")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Link: https://patch.msgid.link/c6cb92379de668be94894f49c2cfa40e73f94d56.1742388096.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9e7b08d239c2f21e8f417854f81e5ff40edbebff)
Cc: stable@vger.kernel.org # 6.12.x
|