linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-11-28	platform/x86: asus-wmi: disable USB0 hub on ROG Ally before suspend	Luke D. Jones
	ASUS have worked around an issue in XInput where it doesn't support USB selective suspend, which causes suspend issues in Windows. They worked around this by adjusting the MCU firmware to disable the USB0 hub when the screen is switched off during the Microsoft DSM suspend path in ACPI. The issue we have with this however is one of timing - the call the tells the MCU to this isn't able to complete before suspend is done so we call this in a prepare() and add a small msleep() to ensure it is done. This must be done before the screen is switched off to prevent a variety of possible races. Further to this the MCU powersave option must also be disabled as it can cause a number of issues such as: - unreliable resume connection of N-Key - complete loss of N-Key if the power is plugged in while suspended Disabling the powersave option prevents this. Without this the MCU is unable to initialise itself correctly on resume. Signed-off-by: "Luke D. Jones" <luke@ljones.dev> Tested-by: Philip Mueller <philm@manjaro.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20231126230521.125708-2-luke@ljones.dev Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-11-28	drm/imagination: vm: fix drm_gpuvm reference count	Danilo Krummrich
	The driver specific reference count indicates whether the VM should be teared down, whereas GPUVM's reference count indicates whether the VM structure can finally be freed. Hence, free the VM structure in pvr_gpuvm_free() and drop the last GPUVM reference after tearing down the VM. Generally, this prevents lifetime issues such as the VM being freed as long as drm_gpuvm_bo structures still hold references to the VM. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Donald Robson <donald.robson@imgtec.com> Tested-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231124233650.152653-4-dakr@redhat.com
2023-11-28	drm/imagination: vm: check for drm_gpuvm_range_valid()	Danilo Krummrich
	Extend pvr_device_addr_and_size_are_valid() by the corresponding GPUVM sanity checks. This includes a, previously missing, overflow check for the base address and size of the requested mapping. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Donald Robson <donald.robson@imgtec.com> Tested-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231124233650.152653-3-dakr@redhat.com
2023-11-28	drm/imagination: vm: prevent duplicate drm_gpuvm_bo instances	Danilo Krummrich
	Use drm_gpuvm_bo_obtain() instead of drm_gpuvm_bo_create(). The latter should only be used in conjunction with drm_gpuvm_bo_obtain_prealloc(). drm_gpuvm_bo_obtain() re-uses existing instances of a given VM and BO combination, whereas drm_gpuvm_bo_create() would always create a new instance of struct drm_gpuvm_bo and hence leave us with duplicates. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Donald Robson <donald.robson@imgtec.com> Tested-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231124233650.152653-2-dakr@redhat.com
2023-11-28	drm/imagination: Remove unneeded semicolon	Yang Li
	./drivers/gpu/drm/imagination/pvr_free_list.c:258:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7635 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231127010430.119666-1-yang.lee@linux.alibaba.com
2023-11-28	drm/imagination: Fix a couple of spelling mistakes in literal strings	Colin Ian King
	There are a couple of spelling mistakes in literal strings in the stid_fmts array. Fix these. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231124163917.300685-1-colin.i.king@gmail.com
2023-11-28	r8169: prevent potential deadlock in rtl8169_close	Heiner Kallweit
	ndo_stop() is RTNL-protected by net core, and the worker function takes RTNL as well. Therefore we will deadlock when trying to execute a pending work synchronously. To fix this execute any pending work asynchronously. This will do no harm because netif_running() is false in ndo_stop(), and therefore the work function is effectively a no-op. However we have to ensure that no task is running or pending after rtl_remove_one(), therefore add a call to cancel_work_sync(). Fixes: abe5fc42f9ce ("r8169: use RTNL to protect critical sections") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/12395867-1d17-4cac-aa7d-c691938fcddf@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28	r8169: fix deadlock on RTL8125 in jumbo mtu mode	Heiner Kallweit
	The original change results in a deadlock if jumbo mtu mode is used. Reason is that the phydev lock is held when rtl_reset_work() is called here, and rtl_jumbo_config() calls phy_start_aneg() which also tries to acquire the phydev lock. Fix this by calling rtl_reset_work() asynchronously. Fixes: 621735f59064 ("r8169: fix rare issue with broken rx after link-down on RTL8125") Reported-by: Ian Chen <free122448@hotmail.com> Tested-by: Ian Chen <free122448@hotmail.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/caf6a487-ef8c-4570-88f9-f47a659faf33@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28	efi/unaccepted: Fix off-by-one when checking for overlapping ranges	Michael Roth
	When a task needs to accept memory it will scan the accepting_list to see if any ranges already being processed by other tasks overlap with its range. Due to an off-by-one in the range comparisons, a task might falsely determine that an overlapping range is being accepted, leading to an unnecessary delay before it begins processing the range. Fix the off-by-one in the range comparison to prevent this and slightly improve performance. Fixes: 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused by parallel memory acceptance") Link: https://lore.kernel.org/linux-mm/20231101004523.vseyi5bezgfaht5i@amd.com/T/#me2eceb9906fcae5fe958b3fe88e41f920f8335b6 Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-11-28	xen/events: fix error code in xen_bind_pirq_msi_to_irq()	Dan Carpenter
	Return -ENOMEM if xen_irq_init() fails. currently the code returns an uninitialized variable or zero. Fixes: 5dd9ad32d775 ("xen/events: drop xen_allocate_irqs_dynamic()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Juergen Gross <jgross@ssue.com> Link: https://lore.kernel.org/r/3b9ab040-a92e-4e35-b687-3a95890a9ace@moroto.mountain Signed-off-by: Juergen Gross <jgross@suse.com>
2023-11-28	octeontx2-pf: Restore TC ingress police rules when interface is up	Subbaraya Sundeep
	TC ingress policer rules depends on interface receive queue contexts since the bandwidth profiles are attached to RQ contexts. When an interface is brought down all the queue contexts are freed. This in turn frees bandwidth profiles in hardware causing ingress police rules non-functional after the interface is brought up. Fix this by applying all the ingress police rules config to hardware in otx2_open. Also allow adding ingress rules only when interface is running since no contexts exist for the interface when it is down. Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930217-5707-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28	octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64	Geetha sowjanya
	When more than 64 VFs are enabled for a PF then mbox communication between VF and PF is not working as mbox work queueing for few VFs are skipped due to wrong calculation of VF numbers. Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930042-5400-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28	Merge v6.7-rc3 into drm-next	Daniel Vetter
	Thomas Zimermann needs 8d6ef26501 ("drm/ast: Disconnect BMC if physical connector is connected") for further ast work in -next. Minor conflicts in ivpu between 3de6d9597892 ("accel/ivpu: Pass D0i3 residency time to the VPU firmware") and 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset") changing adjacent lines. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-11-28	net: stmmac: xgmac: Disable FPE MMC interrupts	Furong Xu
	Commit aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") tries to disable MMC interrupts to avoid a storm of unhandled interrupts, but leaves the FPE(Frame Preemption) MMC interrupts enabled, FPE MMC interrupts can cause the same problem. Now we mask FPE TX and RX interrupts to disable all MMC interrupts. Fixes: aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/20231125060126.2328690-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28	drm/gpuvm: Fix deprecated license identifier	Thomas Hellström
	"GPL-2.0-only" in the license header was incorrectly changed to the now deprecated "GPL-2.0". Fix. Cc: Maxime Ripard <mripard@kernel.org> Cc: Danilo Krummrich <dakr@redhat.com> Reported-by: David Edelsohn <dje.gcc@gmail.com> Closes: https://lore.kernel.org/dri-devel/5lfrhdpkwhpgzipgngojs3tyqfqbesifzu5nf4l5q3nhfdhcf2@25nmiq7tfrew/T/#m5c356d68815711eea30dd94cc6f7ea8cd4344fe3 Fixes: f7749a549b4f ("drm/gpuvm: Dual-licence the drm_gpuvm code GPL-2.0 OR MIT") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231106114827.62492-1-thomas.hellstrom@linux.intel.com
2023-11-28	Revert "drm/bridge: panel: Add a device link between drm device and panel ↵	Linus Walleij
	device" This reverts commit 199cf07ebd2b0d41185ac79b895547d45610b681. This patch creates bugs on devices where the DRM device is the ancestor of the panel devices. Attempts to fix this have failed because it leads to using device core functionality which is questionable. Reported-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/lkml/CACRpkdaGzXD6HbiX7mVUNJAJtMEPG00Pp6+nJ1P0JrfJ-ArMvQ@mail.gmail.com/T/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-3-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-3-69bb05048dae@linaro.org
2023-11-28	Revert "driver core: Export device_is_dependent() to modules"	Linus Walleij
	This reverts commit 1d5e8f4bf06da86b71cc9169110d1a0e1e7af337. Greg says: "why exactly is this needed? Nothing outside of the driver core should be needing this function, it shouldn't be public at all (I missed that before.) So please, revert it for now, let's figure out why DRM thinks this is needed for it's devices, and yet no other bus/subsystem does." Link: https://lore.kernel.org/dri-devel/2023112739-willing-sighing-6bdd@gregkh/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-1-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-1-69bb05048dae@linaro.org
2023-11-28	Revert "drm/bridge: panel: Check device dependency before managing device link"	Linus Walleij
	This reverts commit 39d5b6a64ace77d0c11c398d272218df5f939abb. This patch was causing build errors by using an unexported function from the device core, which Greg questions the saneness in exporting. Link: https://lore.kernel.org/lkml/CACRpkdaGzXD6HbiX7mVUNJAJtMEPG00Pp6+nJ1P0JrfJ-ArMvQ@mail.gmail.com/T/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231128-revert-panel-fix-v1-2-69bb05048dae@linaro.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231128-revert-panel-fix-v1-2-69bb05048dae@linaro.org
2023-11-28	octeontx2-af: Fix possible buffer overflow	Elena Salomatkina
	A loop in rvu_mbox_handler_nix_bandprof_free() contains a break if (idx == MAX_BANDPROF_PER_PFFUNC), but if idx may reach MAX_BANDPROF_PER_PFFUNC buffer '(*req->prof_idx)[layer]' overflow happens before that check. The patch moves the break to the beginning of the loop. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support"). Signed-off-by: Elena Salomatkina <elena.salomatkina.cmc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20231124210802.109763-1-elena.salomatkina.cmc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-27	Merge tag 'media/v6.7-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab. * tag 'media/v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: pci: mgb4: add COMMON_CLK dependency media: v4l2-subdev: Fix a 64bit bug media: mgb4: Added support for T200 card variant media: vsp1: Remove unbalanced .s_stream(0) calls
2023-11-27	netkit: Reject IFLA_NETKIT_PEER_INFO in netkit_change_link	Daniel Borkmann
	The IFLA_NETKIT_PEER_INFO attribute can only be used during device creation, but not via changelink callback. Hence reject it there. Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Cc: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/e86a277a1e8d3b19890312779e42f790b0605ea4.1701115314.git.daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-27	nvme: check for valid nvme_identify_ns() before using it	Ewan D. Milne
	When scanning namespaces, it is possible to get valid data from the first call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second call in nvme_update_ns_info_block(). In particular, if the NSID becomes inactive between the two commands, a storage device may return a buffer filled with zero as per 4.1.5.1. In this case, we can get a kernel crash due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will be set to zero. PID: 326 TASK: ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10" #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926 [exception RIP: blk_stack_limits+434] RIP: ffffffff92191872 RSP: ffffad8f8702fc80 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff95efa0c91800 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 00000000ffffffff R8: ffff95fec7df35a8 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff95fed33c09a8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core] #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core] This happened when the check for valid data was moved out of nvme_identify_ns() into one of the callers. Fix this by checking in both callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186 Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27	of: dynamic: Fix of_reconfig_get_state_change() return value documentation	Luca Ceresoli
	The documented numeric return values do not match the actual returned values. Fix them by using the enum names instead of raw numbers. Fixes: b53a2340d0d3 ("of/reconfig: Add of_reconfig_get_state_change() of notifier helper.") Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://lore.kernel.org/r/20231123-fix-of_reconfig_get_state_change-docs-v1-1-f51892050ff9@bootlin.com Signed-off-by: Rob Herring <robh@kernel.org>
2023-11-27	Merge branch 'cpufreq/arm/linux-next' of ↵	Rafael J. Wysocki
	git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge ARM cpufreq fixes for 6.7-rc4 from Viresh Kumar. These fix issues related to power domains in the qcom cpufreq driver and an OPP-related issue in the imx6q cpufreq driver. * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: pmdomain: qcom: rpmpd: Set GENPD_FLAG_ACTIVE_WAKEUP cpufreq: qcom-nvmem: Preserve PM domain votes in system suspend cpufreq: qcom-nvmem: Enable virtual power domain devices cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
2023-11-27	ACPI: video: Use acpi_video_device for cooling-dev driver data	Hans de Goede
	The acpi_video code was storing the acpi_video_device as driver_data in the acpi_device children of the acpi_video_bus acpi_device. But the acpi_video driver only binds to the bus acpi_device. It uses, but does not bind to, the children. Since it is not the driver it should not be using the driver_data of the children's acpi_device-s. Since commit 0d16710146a1 ("ACPI: bus: Set driver_data to NULL every time .add() fails") the childen's driver_data ends up getting set to NULL after a driver fails to bind to the children leading to a NULL pointer deref in video_get_max_state when registering the cooling-dev: [ 3.148958] BUG: kernel NULL pointer dereference, address: 0000000000000090 <snip> [ 3.149015] Hardware name: Sony Corporation VPCSB2X9R/VAIO, BIOS R2087H4 06/15/2012 [ 3.149021] RIP: 0010:video_get_max_state+0x17/0x30 [video] <snip> [ 3.149105] Call Trace: [ 3.149110] <TASK> [ 3.149114] ? __die+0x23/0x70 [ 3.149126] ? page_fault_oops+0x171/0x4e0 [ 3.149137] ? exc_page_fault+0x7f/0x180 [ 3.149147] ? asm_exc_page_fault+0x26/0x30 [ 3.149158] ? video_get_max_state+0x17/0x30 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149176] ? __pfx_video_get_max_state+0x10/0x10 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149192] __thermal_cooling_device_register.part.0+0xf2/0x2f0 [ 3.149205] acpi_video_bus_register_backlight.part.0.isra.0+0x414/0x570 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149227] acpi_video_register_backlight+0x57/0x80 [video 9b6f3f0d19d7b4a0e2df17a2d8b43bc19c2ed71f] [ 3.149245] intel_acpi_video_register+0x68/0x90 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.149669] intel_display_driver_register+0x28/0x50 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.150064] i915_driver_probe+0x790/0xb90 [i915 1f3a758130b32ef13d301d4f8f78c7d766d57f2a] [ 3.150402] local_pci_probe+0x45/0xa0 [ 3.150412] pci_device_probe+0xc1/0x260 <snip> Fix this by directly using the acpi_video_device as devdata for the cooling-device, which avoids the need to set driver-data on the children at all. Fixes: 0d16710146a1 ("ACPI: bus: Set driver_data to NULL every time .add() fails") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9718 Cc: 6.6+ <stable@vger.kernel.org> # 6.6+ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-27	drm: Use device_get_match_data()	Rob Herring
	Use preferred device_get_match_data() instead of of_match_device() to get the driver match data in a single step. With this, adjust the includes to explicitly include the correct headers. That also serves as preparation to remove implicit includes within the DT headers (of_device.h in particular). Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au> Link: https://lore.kernel.org/r/20231020125214.2930329-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-11-27	dma-buf: fix check in dma_resv_add_fence	Christian König
	It's valid to add the same fence multiple times to a dma-resv object and we shouldn't need one extra slot for each. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: a3f7c10a269d5 ("dma-buf/dma-resv: check if the new fence is really later") Cc: stable@vger.kernel.org # v5.19+ Link: https://patchwork.freedesktop.org/patch/msgid/20231115093035.1889-1-christian.koenig@amd.com
2023-11-27	nvme-core: fix a memory leak in nvme_ns_info_from_identify()	Maurizio Lombardi
	In case of error, free the nvme_id_ns structure that was allocated by nvme_identify_ns(). Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27	nvme: fine-tune sending of first keep-alive	Mark O'Donovan
	Keep-alive commands are sent half-way through the kato period. This normally works well but fails when the keep-alive system is started when we are more than half way through the kato. This can happen on larger setups or due to host delays. With this change we now time the initial keep-alive command from the controller initialisation time, rather than the keep-alive mechanism activation time. Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-11-27	vfio/pds: Fix possible sleep while in atomic context	Brett Creeley
	The driver could possibly sleep while in atomic context resulting in the following call trace while CONFIG_DEBUG_ATOMIC_SLEEP=y is set: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 Call Trace: <TASK> dump_stack_lvl+0x36/0x50 __might_resched+0x123/0x170 mutex_lock+0x1e/0x50 pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci] pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci] pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 This can happen if pds_vfio_put_restore_file() and/or pds_vfio_put_save_file() grab the mutex_lock(&lm_file->lock) while the spin_lock(&pds_vfio->reset_lock) is held, which can happen during while calling pds_vfio_state_mutex_unlock(). Fix this by changing the reset_lock to reset_mutex so there are no such conerns. Also, make sure to destroy the reset_mutex in the driver specific VFIO device release function. This also fixes a spinlock bad magic BUG that was caused by not calling spinlock_init() on the reset_lock. Since, the lock is being changed to a mutex, make sure to call mutex_init() on it. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/ Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-11-27	vfio/pds: Fix mutex lock->magic != lock warning	Brett Creeley
	The following BUG was found when running on a kernel with CONFIG_DEBUG_MUTEXES=y set: DEBUG_LOCKS_WARN_ON(lock->magic != lock) RIP: 0010:mutex_trylock+0x10d/0x120 Call Trace: <TASK> ? __warn+0x85/0x140 ? mutex_trylock+0x10d/0x120 ? report_bug+0xfc/0x1e0 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? mutex_trylock+0x10d/0x120 ? mutex_trylock+0x10d/0x120 pds_vfio_reset+0x3a/0x60 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 As shown, lock->magic != lock. This is because mutex_init(&pds_vfio->state_mutex) is called in the VFIO open path. So, if a reset is initiated before the VFIO device is opened the mutex will have never been initialized. Fix this by calling mutex_init(&pds_vfio->state_mutex) in the VFIO init path. Also, don't destroy the mutex on close because the device may be re-opened, which would cause mutex to be uninitialized. Fix this by implementing a driver specific vfio_device_ops.release callback that destroys the mutex before calling vfio_pci_core_release_dev(). Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-2-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-11-27	driver core: Export device_is_dependent() to modules	Liu Ying
	Export device_is_dependent() since the drm_kms_helper module is starting to use it. Signed-off-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231127051414.3783108-2-victor.liu@nxp.com
2023-11-27	pmdomain: arm: Avoid polling for scmi_perf_domain	Ulf Hansson
	It was a mistake to prefer polling based mode when setting a performance level for a domain. Let's instead rely on the protocol to decide what is best and thus avoid polling when possible. Reported-by: Nikunj Kela <nkela@quicinc.com> Fixes: 2af23ceb8624 ("pmdomain: arm: Add the SCMI performance domain") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20231127135033.136442-1-ulf.hansson@linaro.org
2023-11-27	drm/sched: Fix compilation issues with DRM priority rename	Luben Tuikov
	Fix compilation issues with DRM scheduler priority rename MIN to LOW. Signed-off-by: Luben Tuikov <ltuikov89@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311252109.WgbJsSkG-lkp@intel.com/ Cc: Danilo Krummrich <dakr@redhat.com> Cc: Frank Binns <frank.binns@imgtec.com> Cc: Donald Robson <donald.robson@imgtec.com> Cc: Matt Coster <matt.coster@imgtec.com> Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> Fixes: fe375c74806dbd ("drm/sched: Rename priority MIN to LOW") Fixes: 38f922a563aac3 ("drm/sched: Reverse run-queue priority enumeration") Fixes: 5f03a507b29e44 ("drm/nouveau: implement 1:1 scheduler - entity relationship") Link: https://patchwork.freedesktop.org/patch/msgid/20231125192246.87268-2-ltuikov89@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/7429262c-6dea-4dcc-bf7e-54d2277dabf1@amd.com
2023-11-27	misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write	Su Hui
	Clang static analyzer complains that value stored to 'rets' is never read.Let 'buf_len = -EOVERFLOW' to make sure we can return '-EOVERFLOW'. Fixes: 8c8d964ce90f ("mei: move hbuf_depth from the mei device to the hw modules") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20231120095523.178385-2-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-27	misc: mei: client.c: return negative error code in mei_cl_write	Su Hui
	mei_msg_hdr_init() return negative error code, rets should be 'PTR_ERR(mei_hdr)' rather than '-PTR_ERR(mei_hdr)'. Fixes: 0cd7c01a60f8 ("mei: add support for mei extended header.") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20231120095523.178385-1-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-27	mei: pxp: fix mei_pxp_send_message return value	Alexander Usyskin
	mei_pxp_send_message() should return zero on success and cannot propagate number of bytes as returned by internally called mei_cldev_send(). Fixes: ee5cb39348e6 ("mei: pxp: recover from recv fail under memory pressure") Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20231126092449.88310-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-27	wifi: mac80211: use wiphy locked debugfs helpers for agg_status	Johannes Berg
	The read is currently with RCU and the write can deadlock, convert both for the sake of illustration. Make mac80211 depend on cfg80211 debugfs to get the helpers, but mac80211 debugfs without it does nothing anyway. This also required some adjustments in ath9k. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-11-27	iommu/vt-d: Set variable intel_dirty_ops to static	Kunwu Chan
	Fix the following warning: drivers/iommu/intel/iommu.c:302:30: warning: symbol 'intel_dirty_ops' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20231120101025.1103404-1-chentao@kylinos.cn Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Fix incorrect cache invalidation for mm notification	Lu Baolu
	Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") moved the secondary TLB invalidations into the TLB invalidation functions to ensure that all secondary TLB invalidations happen at the same time as the CPU invalidation and added a flush-all type of secondary TLB invalidation for the batched mode, where a range of [0, -1UL) is used to indicates that the range extends to the end of the address space. However, using an end address of -1UL caused an overflow in the Intel IOMMU driver, where the end address was rounded up to the next page. As a result, both the IOTLB and device ATC were not invalidated correctly. Add a flush all helper function and call it when the invalidation range is from 0 to -1UL, ensuring that the entire caches are invalidated correctly. Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") Cc: stable@vger.kernel.org Cc: Huang Ying <ying.huang@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Tested-by: Luo Yuzhang <yuzhang.luo@intel.com> # QAT Tested-by: Tony Zhu <tony.zhu@intel.com> # DSA Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20231117090933.75267-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Add MTL to quirk list to skip TE disabling	Abdul Halim, Mohd Syazwan
	The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before switching address translation on or off and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. Add MTL to the quirk list for those devices and skips TE disabling if the qurik hits. Fixes: b1012ca8dc4f ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu") Cc: stable@vger.kernel.org Signed-off-by: Abdul Halim, Mohd Syazwan <mohd.syazwan.abdul.halim@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20231116022324.30120-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Make context clearing consistent with context mapping	Lu Baolu
	In the iommu probe_device path, domain_context_mapping() allows setting up the context entry for a non-PCI device. However, in the iommu release_device path, domain_context_clear() only clears context entries for PCI devices. Make domain_context_clear() behave consistently with domain_context_mapping() by clearing context entries for both PCI and non-PCI devices. Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20231114011036.70142-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Disable PCI ATS in legacy passthrough mode	Lu Baolu
	When IOMMU hardware operates in legacy mode, the TT field of the context entry determines the translation type, with three supported types (Section 9.3 Context Entry): - DMA translation without device TLB support - DMA translation with device TLB support - Passthrough mode with translated and translation requests blocked Device TLB support is absent when hardware is configured in passthrough mode. Disable the PCI ATS feature when IOMMU is configured for passthrough translation type in legacy (non-scalable) mode. Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20231114011036.70142-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Omit devTLB invalidation requests when TES=0	Lu Baolu
	The latest VT-d spec indicates that when remapping hardware is disabled (TES=0 in Global Status Register), upstream ATS Invalidation Completion requests are treated as UR (Unsupported Request). Consequently, the spec recommends in section 4.3 Handling of Device-TLB Invalidations that software refrain from submitting any Device-TLB invalidation requests when address remapping hardware is disabled. Verify address remapping hardware is enabled prior to submitting Device- TLB invalidation requests. Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20231114011036.70142-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu/vt-d: Support enforce_cache_coherency only for empty domains	Lu Baolu
	The enforce_cache_coherency callback ensures DMA cache coherency for devices attached to the domain. Intel IOMMU supports enforced DMA cache coherency when the Snoop Control bit in the IOMMU's extended capability register is set. Supporting it differs between legacy and scalable modes. In legacy mode, it's supported page-level by setting the SNP field in second-stage page-table entries. In scalable mode, it's supported in PASID-table granularity by setting the PGSNP field in PASID-table entries. In legacy mode, mappings before attaching to a device have SNP fields cleared, while mappings after the callback have them set. This means partial DMAs are cache coherent while others are not. One possible fix is replaying mappings and flipping SNP bits when attaching a domain to a device. But this seems to be over-engineered, given that all real use cases just attach an empty domain to a device. To meet practical needs while reducing mode differences, only support enforce_cache_coherency on a domain without mappings if SNP field is used. Fixes: fc0051cb9590 ("iommu/vt-d: Check domain force_snooping against attached devices") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20231114011036.70142-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	drm/ttm: Schedule delayed_delete worker closer	Rajneesh Bhardwaj
	Try to allocate system memory on the NUMA node the device is closest to and try to run delayed_delete workers on a CPU of this node as well. To optimize the memory clearing operation when a TTM BO gets freed by the delayed_delete worker, scheduling it closer to a NUMA node where the memory was initially allocated helps avoid the cases where the worker gets randomly scheduled on the CPU cores that are across interconnect boundaries such as xGMI, PCIe etc. This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD APU platforms such as GFXIP9.4.3. Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231111130856.1168304-1-rajneesh.bhardwaj@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2023-11-27	iommu: Avoid more races around device probe	Robin Murphy
	It turns out there are more subtle races beyond just the main part of __iommu_probe_device() itself running in parallel - the dev_iommu_free() on the way out of an unsuccessful probe can still manage to trip up concurrent accesses to a device's fwspec. Thus, extend the scope of iommu_probe_device_lock() to also serialise fwspec creation and initial retrieval. Reported-by: Zhenhua Huang <quic_zhenhuah@quicinc.com> Link: https://lore.kernel.org/linux-iommu/e2e20e1c-6450-4ac5-9804-b0000acdf7de@quicinc.com/ Fixes: 01657bc14a39 ("iommu: Avoid races around device probe") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: André Draszik <andre.draszik@linaro.org> Tested-by: André Draszik <andre.draszik@linaro.org> Link: https://lore.kernel.org/r/16f433658661d7cadfea51e7c65da95826112a2b.1700071477.git.robin.murphy@arm.com Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	iommu: Flow ERR_PTR out from __iommu_domain_alloc()	Jason Gunthorpe
	Most of the calling code now has error handling that can carry an error code further up the call chain. Keep the exported interface iommu_domain_alloc() returning NULL and reflow the internal code to use ERR_PTR not NULL for domain allocation failure. Optionally allow drivers to return ERR_PTR from any of the alloc ops. Many of the new ops (user, sva, etc) already return ERR_PTR, so having two rules is confusing and hard on drivers. This fixes a bug in DART that was returning ERR_PTR. Fixes: 482feb5c6492 ("iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/linux-iommu/b85e0715-3224-4f45-ad6b-ebb9f08c015d@moroto.mountain/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/0-v2-55ae413017b8+97-domain_alloc_err_ptr_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27	drm/i915/psr: Add proper handling for disabling sel fetch for planes	Jouni Högander
	Currently we are enabling selective fetch for all planes that are visible. This is suboptimal as we might be fetching for memory for planes that are not part of selective update. Fix this by adding proper handling for disabling plane selective fetch: If plane previously part of selective update is now not part of update: Add it into updated planes and let the plane configuration to disable selective fetch for it. v3: Checkpatch warnings fixed v2: - Add setting sel_fetch_area->y1/y2 to -1 - Remove setting again local sel_fetch_area variable Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231120082606.3156488-3-jouni.hogander@intel.com
2023-11-27	drm/i915/psr: Move plane sel fetch configuration into plane source files	Jouni Högander
	Currently selective fetch configuration for planes is implemented in psr code. More suitable place for this code is where everything else is configured for planes -> move it into skl_universal_plane.c and intel_cursor.c. This also allows us to drop hooks for cursor handling. v3: Checkpatch warnings fixed v2: Removed setting sel_fetch_area->y1/y2 as -1 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231120082606.3156488-2-jouni.hogander@intel.com