linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-11-05	nvme-core: remove repeated wq flags	Chaitanya Kulkarni
	In nvme_core_init() nvme_wq, nvme_reset_wq, nvme_delete_wq share same flags :- WQ_UNBOUND \| WQ_MEM_RECLAIM \| WQ_SYSFS. Insated of repeating these flags in each call use the common variable. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-11-05	nvmet: make nvmet_wq visible in sysfs	Guixin Liu
	In some complex scenarios, we deploy multiple tasks on a single machine (hybrid deployment), such as Docker containers for function computation (background processing), real-time tasks, monitoring, event handling, and management, along with an NVMe target server. Each of these components is restricted to its own CPU cores to prevent mutual interference and ensure strict isolation. To achieve this level of isolation for nvmet_wq we need to use sysfs tunables such as cpumask that are currently not accessible. Add WQ_SYSFS flag to alloc_workqueue() when creating nvmet_wq so workqueue tunables are exported in the userspace via sysfs. with this patch :- nvme (nvme-6.13) # ls /sys/devices/virtual/workqueue/nvmet-wq/ affinity_scope affinity_strict cpumask max_active nice per_cpu power subsystem uevent Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-11-05	mfd: cgbc-core: Fix error handling paths in cgbc_init_device()	Christophe JAILLET
	If an error occurs after a cgbc_session_request() call, it should be balanced by a corresponding cgbc_session_release(), as already done in the remove function. Fixes: 6f1067cfbee7 ("mfd: Add Congatec Board Controller driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Thomas Richard <thomas.richard@bootlin.com> Link: https://lore.kernel.org/r/83194335554146efc52c331993f083bd765db6f9.1730205085.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lee Jones <lee@kernel.org>
2024-11-05	nvme-pci: use dma_alloc_noncontigous if possible	Christoph Hellwig
	Use dma_alloc_noncontigous to allocate a single IOVA-contigous segment when backed by an IOMMU. This allow to easily use bigger segments and avoids running into segment limits if we can avoid it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-11-05	nvme-pci: fix freeing of the HMB descriptor table	Christoph Hellwig
	The HMB descriptor table is sized to the maximum number of descriptors that could be used for a given device, but __nvme_alloc_host_mem could break out of the loop earlier on memory allocation failure and end up using less descriptors than planned for, which leads to an incorrect size passed to dma_free_coherent. In practice this was not showing up because the number of descriptors tends to be low and the dma coherent allocator always allocates and frees at least a page. Fixes: 87ad72a59a38 ("nvme-pci: implement host memory buffer support") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-11-05	drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read()	Alex Deucher
	Avoid a possible buffer overflow if size is larger than 4K. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5d873f5825b40d886d03bd2aede91d4cf002434) Cc: stable@vger.kernel.org
2024-11-05	drm/amdgpu: Adjust debugfs eviction and IB access permissions	Alex Deucher
	Users should not be able to run these. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ba9395430f611cfc101b1c2687732baafa239d5) Cc: stable@vger.kernel.org
2024-11-05	drm/amdgpu: Adjust debugfs register access permissions	Alex Deucher
	Regular users shouldn't have read access. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c0cfd2e652553d607b910be47d0cc5a7f3a78641) Cc: stable@vger.kernel.org
2024-11-05	drm/amdgpu: Fix DPX valid mode check on GC 9.4.3	Lijo Lazar
	For DPX mode, the number of memory partitions supported should be less than or equal to 2. Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 990c4f580742de7bb78fa57420ffd182fc3ab4cd) Cc: stable@vger.kernel.org
2024-11-05	clk: sunxi-ng: Use of_property_present() for non-boolean properties	Rob Herring (Arm)
	The use of of_property_read_bool() for non-boolean properties is deprecated in favor of of_property_present() when testing for property presence. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20241104190455.272527-1-robh@kernel.org Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-11-05	drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read()	Alex Deucher
	Avoid a possible buffer overflow if size is larger than 4K. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu: Adjust debugfs eviction and IB access permissions	Alex Deucher
	Users should not be able to run these. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu: Adjust debugfs register access permissions	Alex Deucher
	Regular users shouldn't have read access. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu: stop syncing PRT map operations	Christian König
	Requested by both Bas and Friedrich. Mapping PTEs as PRT doesn't need to sync for anything. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu: set the right AMDGPU sg segment limitation	Prike Liang
	The driver needs to set the correct max_segment_size; otherwise debug_dma_map_sg() will complain about the over-mapping of the AMDGPU sg length as following: WARNING: CPU: 6 PID: 1964 at kernel/dma/debug.c:1178 debug_dma_map_sg+0x2dc/0x370 [ 364.049444] Modules linked in: veth amdgpu(OE) amdxcp drm_exec gpu_sched drm_buddy drm_ttm_helper ttm(OE) drm_suballoc_helper drm_display_helper drm_kms_helper i2c_algo_bit rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter br_netfilter nvme_fabrics overlay nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge stp llc amd_atl intel_rapl_msr intel_rapl_common sunrpc sch_fq_codel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd binfmt_misc snd_hda_codec snd_pci_acp6x snd_hda_core snd_acp_config snd_hwdep snd_soc_acpi kvm_amd snd_pcm kvm snd_seq_midi snd_seq_midi_event crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_rawmidi sha256_ssse3 sha1_ssse3 aesni_intel snd_seq nls_iso8859_1 crypto_simd snd_seq_device cryptd snd_timer rapl input_leds snd [ 364.049532] ipmi_devintf wmi_bmof ccp serio_raw k10temp sp5100_tco soundcore ipmi_msghandler cm32181 industrialio mac_hid msr parport_pc ppdev lp parport drm efi_pstore ip_tables x_tables pci_stub crc32_pclmul nvme ahci libahci i2c_piix4 r8169 nvme_core i2c_designware_pci realtek i2c_ccgx_ucsi video wmi hid_generic cdc_ether usbnet usbhid hid r8152 mii [ 364.049576] CPU: 6 PID: 1964 Comm: rocminfo Tainted: G OE 6.10.0-custom #492 [ 364.049579] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 364.049582] RIP: 0010:debug_dma_map_sg+0x2dc/0x370 [ 364.049585] Code: 89 4d b8 e8 36 b1 86 00 8b 4d b8 48 8b 55 b0 44 8b 45 a8 4c 8b 4d a0 48 89 c6 48 c7 c7 00 4b 74 bc 4c 89 4d b8 e8 b4 73 f3 ff <0f> 0b 4c 8b 4d b8 8b 15 c8 2c b8 01 85 d2 0f 85 ee fd ff ff 8b 05 [ 364.049588] RSP: 0018:ffff9ca600b57ac0 EFLAGS: 00010286 [ 364.049590] RAX: 0000000000000000 RBX: ffff88b7c132b0c8 RCX: 0000000000000027 [ 364.049592] RDX: ffff88bb0f521688 RSI: 0000000000000001 RDI: ffff88bb0f521680 [ 364.049594] RBP: ffff9ca600b57b20 R08: 000000000000006f R09: ffff9ca600b57930 [ 364.049596] R10: ffff9ca600b57928 R11: ffffffffbcb46328 R12: 0000000000000000 [ 364.049597] R13: 0000000000000001 R14: ffff88b7c19c0700 R15: ffff88b7c9059800 [ 364.049599] FS: 00007fb2d3516e80(0000) GS:ffff88bb0f500000(0000) knlGS:0000000000000000 [ 364.049601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 364.049603] CR2: 000055610bd03598 CR3: 00000001049f6000 CR4: 0000000000350ef0 [ 364.049605] Call Trace: [ 364.049607] <TASK> [ 364.049609] ? show_regs+0x6d/0x80 [ 364.049614] ? __warn+0x8c/0x140 [ 364.049618] ? debug_dma_map_sg+0x2dc/0x370 [ 364.049621] ? report_bug+0x193/0x1a0 [ 364.049627] ? handle_bug+0x46/0x80 [ 364.049631] ? exc_invalid_op+0x1d/0x80 [ 364.049635] ? asm_exc_invalid_op+0x1f/0x30 [ 364.049642] ? debug_dma_map_sg+0x2dc/0x370 [ 364.049647] __dma_map_sg_attrs+0x90/0xe0 [ 364.049651] dma_map_sgtable+0x25/0x40 [ 364.049654] amdgpu_bo_move+0x59a/0x850 [amdgpu] [ 364.049935] ? srso_return_thunk+0x5/0x5f [ 364.049939] ? amdgpu_ttm_tt_populate+0x5d/0xc0 [amdgpu] [ 364.050095] ttm_bo_handle_move_mem+0xc3/0x180 [ttm] [ 364.050103] ttm_bo_validate+0xc1/0x160 [ttm] [ 364.050108] ? amdgpu_ttm_tt_get_user_pages+0xe5/0x1b0 [amdgpu] [ 364.050263] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0xa12/0xc90 [amdgpu] [ 364.050473] kfd_ioctl_alloc_memory_of_gpu+0x16b/0x3b0 [amdgpu] [ 364.050680] kfd_ioctl+0x3c2/0x530 [amdgpu] [ 364.050866] ? __pfx_kfd_ioctl_alloc_memory_of_gpu+0x10/0x10 [amdgpu] [ 364.051054] ? srso_return_thunk+0x5/0x5f [ 364.051057] ? tomoyo_file_ioctl+0x20/0x30 [ 364.051063] __x64_sys_ioctl+0x9c/0xd0 [ 364.051068] x64_sys_call+0x1219/0x20d0 [ 364.051073] do_syscall_64+0x51/0x120 [ 364.051077] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 364.051081] RIP: 0033:0x7fb2d2f1a94f Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu: Fix DPX valid mode check on GC 9.4.3	Lijo Lazar
	For DPX mode, the number of memory partitions supported should be less than or equal to 2. Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu/gfx11: Add cleaner shader for GFX11.0.3	Srinivasan Shanmugam
	This commit adds the cleaner shader microcode for GFX11.0.3 GPUs. The cleaner shader is a piece of GPU code that is used to clear or initialize certain GPU resources, such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Clearing these resources is important for ensuring data isolation between different workloads running on the GPU. Without the cleaner shader, residual data from a previous workload could potentially be accessed by a subsequent workload, leading to data leaks and incorrect computation results. The cleaner shader microcode is represented as an array of 32-bit words (`gfx_11_0_3_cleaner_shader_hex`). This array is the binary representation of the cleaner shader code, which is written in a low-level GPU instruction set. When the cleaner shader feature is enabled, the AMDGPU driver loads this array into a specific location in the GPU memory. The GPU then reads this memory location to fetch and execute the cleaner shader instructions. The cleaner shader is executed automatically by the GPU at the end of each workload, before the next workload starts. This ensures that all GPU resources are in a clean state before the start of each workload. This addition is part of the cleaner shader feature implementation. The cleaner shader feature helps resource utilization by cleaning up GPU resources after they are used. It also enhances security and reliability by preventing data leaks between workloads. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amd/pm: add zero RPM stop temperature OD setting support for SMU13	Wolfgang Müller
	Together with the feature to enable or disable zero RPM in the last commit, it also makes sense to expose the OD setting determining under which temperature the fan should stop if zero RPM is enabled. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amdgpu/mes: fetch fw version from firmware header	Alex Deucher
	We need this prior to the firmware being loaded so fetch from the header. v2: fetch directly from the firmware v3: store both fw versions Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	drm/amd/pm: add zero RPM OD setting support for SMU13	Wolfgang Müller
	Whilst we have support for setting fan curves there is no support for disabling the zero RPM feature. Since the relevant bits are already present in the OverDriveTable, hook them up to a sysctl setting so users can influence this behaviour. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3489 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-05	net: ethernet: ti: am65-cpsw: fix warning in am65_cpsw_nuss_remove_rx_chns()	Roger Quadros
	flow->irq is initialized to 0 which is a valid IRQ. Set it to -EINVAL in error path of am65_cpsw_nuss_init_rx_chns() so we do not try to free an unallocated IRQ in am65_cpsw_nuss_remove_rx_chns(). If user tried to change number of RX queues and am65_cpsw_nuss_init_rx_chns() failed due to any reason, the warning will happen if user tries to change the number of RX queues after the error condition. root@am62xx-evm:~# ethtool -L eth0 rx 3 [ 40.385293] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19 [ 40.393211] am65-cpsw-nuss 8000000.ethernet: Failed to init rx flow2 netlink error: Invalid argument root@am62xx-evm:~# ethtool -L eth0 rx 2 [ 82.306427] ------------[ cut here ]------------ [ 82.311075] WARNING: CPU: 0 PID: 378 at kernel/irq/devres.c:144 devm_free_irq+0x84/0x90 [ 82.469770] Call trace: [ 82.472208] devm_free_irq+0x84/0x90 [ 82.475777] am65_cpsw_nuss_remove_rx_chns+0x6c/0xac [ti_am65_cpsw_nuss] [ 82.482487] am65_cpsw_nuss_update_tx_rx_chns+0x2c/0x9c [ti_am65_cpsw_nuss] [ 82.489442] am65_cpsw_set_channels+0x30/0x4c [ti_am65_cpsw_nuss] [ 82.495531] ethnl_set_channels+0x224/0x2dc [ 82.499713] ethnl_default_set_doit+0xb8/0x1b8 [ 82.504149] genl_family_rcv_msg_doit+0xc0/0x124 [ 82.508757] genl_rcv_msg+0x1f0/0x284 [ 82.512409] netlink_rcv_skb+0x58/0x130 [ 82.516239] genl_rcv+0x38/0x50 [ 82.519374] netlink_unicast+0x1d0/0x2b0 [ 82.523289] netlink_sendmsg+0x180/0x3c4 [ 82.527205] __sys_sendto+0xe4/0x158 [ 82.530779] __arm64_sys_sendto+0x28/0x38 [ 82.534782] invoke_syscall+0x44/0x100 [ 82.538526] el0_svc_common.constprop.0+0xc0/0xe0 [ 82.543221] do_el0_svc+0x1c/0x28 [ 82.546528] el0_svc+0x28/0x98 [ 82.549578] el0t_64_sync_handler+0xc0/0xc4 [ 82.553752] el0t_64_sync+0x190/0x194 [ 82.557407] ---[ end trace 0000000000000000 ]--- Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx") Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05	net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7	Roger Quadros
	On J7 platforms, setting up multiple RX flows was failing as the RX free descriptor ring 0 is shared among all flows and we did not allocate enough elements in the RX free descriptor ring 0 to accommodate for all RX flows. This issue is not present on AM62 as separate pair of rings are used for free and completion rings for each flow. Fix this by allocating enough elements for RX free descriptor ring 0. However, we can no longer rely on desc_idx (descriptor based offsets) to identify the pages in the respective flows as free descriptor ring includes elements for all flows. To solve this, introduce a new swdata data structure to store flow_id and page. This can be used to identify which flow (page_pool) and page the descriptor belonged to when popped out of the RX rings. Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx") Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05	thunderbolt: Fix connection issue with Pluggable UD-4VPD dock	Mika Westerberg
	Rick reported that his Pluggable USB4 dock does not work anymore after upgrading to v6.10 kernel. It looks like commit c6ca1ac9f472 ("thunderbolt: Increase sideband access polling delay") makes the device router enumeration happen later than what might be expected by the dock (although there is no such limit in the USB4 spec) which probably makes it assume there is something wrong with the high-speed link and reset it. After the link is reset the same issue happens again and again. For this reason lower the sideband access delay from 5ms to 1ms. This seems to work fine according to Rick's testing. Reported-by: Rick Lahaye <rick@581238.xyz> Closes: https://lore.kernel.org/linux-usb/000f01db247b$d10e1520$732a3f60$@581238.xyz/ Tested-by: Rick Lahaye <rick@581238.xyz> Fixes: c6ca1ac9f472 ("thunderbolt: Increase sideband access polling delay") Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2024-11-05	i2c: muxes: Fix return value check in mule_i2c_mux_probe()	Yang Yingliang
	If dev_get_regmap() fails, it returns NULL pointer not ERR_PTR(), replace IS_ERR() with NULL pointer check, and return -ENODEV. Fixes: d0f8e97866bf ("i2c: muxes: add support for tsd,mule-i2c multiplexer") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-11-05	drm/ci: remove update-xfails.py	Vignesh Raman
	We can remove the xfails/update-xfails.py script as it is not used in CI jobs. Once ci-collate [1] is tested for drm-ci, we can use this tool directly to update fails and flakes. [1] https://gitlab.freedesktop.org/gfx-ci/ci-collate/ Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com> Reviewed-by: WangYuli <wangyuli@uniontech.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030091732.665428-1-vignesh.raman@collabora.com
2024-11-05	serial: 8250: omap: Move pm_runtime_get_sync	Bin Liu
	Currently in omap_8250_shutdown, the dma->rx_running flag is set to zero in omap_8250_rx_dma_flush. Next pm_runtime_get_sync is called, which is a runtime resume call stack which can re-set the flag. When the call omap_8250_shutdown returns, the flag is expected to be UN-SET, but this is not the case. This is causing issues the next time UART is re-opened and omap_8250_rx_dma is called. Fix by moving pm_runtime_get_sync before the omap_8250_rx_dma_flush. cc: stable@vger.kernel.org Fixes: 0e31c8d173ab ("tty: serial: 8250_omap: add custom DMA-RX callback") Signed-off-by: Bin Liu <b-liu@ti.com> [Judith: Add commit message] Signed-off-by: Judith Mendez <jm@ti.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Link: https://lore.kernel.org/r/20241031172315.453750-1-jm@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	drivers: core: fw_devlink: Make the error message a bit more useful	Saravana Kannan
	It would make it easier to debugs issues similar to the ones reported[1][2] recently where some devices didn't have the fwnode set. [1] - https://lore.kernel.org/all/7b995947-4540-4b17-872e-e107adca4598@notapiano/ [2] - https://lore.kernel.org/all/20240910130019.35081-1-jonathanh@nvidia.com/ Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20241024061347.1771063-4-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	phy: tegra: xusb: Set fwnode for xusb port devices	Saravana Kannan
	fwnode needs to be set for a device for fw_devlink to be able to track/enforce its dependencies correctly. Without this, you'll see error messages like this when the supplier has probed and tries to make sure all its fwnode consumers are linked to it using device links: tegra-xusb-padctl 3520000.padctl: Failed to create device link (0x180) with 1-0008 Reported-by: Jon Hunter <jonathanh@nvidia.com> Closes: https://lore.kernel.org/all/20240910130019.35081-1-jonathanh@nvidia.com/ Tested-by: Jon Hunter <jonathanh@nvidia.com> Suggested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20241024061347.1771063-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	drm: display: Set fwnode for aux bus devices	Saravana Kannan
	fwnode needs to be set for a device for fw_devlink to be able to track/enforce its dependencies correctly. Without this, you'll see error messages like this when the supplier has probed and tries to make sure all its fwnode consumers are linked to it using device links: mediatek-drm-dp 1c500000.edp-tx: Failed to create device link (0x180) with backlight-lcd0 Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Closes: https://lore.kernel.org/all/7b995947-4540-4b17-872e-e107adca4598@notapiano/ Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Thierry Reding <treding@nvidia.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20241024061347.1771063-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	driver core: fw_devlink: Stop trying to optimize cycle detection logic	Saravana Kannan
	In attempting to optimize fw_devlink runtime, I introduced numerous cycle detection bugs by foregoing cycle detection logic under specific conditions. Each fix has further narrowed the conditions for optimization. It's time to give up on these optimization attempts and just run the cycle detection logic every time fw_devlink tries to create a device link. The specific bug report that triggered this fix involved a supplier fwnode that never gets a device created for it. Instead, the supplier fwnode is represented by the device that corresponds to an ancestor fwnode. In this case, fw_devlink didn't do any cycle detection because the cycle detection logic is only run when a device link is created between the devices that correspond to the actual consumer and supplier fwnodes. With this change, fw_devlink will run cycle detection logic even when creating SYNC_STATE_ONLY proxy device links from a device that is an ancestor of a consumer fwnode. Reported-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Closes: https://lore.kernel.org/all/1a1ab663-d068-40fb-8c94-f0715403d276@ideasonboard.com/ Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles") Cc: stable <stable@kernel.org> Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20241030171009.1853340-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	nvmem: core: Check read_only flag for force_ro in bin_attr_nvmem_write()	Marek Vasut
	The bin_attr_nvmem_write() must check the read_only flag and block writes on read-only devices, now that a nvmem device can be switched between read-write and read-only mode at runtime using the force_ro attribute. Add the missing check. Fixes: 9d7eb234ac7a ("nvmem: core: Implement force_ro sysfs attribute") Cc: Stable@vger.kernel.org Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20241030140253.40445-2-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	comedi: Flush partial mappings in error case	Jann Horn
	If some remap_pfn_range() calls succeeded before one failed, we still have buffer pages mapped into the userspace page tables when we drop the buffer reference with comedi_buf_map_put(bm). The userspace mappings are only cleaned up later in the mmap error path. Fix it by explicitly flushing all mappings in our VMA on the error path. See commit 79a61cc3fc04 ("mm: avoid leaving partial pfn mappings around in error case"). Cc: stable@vger.kernel.org Fixes: ed9eccbe8970 ("Staging: add comedi core") Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20241017-comedi-tlb-v3-1-16b82f9372ce@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	driver core: Constify attribute arguments of binary attributes	Thomas Weißschuh
	As preparation for the constification of struct bin_attribute, constify the arguments of the read and write callbacks. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-10-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	sysfs: treewide: constify attribute callback of bin_attribute::llseek()	Thomas Weißschuh
	The llseek() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-7-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	sysfs: treewide: constify attribute callback of bin_attribute::mmap()	Thomas Weißschuh
	The mmap() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-6-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	sysfs: treewide: constify attribute callback of bin_is_visible()	Thomas Weißschuh
	The is_bin_visible() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-5-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	nvmem: core: calculate bin_attribute size through bin_size()	Thomas Weißschuh
	Stop abusing the is_bin_visible() callback to calculate the attribute size. Instead use the new, dedicated bin_size() one. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-4-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	PCI/sysfs: Calculate bin_attribute size through bin_size()	Thomas Weißschuh
	Stop abusing the is_bin_visible() callback to calculate the attribute size. Instead use the new, dedicated bin_size() one. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-3-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	usb: typec: fix potential out of bounds in ucsi_ccg_update_set_new_cam_cmd()	Dan Carpenter
	The "*cmd" variable can be controlled by the user via debugfs. That means "new_cam" can be as high as 255 while the size of the uc->updated[] array is UCSI_MAX_ALTMODES (30). The call tree is: ucsi_cmd() // val comes from simple_attr_write_xsigned() -> ucsi_send_command() -> ucsi_send_command_common() -> ucsi_run_command() // calls ucsi->ops->sync_control() -> ucsi_ccg_sync_control() Fixes: 170a6726d0e2 ("usb: typec: ucsi: add support for separate DP altmode devices") Cc: stable <stable@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/325102b3-eaa8-4918-a947-22aca1146586@stanley.mountain Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	usb: dwc3: fix fault at system suspend if device was already runtime suspended	Roger Quadros
	If the device was already runtime suspended then during system suspend we cannot access the device registers else it will crash. Also we cannot access any registers after dwc3_core_exit() on some platforms so move the dwc3_enable_susphy() call to the top. Cc: stable@vger.kernel.org # v5.15+ Reported-by: William McVicker <willmcvicker@google.com> Closes: https://lore.kernel.org/all/ZyVfcUuPq56R2m1Y@google.com Fixes: 705e3ce37bcc ("usb: dwc3: core: Fix system suspend on TI AM62 platforms") Signed-off-by: Roger Quadros <rogerq@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Tested-by: Will McVicker <willmcvicker@google.com> Link: https://lore.kernel.org/r/20241104-am62-lpm-usb-fix-v1-1-e93df73a4f0d@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	usb: typec: qcom-pmic: init value of hdr_len/txbuf_len earlier	Rex Nie
	If the read of USB_PDPHY_RX_ACKNOWLEDGE_REG failed, then hdr_len and txbuf_len are uninitialized. This commit stops to print uninitialized value and misleading/false data. Cc: stable@vger.kernel.org Fixes: a4422ff22142 (" usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Signed-off-by: Rex Nie <rex.nie@jaguarmicro.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Link: https://lore.kernel.org/r/20241030133632.2116-1-rex.nie@jaguarmicro.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	net: hns3: fix kernel crash when uninstalling driver	Peiyang Wang
	When the driver is uninstalled and the VF is disabled concurrently, a kernel crash occurs. The reason is that the two actions call function pci_disable_sriov(). The num_VFs is checked to determine whether to release the corresponding resources. During the second calling, num_VFs is not 0 and the resource release function is called. However, the corresponding resource has been released during the first invoking. Therefore, the problem occurs: [15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... [15278.131557][T50670] Call trace: [15278.134686][T50670] klist_put+0x28/0x12c [15278.138682][T50670] klist_del+0x14/0x20 [15278.142592][T50670] device_del+0xbc/0x3c0 [15278.146676][T50670] pci_remove_bus_device+0x84/0x120 [15278.151714][T50670] pci_stop_and_remove_bus_device+0x6c/0x80 [15278.157447][T50670] pci_iov_remove_virtfn+0xb4/0x12c [15278.162485][T50670] sriov_disable+0x50/0x11c [15278.166829][T50670] pci_disable_sriov+0x24/0x30 [15278.171433][T50670] hnae3_unregister_ae_algo_prepare+0x60/0x90 [hnae3] [15278.178039][T50670] hclge_exit+0x28/0xd0 [hclge] [15278.182730][T50670] __se_sys_delete_module.isra.0+0x164/0x230 [15278.188550][T50670] __arm64_sys_delete_module+0x1c/0x30 [15278.193848][T50670] invoke_syscall+0x50/0x11c [15278.198278][T50670] el0_svc_common.constprop.0+0x158/0x164 [15278.203837][T50670] do_el0_svc+0x34/0xcc [15278.207834][T50670] el0_svc+0x20/0x30 For details, see the following figure. rmmod hclge disable VFs ---------------------------------------------------- hclge_exit() sriov_numvfs_store() ... device_lock() pci_disable_sriov() hns3_pci_sriov_configure() pci_disable_sriov() sriov_disable() sriov_disable() if !num_VFs : if !num_VFs : return; return; sriov_del_vfs() sriov_del_vfs() ... ... klist_put() klist_put() ... ... num_VFs = 0; num_VFs = 0; device_unlock(); In this patch, when driver is removing, we get the device_lock() to protect num_VFs, just like sriov_numvfs_store(). Fixes: 0dd8a25f355b ("net: hns3: disable sriov before unload hclge layer") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241101091507.3644584-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05	iommu/vt-d: Drain PRQs when domain removed from RID	Lu Baolu
	As this iommu driver now supports page faults for requests without PASID, page requests should be drained when a domain is removed from the RID2PASID entry. This results in the intel_iommu_drain_pasid_prq() call being moved to intel_pasid_tear_down_entry(). This indicates that when a translation is removed from any PASID entry and the PRI has been enabled on the device, page requests are drained in the domain detachment path. The intel_iommu_drain_pasid_prq() helper has been modified to support sending device TLB invalidation requests for both PASID and non-PASID cases. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20241101045543.70086-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Drop pasid requirement for prq initialization	Klaus Jensen
	PASID support within the IOMMU is not required to enable the Page Request Queue, only the PRS capability. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joel Granados <joel.granados@kernel.org> Link: https://lore.kernel.org/r/20241015-jag-iopfv8-v4-5-b696ca89ba29@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommufd: Enable PRI when doing the iommufd_hwpt_alloc	Joel Granados
	Add IOMMU_HWPT_FAULT_ID_VALID as part of the valid flags when doing an iommufd_hwpt_alloc allowing the use of an iommu fault allocation (iommu_fault_alloc) with the IOMMU_HWPT_ALLOC ioctl. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joel Granados <joel.granados@kernel.org> Link: https://lore.kernel.org/r/20241015-jag-iopfv8-v4-4-b696ca89ba29@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Move IOMMU_IOPF into INTEL_IOMMU	Joel Granados
	Move IOMMU_IOPF from under INTEL_IOMMU_SVM into INTEL_IOMMU. This certifies that the core intel iommu utilizes the IOPF library functions, independent of the INTEL_IOMMU_SVM config. Signed-off-by: Joel Granados <joel.granados@kernel.org> Link: https://lore.kernel.org/r/20241015-jag-iopfv8-v4-3-b696ca89ba29@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Remove the pasid present check in prq_event_thread	Klaus Jensen
	PASID is not strictly needed when handling a PRQ event; remove the check for the pasid present bit in the request. This change was not included in the creation of prq.c to emphasize the change in capability checks when handing PRQ events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joel Granados <joel.granados@kernel.org> Link: https://lore.kernel.org/r/20241015-jag-iopfv8-v4-2-b696ca89ba29@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Separate page request queue from SVM	Joel Granados
	IO page faults are no longer dependent on CONFIG_INTEL_IOMMU_SVM. Move all Page Request Queue (PRQ) functions that handle prq events to a new file in drivers/iommu/intel/prq.c. The page_req_des struct is now declared in drivers/iommu/intel/prq.c. No functional changes are intended. This is a preparation patch to enable the use of IO page faults outside the SVM/PASID use cases. Signed-off-by: Joel Granados <joel.granados@kernel.org> Link: https://lore.kernel.org/r/20241015-jag-iopfv8-v4-1-b696ca89ba29@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Fix checks and print in pgtable_walk()	Zhenzhong Duan
	There are some issues in pgtable_walk(): 1. Super page is dumped as non-present page 2. dma_pte_superpage() should not check against leaf page table entries 3. Pointer pte is never NULL so checking it is meaningless 4. When an entry is not present, it still makes sense to dump the entry content. Fix 1,2 by checking dma_pte_superpage()'s returned value after level check. Fix 3 by removing pte check. Fix 4 by checking present bit after printing. By this chance, change to print "page table not present" instead of "PTE not present" to be clearer. Fixes: 914ff7719e8a ("iommu/vt-d: Dump DMAR translation structure when DMA fault occurs") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/r/20241024092146.715063-3-zhenzhong.duan@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-11-05	iommu/vt-d: Fix checks and print in dmar_fault_dump_ptes()	Zhenzhong Duan
	There are some issues in dmar_fault_dump_ptes(): 1. return value of phys_to_virt() is used for checking if an entry is present. 2. dump is confusing, e.g., "pasid table entry is not present", confusing by unpresent pasid table vs. unpresent pasid table entry. Current code means the former. 3. pgtable_walk() is called without checking if page table is present. Fix 1 by checking present bit of an entry before dump a lower level entry. Fix 2 by removing "entry" string, e.g., "pasid table is not present". Fix 3 by checking page table present before walk. Take issue 3 for example, before fix: [ 442.240357] DMAR: pasid dir entry: 0x000000012c83e001 [ 442.246661] DMAR: pasid table entry[0]: 0x0000000000000000 [ 442.253429] DMAR: pasid table entry[1]: 0x0000000000000000 [ 442.260203] DMAR: pasid table entry[2]: 0x0000000000000000 [ 442.266969] DMAR: pasid table entry[3]: 0x0000000000000000 [ 442.273733] DMAR: pasid table entry[4]: 0x0000000000000000 [ 442.280479] DMAR: pasid table entry[5]: 0x0000000000000000 [ 442.287234] DMAR: pasid table entry[6]: 0x0000000000000000 [ 442.293989] DMAR: pasid table entry[7]: 0x0000000000000000 [ 442.300742] DMAR: PTE not present at level 2 After fix: ... [ 357.241214] DMAR: pasid table entry[6]: 0x0000000000000000 [ 357.248022] DMAR: pasid table entry[7]: 0x0000000000000000 [ 357.254824] DMAR: scalable mode page table is not present Fixes: 914ff7719e8a ("iommu/vt-d: Dump DMAR translation structure when DMA fault occurs") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/r/20241024092146.715063-2-zhenzhong.duan@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>