git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-08-23	platform/x86/amd/pmc: Fix SMU command submission path on new AMD platform	Shyam Sundar S K
	The commit 426463d94d45 ("platform/x86/amd/pmc: Send OS_HINT command for new AMD platform") was introduced to enable sending mailbox commands to PMFW on newer platforms. However, it was later discovered that the commit did not configure the correct message port ID (i.e., S2D or PMC). Without this configuration, all command submissions to PMFW are treated as invalid, leading to command failures. To address this issue, the CPU ID association for the new platform needs to be added in amd_pmc_get_ip_info(). This ensures that the correct SMU port IDs are selected. Fixes: 426463d94d45 ("platform/x86/amd/pmc: Send OS_HINT command for new AMD platform") Co-developed-by: Sanket Goswami <Sanket.Goswami@amd.com> Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20240822095357.395808-1-Shyam-sundar.S-k@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-23	net: sfp: Add helper to return the SFP bus name	Maxime Chevallier
	Knowing the bus name is helpful when we want to expose the link topology to userspace, add a helper to return the SFP bus name. This call will always be made while holding the RTNL which ensures that the SFP driver won't unbind from the device. The returned pointer to the bus name will only be used while RTNL is held. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23	net: phy: add helpers to handle sfp phy connect/disconnect	Maxime Chevallier
	There are a few PHY drivers that can handle SFP modules through their sfp_upstream_ops. Introduce Phylib helpers to keep track of connected SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the upstream PHY's netdev's namespace. By doing so, these SFP PHYs can be enumerated and exposed to users, which will be able to use their capabilities. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23	net: sfp: pass the phy_device when disconnecting an sfp module's PHY	Maxime Chevallier
	Pass the phy_device as a parameter to the sfp upstream .disconnect_phy operation. This is preparatory work to help track phy devices across a net_device's link. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23	net: phy: Introduce ethernet link topology representation	Maxime Chevallier
	Link topologies containing multiple network PHYs attached to the same net_device can be found when using a PHY as a media converter for use with an SFP connector, on which an SFP transceiver containing a PHY can be used. With the current model, the transceiver's PHY can't be used for operations such as cable testing, timestamping, macsec offload, etc. The reason being that most of the logic for these configuration, coming from either ethtool netlink or ioctls tend to use netdev->phydev, which in multi-phy systems will reference the PHY closest to the MAC. Introduce a numbering scheme allowing to enumerate PHY devices that belong to any netdev, which can in turn allow userspace to take more precise decisions with regard to each PHY's configuration. The numbering is maintained per-netdev, in a phy_device_list. The numbering works similarly to a netdevice's ifindex, with identifiers that are only recycled once INT_MAX has been reached. This prevents races that could occur between PHY listing and SFP transceiver removal/insertion. The identifiers are assigned at phy_attach time, as the numbering depends on the netdevice the phy is attached to. The PHY index can be re-used for PHYs that are persistent. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23	irqchip/irq-msi-lib: Check for NULL ops in msi_lib_irq_domain_select()	Maxime Chevallier
	The irq_domain passed to msi_lib_irq_domain_select() may not have msi_parent_ops set. There is a NULL pointer check for it, but unfortunately there is a dereference of the parent ops pointer before that. Move the NULL pointer test before the first use of that pointer. This was found on a MacchiatoBin (Marvell Armada 8K SoC), which uses the irq-mvebu-sei driver. Fixes: 72e257c6f058 ("irqchip: Provide irq-msi-lib") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240823100733.1900666-1-maxime.chevallier@bootlin.com Closes: https://lore.kernel.org/all/20240821165034.1af97bad@fedora-3.home/
2024-08-23	irqchip/gic-v3: Init SRE before poking sysregs	Mark Rutland
	The GICv3 driver pokes GICv3 system registers in gic_prio_init() before gic_cpu_sys_reg_init() ensures that GICv3 system registers have been enabled by writing to ICC_SRE_EL1.SRE. On arm64 this is benign as has_useable_gicv3_cpuif() runs earlier during cpufeature detection, and this enables the GICv3 system registers. On 32-bit arm when booting on an FVP using the boot-wrapper, the accesses in gic_prio_init() end up being UNDEFINED and crashes the kernel during boot. This is a regression introduced by the addition of gic_prio_init(). Fix this by factoring out the SRE initialization into a new function and calling it early in the three paths where SRE may not have been initialized: (1) gic_init_bases(), before the primary CPU pokes GICv3 sysregs in gic_prio_init(). (2) gic_starting_cpu(), before secondary CPUs initialize GICv3 sysregs in gic_cpu_init(). (3) gic_cpu_pm_notifier(), before CPUs re-initialize GICv3 sysregs in gic_cpu_sys_reg_init(). Fixes: d447bf09a4013541 ("irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
2024-08-23	OPP: Fix support for required OPPs for multiple PM domains	Ulf Hansson
	It has turned out that having _set_required_opps() to recursively call dev_pm_opp_set_opp() to set the required OPPs, doesn't really work as well as we expected. More precisely, at each recursive call to dev_pm_opp_set_opp() we are changing an OPP for a required_dev that belongs to a required-OPP table. The problem with this, is that we may have several devices sharing the same required-OPP table, which leads to an incorrect behaviour in regards to aggregating the per device votes. To fix the problem for a required-OPP table belonging to a PM domain, which is the only existing usecase for now, let's simply replace the call to dev_pm_opp_set_opp() in _set_required_opps() by a call to _set_opp_level(). Moving forward we may potentially need to add support for other types of required-OPP tables. In this case, the aggregation needs to be thought of. Fixes: e37440e7e2c2 ("OPP: Call dev_pm_opp_set_opp() for required OPPs") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20240822224547.385095-2-ulf.hansson@linaro.org
2024-08-23	Merge tag 'pwrseq-fixes-for-v6.11-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull power sequencing fix from Bartosz Golaszewski: - request the wlan-enable GPIO "as-is" to fix an issue with the wifi module being already powered up before linux boots * tag 'pwrseq-fixes-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: power: sequencing: request the WLAN enable GPIO as-is
2024-08-23	Merge tag 'pmdomain-v6.11-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: - imx: Remove duplicated clocks for scu power domain - imx: Wait for SSAR to complete power-on for i.MX93 power domain * tag 'pmdomain-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: imx: wait SSAR when i.MX93 power domain on pmdomain: imx: scu-pd: Remove duplicated clocks
2024-08-23	iommu: Handle iommu faults for a bad iopf setup	Pranjal Shrivastava
	The iommu_report_device_fault function was updated to return void while assuming that drivers only need to call iommu_report_device_fault() for reporting an iopf. This implementation causes following problems: 1. The drivers rely on the core code to call it's page_reponse, however, when a fault is received and no fault capable domain is attached / iopf_param is NULL, the ops->page_response is NOT called causing the device to stall in case the fault type was PAGE_REQ. 2. The arm_smmu_v3 driver relies on the returned value to log errors returning void from iommu_report_device_fault causes these events to be missed while logging. Modify the iommu_report_device_fault function to return -EINVAL for cases where no fault capable domain is attached or iopf_param was NULL and calls back to the driver (ops->page_response) in case the fault type was IOMMU_FAULT_PAGE_REQ. The returned value can be used by the drivers to log the fault/event as needed. Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Closes: https://lore.kernel.org/all/6147caf0-b9a0-30ca-795e-a1aa502a5c51@huawei.com/ Fixes: 3dfa64aecbaf ("iommu: Make iommu_report_device_fault() return void") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240816104906.1010626-1-praan@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-23	Merge tag 'ata-6.11-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Damien Le Moal: - Fix the max segment size and max number of segments supported by the pata_macio driver to fix issues with BIO splitting leading to an overflow of the adapter DMA table (from Michael) - Related to the previous fix, change BUG_ON() calls for incorrect command buffer segmentation into WARN_ON() and an error return (from Michael) * tag 'ata-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: pata_macio: Use WARN instead of BUG ata: pata_macio: Fix DMA table overflow
2024-08-22	scsi: sd: Ignore command SYNCHRONIZE CACHE error if format in progress	Yihang Li
	If formatting a suspended disk (such as formatting with different DIF type), the disk will be resuming first, and then the format command will submit to the disk through SG_IO ioctl. When the disk is processing the format command, the system does not submit other commands to the disk. Therefore, the system attempts to suspend the disk again and sends the SYNCHRONIZE CACHE command. However, the SYNCHRONIZE CACHE command will fail because the disk is in the formatting process. This will cause the runtime_status of the disk to error and it is difficult for user to recover it. Error info like: [ 669.925325] sd 6:0:6:0: [sdg] Synchronizing SCSI cache [ 670.202371] sd 6:0:6:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK [ 670.216300] sd 6:0:6:0: [sdg] Sense Key : 0x2 [current] [ 670.221860] sd 6:0:6:0: [sdg] ASC=0x4 ASCQ=0x4 To solve the issue, ignore the error and return success/0 when format is in progress. Cc: stable@vger.kernel.org Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20240819090934.2130592-1-liyihang9@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22	scsi: aacraid: Fix double-free on probe failure	Ben Hutchings
	aac_probe_one() calls hardware-specific init functions through the aac_driver_ident::init pointer, all of which eventually call down to aac_init_adapter(). If aac_init_adapter() fails after allocating memory for aac_dev::queues, it frees the memory but does not clear that member. After the hardware-specific init function returns an error, aac_probe_one() goes down an error path that frees the memory pointed to by aac_dev::queues, resulting.in a double-free. Reported-by: Michael Gordon <m.gordon.zelenoborsky@gmail.com> Link: https://bugs.debian.org/1075855 Fixes: 8e0c5ebde82b ("[SCSI] aacraid: Newer adapter communication iterface support") Signed-off-by: Ben Hutchings <benh@debian.org> Link: https://lore.kernel.org/r/ZsZvfqlQMveoL5KQ@decadent.org.uk Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22	scsi: lpfc: Fix overflow build issue	Sherry Yang
	Build failed while enabling "CONFIG_GCOV_KERNEL=y" and "CONFIG_GCOV_PROFILE_ALL=y" with following error: BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_get_cgnbuf_info': BUILDSTDERR: ./include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' accessing 18446744073709551615 bytes at offsets 0 and 0 overlaps 9223372036854775807 bytes at offset -9223372036854775808 [-Werror=restrict] BUILDSTDERR: 114 \| #define __underlying_memcpy __builtin_memcpy BUILDSTDERR: \| ^ BUILDSTDERR: ./include/linux/fortify-string.h:637:9: note: in expansion of macro '__underlying_memcpy' BUILDSTDERR: 637 \| __underlying_##op(p, q, __fortify_size); \ BUILDSTDERR: \| ^~~~~~~~~~~~~ BUILDSTDERR: ./include/linux/fortify-string.h:682:26: note: in expansion of macro '__fortify_memcpy_chk' BUILDSTDERR: 682 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ BUILDSTDERR: \| ^~~~~~~~~~~~~~~~~~~~ BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c:5468:9: note: in expansion of macro 'memcpy' BUILDSTDERR: 5468 \| memcpy(cgn_buff, cp, cinfosz); BUILDSTDERR: \| ^~~~~~ This happens from the commit 06bb7fc0feee ("kbuild: turn on -Wrestrict by default"). Address this issue by using size_t type. Signed-off-by: Sherry Yang <sherry.yang@oracle.com> Link: https://lore.kernel.org/r/20240821065131.1180791-1-sherry.yang@oracle.com Reviewed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.h c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops") f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC") Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22	net: atlantic: Avoid warning about potential string truncation	Simon Horman
	W=1 builds with GCC 14.2.0 warn that: .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=] 278 \| snprintf(tc_string, 8, "TC%d ", tc); \| ^~ .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254] 278 \| snprintf(tc_string, 8, "TC%d ", tc); \| ^~~~~~~ .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8 278 \| snprintf(tc_string, 8, "TC%d ", tc); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8, the range is 0 - 255. Further, on inspecting the code, it seems that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so the range is actually 0 - 8. So, it seems that the condition that GCC flags will not occur. But, nonetheless, it would be nice if it didn't emit the warning. It seems that this can be achieved by changing the format specifier from %d to %u, in which case I believe GCC recognises an upper bound on the range of tc of 0 - 255. After some experimentation I think this is due to the combination of the use of %u and the type of cfg->tcs (u8). Empirically, updating the type of the tc variable to unsigned int has the same effect. As both of these changes seem to make sense in relation to what the code is actually doing - iterating over unsigned values - do both. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-23	Merge tag 'net-6.11-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Current release - regressions: - virtio_net: avoid crash on resume - move netdev_tx_reset_queue() call before RX napi enable Current release - new code bugs: - net/mlx5e: fix page leak and incorrect header release w/ HW GRO Previous releases - regressions: - udp: fix receiving fraglist GSO packets - tcp: prevent refcount underflow due to concurrent execution of tcp_sk_exit_batch() Previous releases - always broken: - ipv6: fix possible UAF when incrementing error counters on output - ip6: tunnel: prevent merging of packets with different L2 - mptcp: pm: fix IDs not being reusable - bonding: fix potential crashes in IPsec offload handling - Bluetooth: HCI: - MGMT: add error handling to pair_device() to avoid a crash - invert LE State quirk to be opt-out rather then opt-in - fix LE quote calculation - drv: dsa: VLAN fixes for Ocelot driver - drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings - drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192 Misc: - netpoll: do not export netpoll_poll_[disable\|enable]() - MAINTAINERS: update the list of networking headers" * tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) s390/iucv: Fix vargs handling in iucv_alloc_device() net: ovs: fix ovs_drop_reasons error net: xilinx: axienet: Fix dangling multicast addresses net: xilinx: axienet: Always disable promiscuous mode MAINTAINERS: Mark JME Network Driver as Odd Fixes MAINTAINERS: Add header files to NETWORKING sections MAINTAINERS: Add limited globs for Networking headers MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS octeontx2-af: Fix CPT AF register offset calculation net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F net: ngbe: Fix phy mode set to external phy netfilter: flowtable: validate vlan header bnxt_en: Fix double DMA unmapping for XDP_REDIRECT ipv6: prevent possible UAF in ip6_xmit() ipv6: fix possible UAF in ip6_finish_output2() ipv6: prevent UAF in ip6_send_skb() netpoll: do not export netpoll_poll_[disable\|enable]() selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path udp: fix receiving fraglist GSO packets ...
2024-08-23	Merge tag 'kbuild-fixes-v6.11-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Eliminate the fdtoverlay command duplication in scripts/Makefile.lib - Fix 'make compile_commands.json' for external modules - Ensure scripts/kconfig/merge_config.sh handles missing newlines - Fix some build errors on macOS * tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: fix typos "prequisites" to "prerequisites" Documentation/llvm: turn make command for ccache into code block kbuild: avoid scripts/kallsyms parsing /dev/null treewide: remove unnecessary <linux/version.h> inclusion scripts: kconfig: merge_config: config files: add a trailing newline Makefile: add $(srctree) to dependency of compile_commands.json target kbuild: clean up code duplication in cmd_fdtoverlay
2024-08-23	Merge tag 'drm-xe-fixes-2024-08-22' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 (Geert) Driver Changes: - Fix opregion leak (Lucas) - Fix OA sysfs entry (Ashutosh) - Fix VM dma-resv lock (Brost) - Fix tile fini sequence (Brost) - Prevent UAF around preempt fence (Auld) - Fix DGFX display suspend/resume (Maarten) - Many Xe/Xe2 critical workarounds (Auld, Ngai-Mint, Bommu, Tejas, Daniele) - Fix devm/drmm issues (Daniele) - Fix missing workqueue destroy in xe_gt_pagefault (Stuart) - Drop HW fence pointer to HW fence ctx (Brost) - Free job before xe_exec_queue_put (Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZsdVe0XI2Pq8C-ON@intel.com
2024-08-23	Merge tag 'drm-misc-fixes-2024-08-22' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: nouveau: - firmware: use dma non-coherent allocator Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240822123907.GA234335@localhost.localdomain
2024-08-23	Merge tag 'drm-intel-fixes-2024-08-22' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix for HDCP timeouts Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZsbPMm6XfzimmZW0@jlahtine-mobl.ger.corp.intel.com
2024-08-23	Merge tag 'amd-drm-fixes-6.11-2024-08-21' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.11-2024-08-21: amdgpu: - GFX10 firmware loading fix - SDMA 5.2 fix - Debugfs parameter validation fix - eGPU hotplug fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821172810.302416-1-alexander.deucher@amd.com
2024-08-23	Merge tag 'drm-msm-fixes-2024-08-19' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.11-rc5 1) Fixes from the virtual plane series, namely - fix the list of formats for QCM2290 since it has no YUV support - minor fix in dpu_plane_atomic_check_pipe() to check only for csc and not csc and scaler while allowing yuv formats - take rotation into account while allocating virtual planes 2) Fix to cleanup FB if dpu_format_populate_layout() fails. This fixes the warning splat during DRM file closure 3) Fix to reset the phy link params before re-starting link training. This fixes the 100% link training failure when someone starts modetest while cable is connected 4) Long pending fix to fix a visual corruption seen for 4k modes. Root-cause was we cannot support 4k@30 with 30bpp with 2 lanes so this is a critical fix to use 24bpp for such cases 5) Fix to move dpu encoder's connector assignment to atomic_enable(). This fixes the NULL ptr crash for cases when there is an atomic_enable() without atomic_modeset() after atomic_disable() . This happens for connectors_changed case of crtc. It fixes a NULL ptr crash reported during hotplug. 6) Fix to simplify DPU's debug macros without which dynamic debug does not work as expected 7) Fix the highest bank bit setting for sc7180 8) adreno: fix error return if missing firmware-name Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvxF2p3-AsjUydmSYrA0Vb+Ea7nh3VtNX0pT0Ae_Me-Kw@mail.gmail.com
2024-08-22	cpufreq/amd-pstate: Use topology_logical_package_id() instead of ↵	Gautham R. Shenoy
	logical_die_id() After the commit 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"), the topolgy_logical_die_id() function returns the logical Core Chiplet Die (CCD) ID instead of the logical socket ID. Since this is currently used to set MSR_AMD_CPPC_ENABLE, which needs to be set on any one of the threads of the socket, it is prudent to use topology_logical_package_id() in place of topology_logical_die_id(). Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf") cc: stable@vger.kernel.org # 6.10 Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Link: https://lore.kernel.org/lkml/20240801124509.3650-1-Dhananjay.Ugwekar@amd.com/ Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22	cpufreq: amd-pstate: Fix uninitialized variable in amd_pstate_cpu_boost_update()	Dan Carpenter
	Smatch complains that "ret" could be uninitialized: drivers/cpufreq/amd-pstate.c:734 amd_pstate_cpu_boost_update() error: uninitialized symbol 'ret'. This seems like it probably is a real issue. Initialize "ret" to zero to be safe. Fixes: c8c68c38b56f ("cpufreq: amd-pstate: initialize core precision boost state") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Perry Yuan <perry.yuan@amd.com> Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/lkml/7ff53543-6c04-48a0-8d99-7dc010b93b3a@stanley.mountain/T/ Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22	nvme: Remove unused field	Nilay Shroff
	The "name" field in struct nvme_ctrl is unsued so removing it. This would help save 12 bytes of space for each nvme_ctrl instance created. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme: move stopping keep-alive into nvme_uninit_ctrl()	Ming Lei
	Commit 4733b65d82bd ("nvme: start keep-alive after admin queue setup") moves starting keep-alive from nvme_start_ctrl() into nvme_init_ctrl_finish(), but don't move stopping keep-alive into nvme_uninit_ctrl(), so keep-alive work can be started and keep pending after failing to start controller, finally use-after-free is triggered if nvme host driver is unloaded. This patch fixes kernel panic when running nvme/004 in case that connection failure is triggered, by moving stopping keep-alive into nvme_uninit_ctrl(). This way is reasonable because keep-alive is now started in nvme_init_ctrl_finish(). Fixes: 3af755a46881 ("nvme: move nvme_stop_keep_alive() back to original position") Cc: Hannes Reinecke <hare@suse.de> Cc: Mark O'Donovan <shiftee@posteo.net> Reported-by: Changhui Zhong <czhong@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-target: do not check authentication status for admin commands twice	Hannes Reinecke
	nvmet_check_ctrl_status() checks the authentication status, so we don't need to do that prior to calling it. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvmet-auth: allow to clear DH-HMAC-CHAP keys	Hannes Reinecke
	As we can set DH-HMAC-CHAP keys, we should also be able to unset them. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-sysfs: add 'tls_keyring' attribute	Hannes Reinecke
	Add a 'tls_keyring' attribute to display the contents of the --keyring option from the connect string. Adding this attribute allows us to recreate the original connect string from sysfs settings. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-sysfs: add 'tls_configured_key' sysfs attribute	Hannes Reinecke
	There is a difference between the negotiated TLS key (which is always present for a TLS encrypted connection) and the configured TLS key (which is specified with the --tls_key command line option). To differentate between these two add a new sysfs attribute 'tls_configured_key' to hold the specified on the command line. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme: split off TLS sysfs attributes into a separate group	Hannes Reinecke
	Split off TLS sysfs attributes into a separate group to improve readability and to keep all TLS related handling in one section. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme: add a newline to the 'tls_key' sysfs attribute	Hannes Reinecke
	Print a newline for easier userspace handling. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-tcp: check for invalidated or revoked key	Hannes Reinecke
	key_lookup() will always return a key, even if that key is revoked or invalidated. So check for invalid keys before continuing. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-tcp: sanitize TLS key handling	Hannes Reinecke
	There is a difference between TLS configured (ie the user has provisioned/requested a key) and TLS enabled (ie the connection is encrypted with TLS). This becomes important for secure concatenation, where the initial authentication is run on an unencrypted connection (ie with TLS configured, but not enabled), and then the queue is reset to run over TLS (ie TLS configured _and_ enabled). So to differentiate between those two states store the generated key in opts->tls_key (as we're using the same TLS key for all queues), the key serial of the resulting TLS handshake in ctrl->tls_pskid (to signal that TLS on the admin queue is enabled), and a simple flag for the queues to indicated that TLS has been enabled. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	nvme-keyring: restrict match length for version '1' identifiers	Hannes Reinecke
	TP8018 introduced a new TLS PSK identifier version (version 1), which appended a PSK hash value to the existing identifier (cf NVMe TCP specification v1.1, section 3.6.1.3 'TLS PSK and PSK Identity Derivation'). An original (version 0) identifier has the form: NVMe0<type><hmac> <hostnqn> <subsysnqn> and a version 1 identifier has the form: NVMe1<type><hmac> <hostnqn> <subsysnqn> <hash> This patch modifies the lookup algorthm to compare only the first part of the identifier (excluding the hash value) to handle both version 0 and version 1 identifiers. And the spec declares 'version 0' identifiers obsolete, so the lookup algorithm is modified to prever v1 identifiers. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22	net: xilinx: axienet: Fix dangling multicast addresses	Sean Anderson
	If a multicast address is removed but there are still some multicast addresses, that address would remain programmed into the frame filter. Fix this by explicitly setting the enable bit for each filter. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22	net: xilinx: axienet: Always disable promiscuous mode	Sean Anderson
	If promiscuous mode is disabled when there are fewer than four multicast addresses, then it will not be reflected in the hardware. Fix this by always clearing the promiscuous mode flag even when we program multicast addresses. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22	firmware: microchip: fix incorrect error report of programming:timeout on ↵	Steve Wilkins
	success After successfully programming the SPI flash with an MFPS auto update image, the error sysfs attribute reports programming:timeout. This is caused by an incorrect check on the return value from wait_for_completion_timeout() in mpfs_auto_update_poll_complete(). Fixes: ec5b0f1193ad ("firmware: microchip: add PolarFire SoC Auto Update support") Signed-off-by: Steve Wilkins <steve.wilkins@raymarine.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-08-22	cpuidle: remove dead code from cpuidle_enter_state()	Dhruva Gole
	Checking for index < 0 is useless because the find_deepest_state() function never really returns a negative value. Since this hasn't been reported in over 9 years it's dead code, so remove it. Signed-off-by: Dhruva Gole <d-gole@ti.com> Link: https://patch.msgid.link/20240821114250.1416421-1-d-gole@ti.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22	thermal: of: Fix OF node leak in of_thermal_zone_find() error paths	Krzysztof Kozlowski
	Terminating for_each_available_child_of_node() loop requires dropping OF node reference, so bailing out on errors misses this. Solve the OF node reference leak with scoped for_each_available_child_of_node_scoped(). Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-3-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22	thermal: of: Fix OF node leak in thermal_of_zone_register()	Krzysztof Kozlowski
	thermal_of_zone_register() calls of_thermal_zone_find() which will iterate over OF nodes with for_each_available_child_of_node() to find matching thermal zone node. When it finds such, it exits the loop and returns the node. Prematurely ending for_each_available_child_of_node() loops requires dropping OF node reference, thus success of of_thermal_zone_find() means that caller must drop the reference. Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-2-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22	thermal: of: Fix OF node leak in thermal_of_trips_init() error path	Krzysztof Kozlowski
	Terminating for_each_child_of_node() loop requires dropping OF node reference, so bailing out after thermal_of_populate_trip() error misses this. Solve the OF node reference leak with scoped for_each_child_of_node_scoped(). Fixes: d0c75fa2c17f ("thermal/of: Initialize trip points separately") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22	nvme_core: scan namespaces asynchronously	Stuart Hayes
	Use async function calls to make namespace scanning happen in parallel. Without the patch, NVME namespaces are scanned serially, so it can take a long time for all of a controller's namespaces to become available, especially with a slower (TCP) interface with large number of namespaces. It is not uncommon to have large numbers (hundreds or thousands) of namespaces on nvme-of with storage servers. The time it took for all namespaces to show up after connecting (via TCP) to a controller with 1002 namespaces was measured on one system: network latency without patch with patch 0 6s 1s 50ms 210s 10s 100ms 417s 18s Measurements taken on another system show the effect of the patch on the time nvme_scan_work() took to complete, when connecting to a linux nvme-of target with varying numbers of namespaces, on a network of 400us. namespaces without patch with patch 1 16ms 14ms 2 24ms 16ms 4 49ms 22ms 8 101ms 33ms 16 207ms 56ms 100 1.4s 0.6s 1000 12.9s 2.0s On the same system, connecting to a local PCIe NVMe drive (a Samsung PM1733) instead of a network target: namespaces without patch with patch 1 13ms 12ms 2 41ms 13ms Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2024-08-22	thermal: imx: Use the .should_bind() thermal zone callback	Rafael J. Wysocki
	Make the imx_thermal driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. In the imx_thermal case, it only needs to return 'true' for the passive trip point and it will match any cooling device passed to it, in analogy with the old-style imx_bind() callback function. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/2485070.jE0xQCEvom@rjwysocki.net
2024-08-22	mlxsw: core_thermal: Use the .should_bind() thermal zone callback	Rafael J. Wysocki
	Make the mlxsw core_thermal driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. It replaces the .bind() and .unbind() thermal zone callbacks (in 3 places) which assumed the same trip points ordering in the driver and in the thermal core (that may not be true any more in the future). The .bind() callbacks used loops over trip point indices to call thermal_zone_bind_cooling_device() for the same cdev (once it had been verified) and all of the trip points, but they passed different 'upper' and 'lower' values to it for each trip. To retain the original functionality, the .should_bind() callbacks need to use the same 'upper' and 'lower' values that would be used by the corresponding .bind() callbacks when they are about to return 'true'. To that end, the 'priv' field of each trip is set during the thermal zone initialization to point to the corresponding 'state' object containing the maximum and minimum cooling states of the cooling device. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/2216931.Icojqenx9y@rjwysocki.net
2024-08-22	platform/x86: acerhdf: Use the .should_bind() thermal zone callback	Rafael J. Wysocki
	Make the acerhdf driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. The previously existing acerhdf_bind() function bound cooling devices to thermal trip point 0 only, so the new callback needs to return 'true' for trip point 0. However, it is straightforward to observe that trip point 0 is an active trip point and the only other trip point in the driver's thermal zone is a critical one, so it is sufficient to return 'true' from that callback if the type of the given trip point is THERMAL_TRIP_ACTIVE. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Peter Kästle <peter@piie.net> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3779411.MHq7AAxBmi@rjwysocki.net
2024-08-22	thermal: core: Unexport thermal_bind_cdev_to_trip() and ↵	Rafael J. Wysocki
	thermal_unbind_cdev_from_trip() Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip() are only called locally in the thermal core now, they can be static, so change their definitions accordingly and drop their headers from the global thermal header file. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3512161.QJadu78ljV@rjwysocki.net
2024-08-22	thermal: ACPI: Use the .should_bind() thermal zone callback	Rafael J. Wysocki
	Make the ACPI thermal zone driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. This replaces the .bind() and .unbind() thermal zone callbacks which allows the code to be simplified quite significantly while providing the same functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/1812827.VLH7GnMWUR@rjwysocki.net