linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-12-04	HID: i2c-hid: Add IDEA5002 to i2c_hid_acpi_blacklist[]	Mario Limonciello
	Users have reported problems with recent Lenovo laptops that contain an IDEA5002 I2C HID device. Reports include fans turning on and running even at idle and spurious wakeups from suspend. Presumably in the Windows ecosystem there is an application that uses the HID device. Maybe that puts it into a lower power state so it doesn't cause spurious events. This device doesn't serve any functional purpose in Linux as nothing interacts with it so blacklist it from being probed. This will prevent the GPIO driver from setting up the GPIO and the spurious interrupts and wake events will not occur. Cc: stable@vger.kernel.org # 6.1 Reported-and-tested-by: Marcus Aram <marcus+oss@oxar.nl> Reported-and-tested-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2812 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-04	pinctrl: amd: Mask non-wake source pins with interrupt enabled at suspend	Mario Limonciello
	If a pin isn't marked as a wake source processing any interrupts is just going to destroy battery life. The APU may wake up from a hardware sleep state to process the interrupt but not return control to the OS. Mask interrupt for all non-wake source pins at suspend. They'll be re-enabled at resume. Reported-and-tested-by: Marcus Aram <marcus+oss@oxar.nl> Reported-and-tested-by: Mark Herbert <mark.herbert42@gmail.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2812 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20231203032431.30277-3-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-12-04	usb: typec: class: fix typec_altmode_put_partner to put plugs	RD Babiera
	When typec_altmode_put_partner is called by a plug altmode upon release, the port altmode the plug belongs to will not remove its reference to the plug. The check to see if the altmode being released evaluates against the released altmode's partner instead of the calling altmode itself, so change adev in typec_altmode_put_partner to properly refer to the altmode being released. typec_altmode_set_partner is not run for port altmodes, so also add a check in typec_altmode_release to prevent typec_altmode_put_partner() calls on port altmode release. Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20231129192349.1773623-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-04	USB: gadget: core: adjust uevent timing on gadget unbind	Roy Luo
	The KOBJ_CHANGE uevent is sent before gadget unbind is actually executed, resulting in inaccurate uevent emitted at incorrect timing (the uevent would have USB_UDC_DRIVER variable set while it would soon be removed). Move the KOBJ_CHANGE uevent to the end of the unbind function so that uevent is sent only after the change has been made. Fixes: 2ccea03a8f7e ("usb: gadget: introduce UDC Class") Cc: stable@vger.kernel.org Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20231128221756.2591158-1-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-04	platform/mellanox: Check devm_hwmon_device_register_with_groups() return value	Kunwu Chan
	devm_hwmon_device_register_with_groups() returns an error pointer upon failure. Check its return value for errors. Compile-tested only. Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver") Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Suggested-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20231201055447.2356001-1-chentao@kylinos.cn [ij: split the change into two] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-04	platform/mellanox: Add null pointer checks for devm_kasprintf()	Kunwu Chan
	devm_kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Compile-tested only. Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver") Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Suggested-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20231201055447.2356001-1-chentao@kylinos.cn [ij: split the change into two] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-04	mlxbf-bootctl: correctly identify secure boot with development keys	David Thompson
	The secure boot state of the BlueField SoC is represented by two bits: 0 = production state 1 = secure boot enabled 2 = non-secure (secure boot disabled) 3 = RMA state There is also a single bit to indicate whether production keys or development keys are being used when secure boot is enabled. This single bit (specified by MLXBF_BOOTCTL_SB_DEV_MASK) only has meaning if secure boot state equals 1 (secure boot enabled). The secure boot states are as follows: - “GA secured” is when secure boot is enabled with official production keys. - “Secured (development)” is when secure boot is enabled with development keys. Without this fix “GA Secured” is displayed on development cards which is misleading. This patch updates the logic in "lifecycle_state_show()" to handle the case where the SoC is configured for secure boot and is using development keys. Fixes: 79e29cb8fbc5c ("platform/mellanox: Add bootctl driver for Mellanox BlueField Soc") Reviewed-by: Khalil Blaiech <kblaiech@nvidia.com> Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/20231130183515.17214-1-davthompson@nvidia.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-12-04	regmap: fix bogus error on regcache_sync success	Matthias Reichl
	Since commit 0ec7731655de ("regmap: Ensure range selector registers are updated after cache sync") opening pcm512x based soundcards fail with EINVAL and dmesg shows sync cache and pm_runtime_get errors: [ 228.794676] pcm512x 1-004c: Failed to sync cache: -22 [ 228.794740] pcm512x 1-004c: ASoC: error at snd_soc_pcm_component_pm_runtime_get on pcm512x.1-004c: -22 This is caused by the cache check result leaking out into the regcache_sync return value. Fix this by making the check local-only, as the comment above the regcache_read call states a non-zero return value means there's nothing to do so the return value should not be altered. Fixes: 0ec7731655de ("regmap: Ensure range selector registers are updated after cache sync") Cc: stable@vger.kernel.org Signed-off-by: Matthias Reichl <hias@horus.com> Link: https://lore.kernel.org/r/20231203222216.96547-1-hias@horus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-04	firmware: arm_scmi: Fix possible frequency truncation when using level ↵	Sudeep Holla
	indexing mode The multiplier is already promoted to unsigned long, however the frequency calculations done when using level indexing mode doesn't use the multiplier computed. It instead hardcodes the multiplier value of 1000 at all the usage sites. Clean that up by assigning the multiplier value of 1000 when using the perf level indexing mode and update the frequency calculations to use the multiplier instead. It should fix the possible frequency truncation for all the values greater than or equal to 4GHz on 64-bit machines. Fixes: 31c7c1397a33 ("firmware: arm_scmi: Add v3.2 perf level indexing mode support") Reported-by: Sibi Sankar <quic_sibis@quicinc.com> Closes: https://lore.kernel.org/all/20231129065748.19871-3-quic_sibis@quicinc.com/ Cc: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20231130204343.503076-2-sudeep.holla@arm.com Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-12-04	firmware: arm_scmi: Fix frequency truncation by promoting multiplier type	Sudeep Holla
	Fix the possible frequency truncation for all values equal to or greater 4GHz on 64bit machines by updating the multiplier 'mult_factor' to 'unsigned long' type. It is also possible that the multiplier itself can be greater than or equal to 2^32. So we need to also fix the equation computing the value of the multiplier. Fixes: a9e3fbfaa0ff ("firmware: arm_scmi: add initial support for performance protocol") Reported-by: Sibi Sankar <quic_sibis@quicinc.com> Closes: https://lore.kernel.org/all/20231129065748.19871-3-quic_sibis@quicinc.com/ Cc: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20231130204343.503076-1-sudeep.holla@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-12-04	r8152: Add RTL8152_INACCESSIBLE to r8153_aldps_en()	Douglas Anderson
	Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set so that they don't delay too long if the device becomes inaccessible. Add the break to the loop in r8153_aldps_en(). Fixes: 4214cc550bf9 ("r8152: check if disabling ALDPS is finished") Reviewed-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-04	r8152: Add RTL8152_INACCESSIBLE to r8153_pre_firmware_1()	Douglas Anderson
	Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set so that they don't delay too long if the device becomes inaccessible. Add the break to the loop in r8153_pre_firmware_1(). Fixes: 9370f2d05a2a ("r8152: support request_firmware for RTL8153") Reviewed-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-04	r8152: Add RTL8152_INACCESSIBLE to r8156b_wait_loading_flash()	Douglas Anderson
	Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set so that they don't delay too long if the device becomes inaccessible. Add the break to the loop in r8156b_wait_loading_flash(). Fixes: 195aae321c82 ("r8152: support new chips") Reviewed-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-04	r8152: Add RTL8152_INACCESSIBLE checks to more loops	Douglas Anderson
	Previous commits added checks for RTL8152_INACCESSIBLE in the loops in the driver. There are still a few more that keep tripping the driver up in error cases and make things take longer than they should. Add those in. All the loops that are part of this commit existed in some form or another since the r8152 driver was first introduced, though RTL8152_INACCESSIBLE was known as RTL8152_UNPLUG before commit 715f67f33af4 ("r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE") Fixes: ac718b69301c ("net/usb: new driver for RTL8152") Reviewed-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-04	r8152: Hold the rtnl_lock for all of reset	Douglas Anderson
	As of commit d9962b0d4202 ("r8152: Block future register access if register access fails") there is a race condition that can happen between the USB device reset thread and napi_enable() (not) getting called during rtl8152_open(). Specifically: * While rtl8152_open() is running we get a register access error that's _not_ -ENODEV and queue up a USB reset. * rtl8152_open() exits before calling napi_enable() due to any reason (including usb_submit_urb() returning an error). In that case: * Since the USB reset is perform in a separate thread asynchronously, it can run at anytime USB device lock is not held - even before rtl8152_open() has exited with an error and caused __dev_open() to clear the __LINK_STATE_START bit. * The rtl8152_pre_reset() will notice that the netif_running() returns true (since __LINK_STATE_START wasn't cleared) so it won't exit early. * rtl8152_pre_reset() will then hang in napi_disable() because napi_enable() was never called. We can fix the race by making sure that the r8152 reset routines don't run at the same time as we're opening the device. Specifically we need the reset routines in their entirety rely on the return value of netif_running(). The only way to reliably depend on that is for them to hold the rntl_lock() mutex for the duration of reset. Grabbing the rntl_lock() mutex for the duration of reset seems like a long time, but reset is not expected to be common and the rtnl_lock() mutex is already held for long durations since the core grabs it around the open/close calls. Fixes: d9962b0d4202 ("r8152: Block future register access if register access fails") Reviewed-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-04	pinctrl: starfive: jh7100: ignore disabled device tree nodes	Nam Cao
	The driver always registers pin configurations in device tree. This can cause some inconvenience to users, as pin configurations in the base device tree cannot be disabled in the device tree overlay, even when the relevant devices are not used. Ignore disabled pin configuration nodes in device tree. Fixes: ec648f6b7686 ("pinctrl: starfive: Add pinctrl driver for StarFive SoCs") Cc: <stable@vger.kernel.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/fe4c15dcc3074412326b8dc296b0cbccf79c49bf.1701422582.git.namcao@linutronix.de Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-12-04	pinctrl: starfive: jh7110: ignore disabled device tree nodes	Nam Cao
	The driver always registers pin configurations in device tree. This can cause some inconvenience to users, as pin configurations in the base device tree cannot be disabled in the device tree overlay, even when the relevant devices are not used. Ignore disabled pin configuration nodes in device tree. Fixes: 447976ab62c5 ("pinctrl: starfive: Add StarFive JH7110 sys controller driver") Cc: <stable@vger.kernel.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/fd8bf044799ae50a6291ae150ef87b4f1923cacb.1701422582.git.namcao@linutronix.de Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-12-04	greybus: gb-beagleplay: Ensure le for values in transport	Ayush Singh
	Ensure that the following values are little-endian: - header->pad (which is used for cport_id) - header->size Fixes: ec558bbfea67 ("greybus: Add BeaglePlay Linux Driver") Reported-by: kernel test robot <yujie.liu@intel.com> Closes: https://lore.kernel.org/r/202311072329.Xogj7hGW-lkp@intel.com/ Signed-off-by: Ayush Singh <ayushdevel1325@gmail.com> Link: https://lore.kernel.org/r/20231203075312.255233-1-ayushdevel1325@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-04	greybus: BeaglePlay driver needs CRC_CCITT	Randy Dunlap
	The gb-beagleplay driver uses crc_ccitt(), so it should select CRC_CCITT to make sure that the function is available. Fixes these build errors: s390-linux-ld: drivers/greybus/gb-beagleplay.o: in function `hdlc_append_tx_u8': gb-beagleplay.c:(.text+0x2c0): undefined reference to `crc_ccitt' s390-linux-ld: drivers/greybus/gb-beagleplay.o: in function `hdlc_rx_frame': gb-beagleplay.c:(.text+0x6a0): undefined reference to `crc_ccitt' Fixes: ec558bbfea67 ("greybus: Add BeaglePlay Linux Driver") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Ayush Singh <ayushdevel1325@gmail.com> Cc: Johan Hovold <johan@kernel.org> Cc: Alex Elder <elder@kernel.org> Cc: greybus-dev@lists.linaro.org Link: https://lore.kernel.org/r/20231031040909.21201-1-rdunlap@infradead.org Reviewed-by: Ayush Singh <ayushdevel1325@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-04	Merge 6.7-rc4 into char-misc-linus	Greg Kroah-Hartman
	We need 6.7-rc4 in here as we need to revert one of the debugfs changes that came in that release through the wireless tree. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-03	hwmon: (nzxt-kraken2) Fix error handling path in kraken2_probe()	Christophe JAILLET
	There is no point in calling hid_hw_stop() if hid_hw_start() has failed. There is no point in calling hid_hw_close() if hid_hw_open() has failed. Update the error handling path accordingly. Fixes: 82e3430dfa8c ("hwmon: add driver for NZXT Kraken X42/X52/X62/X72") Reported-by: Aleksa Savic <savicaleksa83@gmail.com> Closes: https://lore.kernel.org/all/121470f0-6c1f-418a-844c-7ec2e8a54b8e@gmail.com/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jonas Malaco <jonas@protocubo.io> Link: https://lore.kernel.org/r/a768e69851a07a1f4e29f270f4e2559063f07343.1701617030.git.christophe.jaillet@wanadoo.fr Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-12-03	Merge tag 'firewire-fixes-6.7-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A single patch to fix long-standing issue of memory leak at failure of device registration for fw_unit. We rarely encounter the issue, but it should be applied to stable releases, since it fixes inappropriate API usage" * tag 'firewire-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: fix possible memory leak in create_units()
2023-12-03	Merge tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfio	Linus Torvalds
	Pull vfio fixes from Alex Williamson: - Fix the lifecycle of a mutex in the pds variant driver such that a reset prior to opening the device won't find it uninitialized. Implement the release path to symmetrically destroy the mutex. Also switch a different lock from spinlock to mutex as the code path has the potential to sleep and doesn't need the spinlock context otherwise (Brett Creeley) - Fix an issue detected via randconfig where KVM tries to symbol_get an undeclared function. The symbol is temporarily declared unconditionally here, which resolves the problem and avoids churn relative to a series pending for the next merge window which resolves some of this symbol ugliness, but also fixes Kconfig dependencies (Sean Christopherson) * tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfio: vfio: Drop vfio_file_iommu_group() stub to fudge around a KVM wart vfio/pds: Fix possible sleep while in atomic context vfio/pds: Fix mutex lock->magic != lock warning
2023-12-03	Merge tag 'for-linus-6.7a-rc4-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for the Xen event driver setting the correct return value when experiencing an allocation failure - A fix for allocating space for a struct in the percpu area to not cross page boundaries (this one is for x86, a similar one for Arm was already in the pull request for rc3) * tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events: fix error code in xen_bind_pirq_msi_to_irq() x86/xen: fix percpu vcpu_info allocation
2023-12-02	r8169: fix rtl8125b PAUSE frames blasting when suspended	ChunHao Lin
	When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state. When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network. This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel): 1) ping -f $CHROMEBOX (from workstation on same local network) 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX 3) ping $ROUTER (wait until ping fails from workstation) Takes about ~20-30 seconds after step 2 for the local network to stop working. Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state. Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler <grundler@chromium.org> Tested-by: Grant Grundler <grundler@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin <hau@realtek.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/20231129155350.5843-1-hau@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-01	hv_netvsc: rndis_filter needs to select NLS	Randy Dunlap
	rndis_filter uses utf8s_to_utf16s() which is provided by setting NLS, so select NLS to fix the build error: ERROR: modpost: "utf8s_to_utf16s" [drivers/net/hyperv/hv_netvsc.ko] undefined! Fixes: 1ce09e899d28 ("hyperv: Add support for setting MAC from within guests") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20231130055853.19069-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-01	Merge tag 'md-fixes-20231201-1' of ↵	Jens Axboe
	https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.7 Pull MD fix from Song: "This change fixes issue with raid456 reshape." * tag 'md-fixes-20231201-1' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid6: use valid sector values to determine if an I/O should wait on the reshape
2023-12-01	net/tg3: fix race condition in tg3_reset_task()	Thinh Tran
	When an EEH error is encountered by a PCI adapter, the EEH driver modifies the PCI channel's state as shown below: enum { /* I/O channel is in normal state / pci_channel_io_normal = (__force pci_channel_state_t) 1, / I/O to channel is blocked / pci_channel_io_frozen = (__force pci_channel_state_t) 2, / PCI card is dead */ pci_channel_io_perm_failure = (__force pci_channel_state_t) 3, }; If the same EEH error then causes the tg3 driver's transmit timeout logic to execute, the tg3_tx_timeout() function schedules a reset task via tg3_reset_task_schedule(), which may cause a race condition between the tg3 and EEH driver as both attempt to recover the HW via a reset action. EEH driver gets error event --> eeh_set_channel_state() and set device to one of error state above scheduler: tg3_reset_task() get returned error from tg3_init_hw() --> dev_close() shuts down the interface tg3_io_slot_reset() and tg3_io_resume() fail to reset/resume the device To resolve this issue, we avoid the race condition by checking the PCI channel state in the tg3_reset_task() function and skip the tg3 driver initiated reset when the PCI channel is not in the normal state. (The driver has no access to tg3 device registers at this point and cannot even complete the reset task successfully without external assistance.) We'll leave the reset procedure to be managed by the EEH driver which calls the tg3_io_error_detected(), tg3_io_slot_reset() and tg3_io_resume() functions as appropriate. Adding the same checking in tg3_dump_state() to avoid dumping all device registers when the PCI channel is not in the normal state. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> Tested-by: Venkata Sai Duggi <venkata.sai.duggi@ibm.com> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201001911.656-1-thinhtr@linux.vnet.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-02	Merge tag 'pm-6.7-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix issues in two cpufreq drivers, in the AMD P-state driver and in the power-capping DTPM framework. Specifics: - Fix the AMD P-state driver's EPP sysfs interface in the cases when the performance governor is in use (Ayush Jain) - Make the ->fast_switch() callback in the AMD P-state driver return the target frequency as expected (Gautham R. Shenoy) - Allow user space to control the range of frequencies to use via scaling_min_freq and scaling_max_freq when AMD P-state driver is in use (Wyes Karny) - Prevent power domains needed for wakeup signaling from being turned off during system suspend on Qualcomm systems and prevent performance states votes from runtime-suspended devices from being lost across a system suspend-resume cycle in qcom-cpufreq-nvmem (Stephan Gerhold) - Fix disabling the 792 Mhz OPP in the imx6q cpufreq driver for the i.MX6ULL types that can run at that frequency (Christoph Niedermaier) - Eliminate unnecessary and harmful conversions to uW from the DTPM (dynamic thermal and power management) framework (Lukasz Luba)" * tag 'pm-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq/amd-pstate: Only print supported EPP values for performance governor cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update powercap: DTPM: Fix unneeded conversions to micro-Watts cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch() pmdomain: qcom: rpmpd: Set GENPD_FLAG_ACTIVE_WAKEUP cpufreq: qcom-nvmem: Preserve PM domain votes in system suspend cpufreq: qcom-nvmem: Enable virtual power domain devices cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
2023-12-02	Merge tag 'acpi-6.7-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "This fixes a recently introduced build issue on ARM32 and a NULL pointer dereference in the ACPI backlight driver due to a design issue exposed by a recent change in the ACPI bus type code. Specifics: - Fix a recently introduced build issue on ARM32 platforms caused by an inadvertent header file breakage (Dave Jiang) - Eliminate questionable usage of acpi_driver_data() in the ACPI backlight cooling device code that leads to NULL pointer dereferences after recent ACPI core changes (Hans de Goede)" * tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: video: Use acpi_video_device for cooling-dev driver data ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-02	Merge tag 'iommu-fixes-v6.7-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix race conditions in device probe path - Handle ERR_PTR() returns in __iommu_domain_alloc() path - Update MAINTAINERS entry for Qualcom IOMMUs - Printk argument fix in device tree specific code - Several Intel VT-d fixes from Lu Baolu: - Do not support enforcing cache coherency for non-empty domains - Avoid devTLB invalidation if iommu is off - Disable PCI ATS in legacy passthrough mode - Support non-PCI devices when clearing context - Fix incorrect cache invalidation for mm notification - Add MTL to quirk list to skip TE disabling - Set variable intel_dirty_ops to static * tag 'iommu-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix printk arg in of_iommu_get_resv_regions() iommu/vt-d: Set variable intel_dirty_ops to static iommu/vt-d: Fix incorrect cache invalidation for mm notification iommu/vt-d: Add MTL to quirk list to skip TE disabling iommu/vt-d: Make context clearing consistent with context mapping iommu/vt-d: Disable PCI ATS in legacy passthrough mode iommu/vt-d: Omit devTLB invalidation requests when TES=0 iommu/vt-d: Support enforce_cache_coherency only for empty domains iommu: Avoid more races around device probe MAINTAINERS: list all Qualcomm IOMMU drivers in the QUALCOMM IOMMU entry iommu: Flow ERR_PTR out from __iommu_domain_alloc()
2023-12-02	Merge tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Weekly fixes, mostly amdgpu fixes with a scattering of nouveau, i915, and a couple of reverts. Hopefully it will quieten down in coming weeks. drm: - Revert unexport of prime helpers for fd/handle conversion dma_resv: - Do not double add fences in dma_resv_add_fence. gpuvm: - Fix GPUVM license identifier. i915: - Mark internal GSC engine with reserved uabi class - Take VGA converters into account in eDP probe - Fix intel_pre_plane_updates() call to ensure workarounds get applied panel: - Revert panel fixes as they require exporting device_is_dependent. nouveau: - fix oversized allocations in new vm path - fix zero-length array - remove a stray lock nt36523: - Fix error check for nt36523. amdgpu: - DMUB fix - DCN 3.5 fixes - XGMI fix - DCN 3.2 fixes - Vangogh suspend fix - NBIO 7.9 fix - GFX11 golden register fix - Backlight fix - NBIO 7.11 fix - IB test overflow fix - DCN 3.1.4 fixes - fix a runtime pm ref count - Retimer fix - ABM fix - DCN 3.1.5 fix - Fix AGP addressing - Fix possible memory leak in SMU error path - Make sure PME is enabled in D3 - Fix possible NULL pointer dereference in debugfs - EEPROM fix - GC 9.4.3 fix amdkfd: - IP version check fix - Fix memory leak in pqm_uninit()" * tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drm: (53 commits) Revert "drm/prime: Unexport helpers for fd/handle conversion" drm/amdgpu: Use another offset for GC 9.4.3 remap drm/amd/display: Fix some HostVM parameters in DML drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit drm/amdgpu: Update EEPROM I2C address for smu v13_0_0 drm/amd/display: Allow DTBCLK disable for DCN35 drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer drm/amd: Enable PCIe PME from D3 drm/amd/pm: fix a memleak in aldebaran_tables_init drm/amdgpu: fix AGP addressing when GART is not at 0 drm/amd/display: update dcn315 lpddr pstate latency drm/amd/display: fix ABM disablement drm/amd/display: Fix black screen on video playback with embedded panel drm/amd/display: Fix conversions between bytes and KB drm/amdkfd: Use common function for IP version check drm/amd/display: Remove config update drm/amd/display: Update DCN35 clock table policy drm/amd/display: force toggle rate wa for first link training for a retimer drm/amdgpu: correct the amdgpu runtime dereference usage count drm/amd/display: Update min Z8 residency time to 2100 for DCN314 ...
2023-12-01	md/raid6: use valid sector values to determine if an I/O should wait on the ↵	David Jeffery
	reshape During a reshape or a RAID6 array such as expanding by adding an additional disk, I/Os to the region of the array which have not yet been reshaped can stall indefinitely. This is from errors in the stripe_ahead_of_reshape function causing md to think the I/O is to a region in the actively undergoing the reshape. stripe_ahead_of_reshape fails to account for the q disk having a sector value of 0. By not excluding the q disk from the for loop, raid6 will always generate a min_sector value of 0, causing a return value which stalls. The function's max_sector calculation also uses min() when it should use max(), causing the max_sector value to always be 0. During a backwards rebuild this can cause the opposite problem where it allows I/O to advance when it should wait. Fixing these errors will allow safe I/O to advance in a timely manner and delay only I/O which is unsafe due to stripes in the middle of undergoing the reshape. Fixes: 486f60558607 ("md/raid5: Check all disks in a stripe_head for reshape progress") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231128181233.6187-1-djeffery@redhat.com
2023-12-01	spi: atmel: Drop unused defines	Miquel Raynal
	These defines are leftovers from previous versions of the blamed commit, they are simply unused so drop them. Fixes: e0205d6203c2 ("spi: atmel: Prevent false timeouts on long transfers") Reported-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20231127095842.389631-2-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-01	spi: atmel: Do not cancel a transfer upon any signal	Miquel Raynal
	The intended move from wait_for_completion_() to wait_for_completion_interruptible_() was to allow (very) long spi memory transfers to be stopped upon user request instead of freezing the machine forever as the timeout value could now be significantly bigger. However, depending on the user logic, applications can receive many signals for their own "internal" purpose and have nothing to do with the requested kernel operations, hence interrupting spi transfers upon any signal is probably not a wise choice. Instead, let's switch to wait_for_completion_killable_*() to only catch the "important" signals. This was likely the intended behavior anyway. Fixes: e0205d6203c2 ("spi: atmel: Prevent false timeouts on long transfers") Cc: stable@vger.kernel.org Reported-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20231127095842.389631-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-02	Merge tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux	Linus Torvalds
	Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Invalid namespace identification error handling (Marizio Ewan, Keith) - Fabrics keep-alive tuning (Mark) - Fix for a bad error check regression in bcache (Markus) - Fix for a performance regression with O_DIRECT (Ming) - Fix for a flush related deadlock (Ming) - Make the read-only warn on per-partition (Yu) * tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux: nvme-core: check for too small lba shift blk-mq: don't count completed flush data request as inflight in case of quiesce block: Document the role of the two attribute groups block: warn once for each partition in bio_check_ro() block: move .bd_inode into 1st cacheline of block_device nvme: check for valid nvme_identify_ns() before using it nvme-core: fix a memory leak in nvme_ns_info_from_identify() nvme: fine-tune sending of first keep-alive bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
2023-12-02	Merge tag 'dm-6.7/dm-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM verity target's FEC support to always initialize IO before it frees it. Also fix alignment of struct dm_verity_fec_io within the per-bio-data - Fix DM verity target to not FEC failed readahead IO - Update DM flakey target to use MAX_ORDER rather than MAX_ORDER - 1 * tag 'dm-6.7/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-flakey: start allocating with MAX_ORDER dm-verity: align struct dm_verity_fec_io properly dm verity: don't perform FEC for failed readahead IO dm verity: initialize fec io before freeing it
2023-12-02	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small fixes, one in drivers. The core changes are to the internal representation of flags in scsi_devices which removes space wasting bools in favour of single bit flags and to add a flag to force a runtime resume which is used by ATA devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Fix system start for ATA devices scsi: Change SCSI device boolean fields to single bit flags scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode
2023-12-02	Merge tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs	Linus Torvalds
	Pull more bcachefs bugfixes from Kent Overstreet: - bcache & bcachefs were broken with CFI enabled; patch for closures to fix type punning - mark erasure coding as extra-experimental; there are incompatible disk space accounting changes coming for erasure coding, and I'm still seeing checksum errors in some tests - several fixes for durability-related issues (durability is a device specific setting where we can tell bcachefs that data on a given device should be counted as replicated x times) - a fix for a rare livelock when a btree node merge then updates a parent node that is almost full - fix a race in the device removal path, where dropping a pointer in a btree node to a device would be clobbered by an in flight btree write updating the btree node key on completion - fix one SRCU lock hold time warning in the btree gc code - ther's still a bunch more of these to fix - fix a rare race where we'd start copygc before initializing the "are we rw" percpu refcount; copygc would think we were already ro and die immediately * tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits) bcachefs: Extra kthread_should_stop() calls for copygc bcachefs: Convert gc_alloc_start() to for_each_btree_key2() bcachefs: Fix race between btree writes and metadata drop bcachefs: move journal seq assertion bcachefs: -EROFS doesn't count as move_extent_start_fail bcachefs: trace_move_extent_start_fail() now includes errcode bcachefs: Fix split_race livelock bcachefs: Fix bucket data type for stripe buckets bcachefs: Add missing validation for jset_entry_data_usage bcachefs: Fix zstd compress workspace size bcachefs: bpos is misaligned on big endian bcachefs: Fix ec + durability calculation bcachefs: Data update path won't accidentaly grow replicas bcachefs: deallocate_extra_replicas() bcachefs: Proper refcounting for journal_keys bcachefs: preserve device path as device name bcachefs: Fix an endianness conversion bcachefs: Start gc, copygc, rebalance threads after initing writes ref bcachefs: Don't stop copygc thread on device resize bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops ...
2023-12-01	Merge branch 'powercap'	Rafael J. Wysocki
	Merge a power capping fix for 6.7-rc4 which eliminates unnecessary and harmful conversions to uW from the DTPM (dynamic thermal and power management) framework (Lukasz Luba). * powercap: powercap: DTPM: Fix unneeded conversions to micro-Watts
2023-12-01	nvme-core: check for too small lba shift	Keith Busch
	The block layer doesn't support logical block sizes smaller than 512 bytes. The nvme spec doesn't support that small either, but the driver isn't checking to make sure the device responded with usable data. Failing to catch this will result in a kernel bug, either from a division by zero when stacking, or a zero length bio. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-01	pds_vdpa: set features order	Shannon Nelson
	Fix up the order that the device and negotiated features are checked to get a more reliable difference when things get changed. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-4-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2023-12-01	pds_vdpa: clear config callback when status goes to 0	Shannon Nelson
	If the client driver is setting status to 0, something is getting shutdown and possibly removed. Make sure we clear the config_cb so that it doesn't end up crashing when trying to call a bogus callback. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-3-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2023-12-01	pds_vdpa: fix up format-truncation complaint	Shannon Nelson
	Our friendly kernel test robot has recently been pointing out some format-truncation issues. Here's a fix for one of them. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311040109.RfgJoE7L-lkp@intel.com/ Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-2-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2023-12-01	vdpa/mlx5: preserve CVQ vringh index	Steve Sistare
	mlx5_vdpa does not preserve userland's view of vring base for the control queue in the following sequence: ioctl VHOST_SET_VRING_BASE ioctl VHOST_VDPA_SET_STATUS VIRTIO_CONFIG_S_DRIVER_OK mlx5_vdpa_set_status() setup_cvq_vring() vringh_init_iotlb() vringh_init_kern() vrh->last_avail_idx = 0; ioctl VHOST_GET_VRING_BASE To fix, restore the value of cvq->vring.last_avail_idx after calling vringh_init_iotlb. Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1699014387-194368-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-12-01	octeontx2-af: Check return value of nix_get_nixlf before using nixlf	Subbaraya Sundeep
	If a NIXLF is not attached to a PF/VF device then nix_get_nixlf function fails and returns proper error code. But npc_get_default_entry_action does not check it and uses garbage value in subsequent calls. Fix this by cheking the return value of nix_get_nixlf. Fixes: 967db3529eca ("octeontx2-af: add support for multicast/promisc packet replication feature") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-01	octeontx2-pf: Add missing mutex lock in otx2_get_pauseparam	Subbaraya Sundeep
	All the mailbox messages sent to AF needs to be guarded by mutex lock. Add the missing lock in otx2_get_pauseparam function. Fixes: 75f36270990c ("octeontx2-pf: Support to enable/disable pause frames via ethtool") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-01	gpiolib: sysfs: Fix error handling on failed export	Boerge Struempfel
	If gpio_set_transitory() fails, we should free the GPIO again. Most notably, the flag FLAG_REQUESTED has previously been set in gpiod_request_commit(), and should be reset on failure. To my knowledge, this does not affect any current users, since the gpio_set_transitory() mainly returns 0 and -ENOTSUPP, which is converted to 0. However the gpio_set_transitory() function calles the .set_config() function of the corresponding GPIO chip and there are some GPIO drivers in which some (unlikely) branches return other values like -EPROBE_DEFER, and -EINVAL. In these cases, the above mentioned FLAG_REQUESTED would not be reset, which results in the pin being blocked until the next reboot. Fixes: e10f72bf4b3e ("gpio: gpiolib: Generalise state persistence beyond sleep") Signed-off-by: Boerge Struempfel <boerge.struempfel@gmail.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2023-12-01	iommu: Fix printk arg in of_iommu_get_resv_regions()	Daniel Mentz
	The variable phys is defined as (struct resource ) which aligns with the printk format specifier %pr. Taking the address of it results in a value of type (struct resource *) which is incompatible with the format specifier %pr. Therefore, remove the address of operator (&). Fixes: a5bf3cfce8cb ("iommu: Implement of_iommu_get_resv_regions()") Signed-off-by: Daniel Mentz <danielmentz@google.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20231108062226.928985-1-danielmentz@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-01	drm/i915: Check pipe active state in {planes,vrr}_{enabling,disabling}()	Ville Syrjälä
	{planes,vrr}_{enabling,disabling}() are supposed to indicate whether the specific hardware feature is supposed to be enabling or disabling. That can only makes sense if the pipe is active overall. So check for that before we go poking at the hardware. I think we're semi-safe currently on due to: - intel_pre_plane_update() doesn't get called when the pipe was not-active prior to the commit, but this is actually a bug. This saves vrr_disabling(), and vrr_enabling() is called from deeper down where we have already checked hw.active. - active_planes mirrors the crtc's hw.active Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231121054324.9988-2-ville.syrjala@linux.intel.com (cherry picked from commit bc53c4d56eb24dbe56cd2c66ef4e9fc9393b1533) Signed-off-by: Jani Nikula <jani.nikula@intel.com>